I have CSV files that contain cache performance data for a source under different workloads over a particular time period. Data is recorded for each time interval and includes columns such as ReadHits, WriteHits, Cacheusage, ReadMiss, etc.
Example of the CSV file contents:
Interval,ReadHits,WriteHits,Cacheusage,ReadMiss
1 , 150 , 0 , 15474 , 12
2 , 0 , 0 , 700375, 245
3 , 15426 , 1546 , 45121,195
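For reference, here is a minimal sketch of how a file like this could be read into per-column series, since the sample rows have irregular spacing around the commas. It uses only the Python standard library. The `SAMPLE` string and the helper name `load_intervals` are illustrative assumptions, not part of my actual setup.

```python
import csv
import io

# Sample rows copied from the example above (irregular spacing kept on purpose).
SAMPLE = """Interval,ReadHits,WriteHits,Cacheusage,ReadMiss
1 , 150 , 0 , 15474 , 12
2 , 0 , 0 , 700375, 245
3 , 15426 , 1546 , 45121,195
"""


def load_intervals(text):
    """Parse CSV text into a list of dicts with integer values.

    Header names and cell values are stripped, so padding spaces
    around the commas do not matter.
    """
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    rows = []
    for record in reader:
        if not record:  # skip blank lines
            continue
        rows.append({name: int(value.strip()) for name, value in zip(header, record)})
    return rows


rows = load_intervals(SAMPLE)
# One series per column, ordered by interval.
read_hits = [row["ReadHits"] for row in rows]
print(read_hits)  # [150, 0, 15426]
```

For a real file, the same function could be fed the result of `open(path).read()`, where `path` is whatever the CSV is called.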
Note: every interval covers the same length of time, e.g. 1 interval = 40 seconds.
The values in each column range from 0 to 60k+ and vary from one interval to the next.
Example:
Interval  7  8    9     10  11
ReadHits  0  240  1680  0   2091
So the data fluctuates heavily, ranging between 0 and 60k+.
Suppose I have data for 60 intervals. How can I predict the values for intervals 61 to 70?
I have tried an ARIMA model, random forest, k-means, and various other machine learning algorithms, but have never been able to predict values close to the actual ones.
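To make "close to actual values" measurable, one option is to hold out the last few recorded intervals, forecast them from the earlier ones, and compare the error against trivial baselines. The sketch below is only an assumption about how such a check could look: the function names and the choice of mean absolute error are mine, and the demo series is the five ReadHits values from the example above. With 60 recorded intervals, the same check would train on the first 50 and score on the last 10.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_last_value(history, horizon):
    """Baseline: repeat the last observed value for every future interval."""
    return [history[-1]] * horizon


def naive_mean(history, horizon):
    """Baseline: repeat the historical mean for every future interval."""
    average = sum(history) / len(history)
    return [average] * horizon


def holdout_scores(series, horizon=10):
    """Hold out the last `horizon` points and score each baseline on them."""
    train, test = series[:-horizon], series[-horizon:]
    return {
        "last_value": mean_absolute_error(test, naive_last_value(train, horizon)),
        "mean": mean_absolute_error(test, naive_mean(train, horizon)),
    }


# ReadHits for intervals 7..11 from the example; only one point is held out
# here because the demo series is so short.
series = [0, 240, 1680, 0, 2091]
scores = holdout_scores(series, horizon=1)
print(scores)
```

Any candidate model (ARIMA, random forest, and so on) can be scored on the same held-out intervals. If it cannot beat these baselines, the series may simply be too noisy for that model at this horizon.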
Which algorithm would work better on this kind of data for predicting the next intervals?
Apart from prediction, what other useful and innovative things can I do with machine learning on this kind of data that would be valuable to the user?