time-series classification, event detection in large data sets

Question

I have been looking for a way to classify events in a time series. My data is streamed to the cloud, so computation time is not really an issue. A sample arrives every second for up to 4 to 5 hours, and during that time there are events that cause spikes in the data. I did some basic analysis using spectral power to "see" these events, which mostly works, but I need to distinguish between types of events. I attempted kNN, computing three different types of spectral entropy whenever an event was seen, but I could not see any clustering.

I found the package `tsfresh` and questions like this one: "Time-series classification - very poor results". However, when I mimic their approach, the output of `features_filtered = select_features(extracted_features, y)` is empty, even though `extracted_features = extract_features(df, column_id="id", column_sort="t")` yields 618 features.
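One rough sanity check for an empty `select_features` result is whether the number of labeled datasets is large enough for any feature to pass a multiple-testing correction at all. The sketch below is not tsfresh code. It assumes a rank-based test for a binary target and a Benjamini-Yekutieli correction at a false discovery rate of 0.05, which I believe matches the library defaults but have not verified here. The function names are hypothetical, and the exact p-value bound ignores ties and any normal approximation the real test may use, so treat the result as an order-of-magnitude guide only.

```python
from math import comb


def min_two_sided_p(n_event, n_no_event):
    """Smallest exact two-sided rank-test p-value for two groups.

    This is the best case: every event sample ranks above (or below)
    every non-event sample, with no ties.
    """
    return min(1.0, 2.0 / comb(n_event + n_no_event, n_event))


def by_threshold(n_features, rank=1, fdr_level=0.05):
    """Benjamini-Yekutieli cutoff for the rank-th smallest p-value.

    Assumption: n_features hypotheses are tested together and the
    correction for dependent tests (harmonic-sum factor) is applied.
    """
    harmonic = sum(1.0 / i for i in range(1, n_features + 1))
    return rank * fdr_level / (n_features * harmonic)


def single_feature_can_pass(n_event, n_no_event, n_features, fdr_level=0.05):
    """True if one perfectly separating feature could pass on its own."""
    best_p = min_two_sided_p(n_event, n_no_event)
    return best_p <= by_threshold(n_features, rank=1, fdr_level=fdr_level)


if __name__ == "__main__":
    # 618 is the feature count reported above; the sample counts are made up.
    for n in (5, 20):
        print(n, single_feature_can_pass(n, n, n_features=618))
```

If the check fails for the number of datasets at hand, an empty selection is expected regardless of how informative the features are, and collecting more labeled examples matters more than tuning the extraction.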

My first question: do I need to shorten the data to a window around where each event occurs? Right now I am simply labeling each dataset as binary, event or non-event. Second, it seems strange that `select_features` would return 0 features. Are there any sanity checks that could help me out?
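To make the windowing idea concrete, here is a minimal sketch of cutting a fixed-length window around each detected event and writing it out in long format, with one `id` per window and `t` as the sort column, matching the `column_id="id"` and `column_sort="t"` arguments used above. The function name, the `value` column, the `half_width` parameter and the list of event indices are all hypothetical; the event positions are assumed to come from whatever detector is already in place, such as the spectral power check.

```python
def cut_event_windows(values, event_indices, half_width=30):
    """Return long-format rows (id, t, value), one id per event window.

    values        : list of samples, one per second
    event_indices : sample positions where an event was detected
    half_width    : seconds kept on each side of the event (assumed value)
    """
    rows = []
    for window_id, center in enumerate(event_indices):
        # Clip the window at the edges of the recording.
        start = max(0, center - half_width)
        stop = min(len(values), center + half_width + 1)
        for t in range(start, stop):
            # t is stored relative to the event so windows are comparable.
            rows.append({"id": window_id, "t": t - center, "value": values[t]})
    return rows


if __name__ == "__main__":
    signal = [0.0] * 200
    signal[50] = 5.0   # made-up spike
    signal[150] = 3.0  # made-up spike
    rows = cut_event_windows(signal, [50, 150], half_width=10)
    print(len(rows), rows[0])
```

With this layout each window gets its own label, for example the event type, so the target vector has one entry per `id` rather than one entry per multi-hour recording.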

Did you check the CRAN package TSClust? I know you aren't using this directly, but it might be worth having a look for some inspiration. Additionally, maybe worth checking out [this paper](http://www.cs.ucr.edu/~eamonn/meaningless.pdf). — LE Rogerson, Nov 08 '16 at 08:37
I forgot to mention that this can be supervised learning. I am looking at physical phenomena and I have control over test data. `tsfresh` is supposed to find relevant features based on a training or known set, and then you can pass that feature vector on to a classifier algorithm. @LERogerson I *think* that paper looks at unsupervised clustering? — superhero, Nov 08 '16 at 15:49
It's hard to judge what causes `tsfresh` to kick out all features without seeing the data. How many datasets with event or no-event do you have? — MaxBenChrist, Dec 27 '16 at 22:59

