I have a series of transactions like the following:

  • [0, 2, 2, 3, 1, 0, 0, 0, 1]
  • [1, 0, 0]
  • [3, 3, 1, 1]

I would like to classify each transaction into one of two categories: class A or class B. The issue is that a transaction can have any length from 2 to 150 in my case. What is the best approach for classifying variable-length time series like these?

Slartibartfast

1 Answer

You can use the dynamic time warping (DTW) distance, which measures the distance between time series of different lengths. This question has some nice answers about how to do that.
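As a rough illustration, here is a minimal sketch of the textbook DTW recurrence in plain Python, using the absolute difference as the local cost. The `knn_predict` helper is a hypothetical addition showing one common way to turn a distance into a classifier (nearest neighbours); the class labels attached to the example series are made up for illustration and are not taken from the question.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost.

    Runs in O(len(a) * len(b)) time and accepts series of different lengths.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_predict(train, query, k=1):
    """Hypothetical helper: majority label among the k DTW-nearest series.

    `train` is a list of (series, label) pairs.
    """
    neighbours = sorted(train, key=lambda pair: dtw_distance(pair[0], query))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)


# Labels below are invented purely for the example.
train = [([0, 2, 2, 3, 1, 0, 0, 0, 1], "A"), ([3, 3, 1, 1], "B")]
prediction = knn_predict(train, [3, 1], k=1)
```

Because every query is compared against every training series, this brute-force version is only practical for small data sets; with series of at most 150 points each comparison is cheap, but the number of comparisons grows with the training set.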

However, it would also be instructive to construct some summary statistics for each series and do the clustering on them.
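A minimal sketch of that idea, assuming a handful of generic statistics: each series, whatever its length, is mapped to a fixed-length feature vector that any standard clustering or classification method can consume. The particular statistics chosen here (length, mean, standard deviation, range, share of zeros, mean absolute step) are illustrative assumptions, not a recommendation from this answer.

```python
from statistics import mean, pstdev


def summarize(series):
    """Map a variable-length series to a fixed-length list of 7 features."""
    # Absolute changes between consecutive observations.
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    return [
        len(series),                      # number of observations
        mean(series),                     # average level
        pstdev(series),                   # population standard deviation
        min(series),
        max(series),
        series.count(0) / len(series),    # share of zero entries
        mean(diffs) if diffs else 0.0,    # mean absolute step size
    ]


transactions = [
    [0, 2, 2, 3, 1, 0, 0, 0, 1],
    [1, 0, 0],
    [3, 3, 1, 1],
]
# Every row now has the same number of columns, regardless of series length.
features = [summarize(t) for t in transactions]
```

Which statistics are informative depends on what separates the two classes, so the feature list is best chosen by looking at the data.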

mpiktas
  • Thank you. I applied a compression algorithm to the time series and then used DTW with KNN, and it's working like a charm. I had tested DTW before posting here but wasn't getting good results; your answer gave me the idea to adapt the time series and retry with DTW. – Slartibartfast Apr 04 '15 at 17:29