
I am trying to understand how to segment a multivariate time series using k-means. I understand that the basic idea is to use centroids of segments rather than centroids of individual data points, and to minimize the residual error.

But what does a 'segment' mean in this context? How can I use a vector of data points as a segment or number of segments?

I use R, so example code or a link to an example of time series segmentation using k-means would be very helpful.

Ferdi
umair durrani
  • No, k-means clustering does not work on dependent/time series data. Try searching for the keyword "time series clustering" and you will find some good techniques. DTW + hierarchical clustering is a popular method for time series clustering/segmentation. – forecaster Apr 06 '17 at 16:59
  • Can you edit your question to provide a laser-like focus per the rules of the community? You seem to be asking, "What is a segment as is referred to in segmented regression?". – AdamO Aug 29 '17 at 15:59

1 Answer


I don't think k-means is a good approach here: it is not designed for time series data, and even less so for high-dimensional series. You should look into something that takes the time component into account, for example hidden Markov models, autoregressive models, or a combination of both.
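As for what a "segment" could mean in the k-means formulation described in the question: one common reading (an assumption here, since the question does not name a specific method) is a contiguous window of the series, flattened into a single vector, so that k-means clusters windows instead of individual time points. Below is a minimal base-R sketch of that idea, meant only to make the term concrete. The simulated data, the window length `w`, and the choice of two clusters are all hypothetical.

```r
# Hypothetical data: a 2-variable series whose mean shifts at t = 101.
set.seed(1)
n <- 200
x <- cbind(
  a = c(rnorm(100, mean = 0), rnorm(100, mean = 5)),
  b = c(rnorm(100, mean = 0), rnorm(100, mean = -5))
)

# Cut the series into non-overlapping windows of length w (an assumed value).
# Each window (w consecutive time points, all variables) becomes one row,
# i.e. one "segment" in the sense used above.
w <- 10
n_win <- n %/% w
segments <- t(sapply(seq_len(n_win), function(i) {
  rows <- ((i - 1) * w + 1):(i * w)
  as.vector(x[rows, ])  # flatten the w x 2 block into a vector of length 2 * w
}))

# Ordinary k-means on the window vectors; each centroid is a "typical window".
fit <- kmeans(segments, centers = 2, nstart = 10)

# One cluster label per window, expanded back to one label per time point.
labels <- rep(fit$cluster, each = w)
```

Note that this treats the windows as independent observations: k-means never sees their order in time, and a change that happens inside a window is not located any more precisely than the window itself. That is the limitation the models mentioned above are meant to address.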

DimP
  • Isn't this the same thing forecaster just mentioned in a comment? – Michael R. Chernick Apr 06 '17 at 17:22
  • Actually, if you hover over the time answered, you will see that we both posted at *almost* the same time (16:56 mine, 16:59 @forecaster's answer). Weird? Totally. – DimP Apr 06 '17 at 18:56