Differences between clustering and segmentation

Question

I have read about piecewise aggregate approximation (PAA) for mining time series data, and about sliding-window, top-down, and bottom-up approaches to time series segmentation, but these apply to one-dimensional time series.

What techniques exist for multidimensional time series segmentation? Can a Gaussian mixture model or k-means clustering be used for segmentation? If so, what is the difference between segmentation and clustering?

What is the difference between segmenting and clustering?
How do I segment motion time series data so that the temporal information is retained?
What are the algorithms for doing so, that is, what are the techniques for multidimensional time series segmentation?

Please provide links or ideas. Thank you.

With regard to the first question: none, in my opinion. It is just semantics; both try to group data to get meaningful information. Clustering is a statistical term, segmentation is a marketing/business term. – forecaster, Jan 01 '14 at 02:05

score 4 · Answer 1 · edited Nov 19 '20 at 18:36

Segmentation vs. Clustering

In control system engineering, the ideas of controllability and observability are, through the Cayley-Hamilton theorem, two faces of the same phenomenon. One implies the other.

Segmentation and clustering are two faces of the same coin, too. The line of equal probability of cluster membership is the segmentation boundary. This is a deep topic and discussions about convergence, nature of space, and appropriate basis functions are beyond the scope of this answer.
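To make the "line of equal probability" concrete, here is a minimal sketch. It assumes two one-dimensional Gaussian clusters with equal weights and made-up parameters; the function names `gaussian_pdf` and `equal_probability_boundary` are hypothetical and not taken from any library. It locates, by bisection, the point between the two means where both clusters are equally likely. Points on either side of it get different cluster labels, so it acts as the segmentation boundary.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def equal_probability_boundary(mean_a, std_a, mean_b, std_b, tol=1e-9):
    """Bisect between the two means for the point where both densities match.

    Assumes equal mixture weights, mean_a < mean_b, and that cluster A is the
    more likely one at mean_a and cluster B at mean_b. With very different
    spreads that last assumption can fail and this sketch does not apply.
    """
    lo, hi = mean_a, mean_b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Left of the boundary cluster A is more likely, right of it cluster B.
        if gaussian_pdf(mid, mean_a, std_a) > gaussian_pdf(mid, mean_b, std_b):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Made-up cluster parameters, for illustration only.
boundary = equal_probability_boundary(0.0, 1.0, 4.0, 1.0)
```

With equal spreads, as in this made-up case, the boundary falls halfway between the two means. In more than one dimension the same idea gives a curve or surface instead of a single point.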

Retaining Temporal Information

If I were doing this, I would augment each temporal measurement with its cluster membership, using both the cluster index and the Mahalanobis distance to that cluster. So if you took one measurement at each instant and then clustered the data, your augmented time series would have three values at each instant: the cluster index, the Mahalanobis distance (useful), and the measurement itself.
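A minimal sketch of that augmentation, under stated assumptions: one scalar measurement per instant, a plain k-means (Lloyd's algorithm) written by hand for the clustering step, and made-up data. The names `kmeans_1d` and `augment_with_clusters` are hypothetical. In one dimension the Mahalanobis distance reduces to the absolute deviation from the cluster mean divided by the cluster's standard deviation; with several measurements per instant you would use each cluster's covariance matrix instead.

```python
import statistics


def kmeans_1d(values, k, max_iter=100):
    """Plain Lloyd's algorithm on scalars, with evenly spaced starting centers."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(statistics.mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, new_centers


def augment_with_clusters(values, k=2):
    """Return one (cluster index, 1-D Mahalanobis distance, measurement) per instant."""
    labels, centers = kmeans_1d(values, k)
    spreads = []
    for c in range(k):
        members = [v for v, lab in zip(values, labels) if lab == c]
        std = statistics.pstdev(members) if len(members) > 1 else 0.0
        # Fall back to 1.0 so a degenerate cluster does not divide by zero.
        spreads.append(std if std > 0 else 1.0)
    return [
        (lab, abs(v - centers[lab]) / spreads[lab], v)
        for v, lab in zip(values, labels)
    ]


# Made-up series that sits near 1, jumps to about 5, then returns.
series = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1, 1.0, 0.8]
augmented = augment_with_clusters(series, k=2)
```

The time order of `augmented` is the same as that of `series`, so the temporal information is kept while each instant also carries its cluster index and its distance from the cluster center.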

Algorithms

I have not done much with time-series as a formalism, so this is all hands-on. When I have temporal data and I want to cluster, I just use the time-measurement as another measurement.

This means that if you had one measurement per instant, you have a 2-D walk, of sorts, in which time is strictly increasing. You can throw away the time and cluster on the measurement only (here is a link to lots of appropriate approaches: AutonLab), or you can look at both. You can also transform to lagged coordinates or time-difference coordinates and think in terms of velocities, accelerations, and so on. The classic drunkard's walk is 2-D random and is a diffusion process. Being able to cast your data as such a walk opens up those analysis tools for use (link, link, link, link). Diffusion is studied in many disciplines, including genetics, mathematics, materials science, epidemiology, and computer science.
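The two transformations mentioned above can be sketched in a few lines. This is only an illustration on made-up data, assuming evenly spaced samples; the helper names `lagged_coordinates` and `differences` are hypothetical.

```python
def lagged_coordinates(values, lags=2):
    """Row t holds (x[t], x[t-1], ..., x[t-lags]); the first `lags` samples are dropped."""
    return [
        tuple(values[t - j] for j in range(lags + 1))
        for t in range(lags, len(values))
    ]


def differences(values, dt=1.0):
    """First differences divided by the sampling interval: a velocity-like series."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


# Made-up, evenly sampled measurements (x = t squared).
series = [0.0, 1.0, 4.0, 9.0, 16.0]
velocity = differences(series)         # [1.0, 3.0, 5.0, 7.0]
acceleration = differences(velocity)   # [2.0, 2.0, 2.0]
embedded = lagged_coordinates(series)  # [(4.0, 1.0, 0.0), (9.0, 4.0, 1.0), (16.0, 9.0, 4.0)]
```

Each row of `embedded`, or each (measurement, velocity, acceleration) triple, can then be fed to a clustering method as an ordinary multi-dimensional point.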

There is no perfect "pepsi", no silver bullet that solves all problems with ease. There are many good "pepsis", tools in the toolbox of which some will outperform others in particular areas. K-means, Gaussian mixtures, radial basis function neural networks, support vector machines, even Q-learning lookups: any of these might be of use to you.

Without a clearer description of the nature of the data and of what you are looking to cluster, it is harder to say which tool to use. If I don't know whether it is a nail or a bolt, I can't say "try to use a wrench" or "try to use a hammer". I hope that you find a tool that works for you.

Best of Luck.

score 2 · Answer 2 · answered Oct 26 '14 at 02:28

I think that, in general, the difference is that clustering does not imply any prior knowledge about the groups, whereas segmentation in many cases does imply such knowledge, including their number and names (it is often used in business, for example in customer segmentation).

Aleksandr Blekh

Automated image segmentation - not about prior knowledge. – EngrStudent Oct 27 '14 at 12:03

@EngrStudent: That's why I said "in many cases", not "always". – Aleksandr Blekh Oct 27 '14 at 23:55

kjetil b halvorsen · Accepted Answer · 2020-12-08T14:23:37.880

What is the difference between segmenting and clustering?

First, let us define the two terms:

Segmentation: the partitioning of some whole, some object, into parts based on similarity and contiguity. See Wikipedia, which gives as an example Segmentation (biology), the division of body plans into a series of repetitive segments, and also Oxford.
Clustering: Wikipedia says it is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters).

The two are, in some sense, closely related. If we consider some whole ABC as consisting of many atoms, like a market consisting of customers, or a body consisting of body parts, we can say that we segment ABC but cluster the atoms. But it seems that segmentation is used more when there is some concept of (spatial) contiguity of the atoms within the whole.

There seems to be some confusion about this usage. On this site "customer segmentation" is often used; it should be "market segmentation". The customers are not segmented (hopefully!), they are clustered. Wikipedia got it right.

Use in connection with time series: with multiple (parallel) time series, we can cluster the series into groups of similar series, while segmentation typically refers to partitioning a single series into similar, contiguous parts. See the tag timeseries-segmentation and this list of posts about time series clustering. That points to a connection with change-point-detection (see Wikipedia).
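One hedged way to see the contiguity requirement in code: suppose each sample of a single series has already been given a cluster label by some method (the labels below are made up). Merging runs of consecutive equal labels gives contiguous segments, and the segment starts are candidate change points. The function name `labels_to_segments` is hypothetical.

```python
from itertools import groupby


def labels_to_segments(labels):
    """Merge runs of equal consecutive labels into (start, end, label), end exclusive."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((start, start + length, label))
        start += length
    return segments


# Made-up per-sample cluster labels for one series, in time order.
labels = [0, 0, 0, 1, 1, 0, 0, 2, 2, 2]
segments = labels_to_segments(labels)
# Every segment start except the first marks a change of label.
change_points = [start for start, _, _ in segments[1:]]
```

Here cluster 0 appears in two separate segments: a cluster collects similar samples wherever they occur in time, whereas a segment is always one unbroken stretch of the series.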

On this site there are many posts on image-segmentation.
