I have a multivariate dataset with some linear positive variables and some circular variables (ranging $0$ to $2\pi$). I was planning on clustering the samples using spectral clustering, but to do that I have to first generate a similarity matrix. I have found a couple of papers on clustering pure circular data but almost nothing for mixed datasets.

Is there a standard distance metric for use with mixed linear/circular data?

  • It might help to know what these variables represent. Often the circumstances suggest appropriate similarity measures. – whuber Nov 28 '16 at 16:36
  • They are a series of measures of environmental variables. Some of the linear data are things like distance to woodland, etc. The circular data represent phases from the Fourier transform of various satellite-derived land-surface measures, so they are related to the temporal offsets of various bi- and tri-annual changes. – David Pascall Nov 28 '16 at 16:42

1 Answer

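The canonical approach would be to project your data onto the unit circle.

I.e. replace each angle $x$ with the pair $(\sin x, \cos x)$, and it becomes "linear": the Euclidean distance between two such pairs is the chord length $2\left|\sin\left(\tfrac{x_1 - x_2}{2}\right)\right|$, so angles just above $0$ and just below $2\pi$ come out close together, as they should.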
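As a rough illustration of the (sin, cos) idea, here is a minimal Python sketch using only the standard library. The helper names `embed` and `rbf_similarity`, the sample values, and the choice of a Gaussian (RBF) kernel with `gamma=1.0` are all assumptions made for the example, not something prescribed by the question. It also assumes the linear variables have already been rescaled to a range comparable to the sin/cos pairs, which lie in $[-1, 1]$.

```python
import math

def embed(linear, angles):
    """Build a plain feature vector for one sample: linear values are kept
    as they are, each angle (in radians) becomes the pair (sin, cos)."""
    features = list(linear)
    for a in angles:
        features.extend((math.sin(a), math.cos(a)))
    return features

def rbf_similarity(samples, gamma=1.0):
    """Gaussian (RBF) similarity matrix from squared Euclidean distances.
    The kernel and gamma are illustrative choices, not a recommendation."""
    n = len(samples)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((p - q) ** 2 for p, q in zip(samples[i], samples[j]))
            sim[i][j] = math.exp(-gamma * d2)
    return sim

# Hypothetical samples: (pre-scaled linear values, angles in radians).
raw = [
    ([0.2], [0.05]),
    ([0.2], [2 * math.pi - 0.05]),  # just below 2*pi, so close to the first
    ([0.2], [math.pi]),             # opposite side of the circle
]

X = [embed(lin, ang) for lin, ang in raw]
S = rbf_similarity(X)
```

Here `S[0][1]` is larger than `S[0][2]`, because the first two angles are neighbours across the $0$/$2\pi$ boundary. A matrix like `S` is the kind of precomputed affinity that spectral clustering implementations typically accept. How strongly the circular variables count relative to the linear ones depends on how the linear variables are scaled, so that step deserves some thought for real data.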

Has QUIT--Anony-Mousse