6

I am used to working with PCA, tSNE, LLEs... They all do a great job projecting the data on a plane (or on linear subspaces of $\mathbb{R}^n$). Is there any other embedding technique that projects the data on non-linear spaces ? Like a sphere, or any other manifold ?

The aim would be to specify a space and a distance (the example will be a sphere and the geodesic distance) to project onto.

I am looking for any reference or open source project!

Per example, calling $x_i$ the elements of the data set and $d$ the initial distance, $d_g$ the geodesic distance on the sphere, an "MDS-like" formulation could be :

$$ min_{y} \sum_{i,j}||d(x_i,x_j)-d_g(y_i,y_j)||^2 $$

Is there any other way than brute force to solve this problem ?

RUser4512
  • 9,226
  • 5
  • 29
  • 59
  • What sort of data are you looking to project? If you have longitudes and latitudes (or azimuthal and zenith angles), then you're done; if you have 3-vectors, just normalize them. – J. M. is not a statistician Jul 07 '16 at 15:54
  • Can you say anything about the input data you're looking to project (e.g. does it lie on some particular kind of manifold)? Also, what properties do you want the projections to have? It's easy to project points onto an n-sphere, but you may not like the results in all cases. – user20160 Jul 07 '16 at 16:21
  • No I do not make any assumption on the input data, I am looking for a general purpose tool – RUser4512 Jul 07 '16 at 16:42
  • *Any* projection into linear subspaces of $\mathbb{R}^n$ may be considered a projection onto any manifold for which those subspaces might be charts. This trivial mathematical solution to your question shows that it needs more information to have a *statistical* answer; namely, *what would the purpose of this projection be?* – whuber Jul 07 '16 at 16:53
  • @whuber I added some information in the question – RUser4512 Jul 07 '16 at 16:56
  • Can you give a 2 or 3 dimensional example of data, and show your manifold decomposition in it? – Aksakal Jul 07 '16 at 17:26
  • 1
    Two CMU statisticians, Xie and Xing, have a paper on *Cauchy PCA* that extends the method from gaussian to extreme valued distributions. Their discussion does not include the geometry underlying these extensions, but further research could develop that. http://www.cs.cmu.edu/~pengtaox/papers/cpca.pdf – Mike Hunter Jul 07 '16 at 18:27

3 Answers3

6

It might be possible to solve your example problem using a procedure similar to nonclassical metric MDS (using the stress criterion). Initialize the 'projected points' to lie on a sphere (more on this later). Then, use an optimization solver to find the projected points that minimize the objective function. There are a few differences compared to ordinary MDS. 1) Geodesic distances must be computed between projected points, rather than Euclidean distances. This is easy when the manifold is a sphere. 2) The optimization must obey the constraint that the projected points lie on a sphere.

Fortunately, there are existing, open source solvers for performing optimization where the parameters are constrained to lie on a particular manifold. Since the parameters here are the projected points themselves, this is exactly what's needed. Manopt is a package for Matlab, and Pymanopt is for Python. They include spheres as one of the supported manifolds, and others are available.

The quality of the final result will depend on the initialization. This is also the case for ordinary, nonclassical MDS, where a good initial configuration is often obtained using classical MDS (which can be solved efficiently as an eigenvalue problem). For 'spherical MDS', you could take the following approach for initialization. Perform ordinary MDS, isomap, or some other nonlinear dimensionality reduction technique to obtain coordinates in a Euclidean space. Then, map the resulting points onto the surface of a sphere using a suitable projection. For example, to project onto a 3-sphere, first perform ordinary dimensionality reduction to 2d. Map the resulting points onto a 3-sphere using something like a stereographic projection. If the original data lies on some manifold that's topologically equivalent to a sphere, then it might be more appropriate to perform initial dimensionality reduction to 3d (or do nothing if they're already in 3d), then normalize the vectors to pull them onto a sphere. Finally, run the optimization. As with ordinary, nonclassical MDS, multiple runs can be performed using different initial conditions, then the best result selected.

It should be possible to generalize to other manifolds, and to other objective functions. For example, we could imagine converting the objective functions of other nonlinear dimensionality reduction algorithms to work on spheres or other manifolds.

user20160
  • 29,014
  • 3
  • 60
  • 99
  • Are there any concerns in this case of Manopt finding a non-global local optimum, or a critical point which is not even a local optimum? Is there a possibility that the manifold optimization solution projection could be a global optimum for some points, and not so for others, thereby introducing an inconsistency across points? – Mark L. Stone Jul 07 '16 at 20:28
  • Nonclassical MDS w/ stress criterion does have the possibility of finding non-global local minima, and this behavior depends on the target distances. I've seen scattered mention of saddle points, but can't say much about it. I don't imagine that constraining the parameters to a manifold would alleviate the situation. That's the motivation for seeking a good initialization (and multiple restarts, although more sophisticated alternatives exist). But, MDS does seem to work in practice despite this, so that gives hope for the sphere/manifold case. – user20160 Jul 07 '16 at 21:12
  • Not sure I understand the question about inconsistency across points – user20160 Jul 07 '16 at 21:13
  • O.k., I withdraw that part of the comment. – Mark L. Stone Jul 07 '16 at 21:44
1

Maybe something like that on R ?

http://planspace.org/2013/02/03/pca-3d-visualization-and-clustering-in-r/

The PCA project on a plane if you only take the first two axes. If you take the 3rd one you have the projection on sphere.

el Josso
  • 392
  • 1
  • 14
1

Relational Perspective Map (http://www.visumap.net/index.aspx?p=Resources/RpmOverview) might be something interesting for you. RPM has been originally purposed for torus surface; then extended to other low dimensional manifolds like 3D sphere and projective plane. A key to design MDS on closed manifold is that the "distance" metric has to be extended to a kind of multi-path-aggregation. The VisuMap software package also provides animation/nevigation to visualize data in spaces like 3D sphere.

James LI
  • 446
  • 2
  • 5