34

As far as I can tell, Kohonen-style SOMs peaked around 2005 and have fallen out of favor since. I haven't found any paper that says SOMs have been subsumed by another method, or proven equivalent to something else (at higher dimensions, anyhow). But it seems like t-SNE and other methods get a lot more ink nowadays, for example in Wikipedia or in scikit-learn, and SOM is mentioned more as a historical method.

(Actually, a Wikipedia article seems to indicate that SOMs continue to have certain advantages over competitors, but it's also the shortest entry in the list. EDIT: Per gung's request, one of the articles I'm thinking of is Nonlinear Dimensionality Reduction. Note that SOM has less written about it than the other methods. I can't find the article that mentioned an advantage that SOMs seem to retain over most other methods.)

Any insights? Someone else asked why SOMs are not being used and got only references from a while ago, and I have found proceedings from SOM conferences, but I was wondering whether the rise of SVMs, t-SNE, et al. has simply eclipsed SOMs in popular machine learning.

EDIT 2: By pure coincidence, I was reading a 2008 survey on nonlinear dimensionality reduction this evening, and as examples it mentions only Isomap (2000), locally linear embedding (LLE) (2000), Hessian LLE (2003), Laplacian eigenmaps (2003), and semidefinite embedding (SDE) (2004).

Wayne (question later edited by user91213)
    Can you link to any of the resources you are referring to? (Eg, which Wikipedia article "seems to indicate..."?) – gung - Reinstate Monica Oct 19 '15 at 02:38
    They seem to have fallen out of favor to an extent that I do not know what SOM refers to. – Matthew Drury Oct 19 '15 at 02:54
    apparently, self-organizing map – Christoph Hanck Oct 19 '15 at 07:00
  • SOM is just a variant of multidimensional scaling (MDS) which is much older. – kjetil b halvorsen Oct 20 '15 at 07:37
  • @kjetilbhalvorsen: Do you have any references about SOM and MDS? As I understand it, MDS is global in nature (related to PCA), while SOM is local in nature. Or maybe I misunderstand them. – Wayne Oct 20 '15 at 13:29
  • @Wayne: Venables & Ripley, MASS (4th edition), treats MDS and Kohonen's SOM under the same heading of "distance methods" (pp. 305–310) and writes (p. 310) "Kohonen describes his own motivation as '...' (left out here)", which is the same aim as most variants of MDS. MDS can be local or global, linear or non-linear. – kjetil b halvorsen Oct 20 '15 at 13:35
  • @kjetilbhalvorsen: I've thought about it a bit, and SOMs are actually a subset of LVQs (learning vector quantization) and are probably more directly related to k-means than to MDS, since SOM works with centroids in the high-dimensional space. The SOM distinction is that it imposes a geometric constraint on the centroids via the 2D neuron graph (see the sketch after these comments). – Wayne Oct 22 '15 at 15:17
  • I'm currently studying Computer Science at university, and SOMs are part of the Neural Networks course. We learn Kohonen's maps as the unsupervised neural network model, alongside other techniques. So I guess they are still used, at least for academic purposes. – davidivad Dec 08 '15 at 11:53
  • They are still used by some people in astronomy https://iopscience.iop.org/article/10.1088/0004-637X/813/1/53 – usernumber Apr 22 '20 at 12:36
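
To make the k-means analogy in the comments above concrete, here is a minimal NumPy sketch of Kohonen's training rule. Every name and parameter value here is illustrative rather than taken from any of the references mentioned; the point is only that the update is a k-means-style pull of centroids toward samples, weighted by a neighborhood function measured on a fixed 2-D grid rather than in the input space:

    import numpy as np

    def train_som(data, grid_w=10, grid_h=10, n_iter=1000,
                  lr0=0.5, sigma0=3.0, seed=0):
        """Minimal Kohonen SOM: centroids ('neurons') live in input space
        but are tied together by a fixed 2-D grid topology."""
        rng = np.random.default_rng(seed)
        n, d = data.shape
        # One weight vector per grid node, initialized from random samples.
        weights = data[rng.integers(0, n, grid_w * grid_h)].astype(float)
        # Fixed 2-D grid coordinates of each node (the topological constraint).
        gy, gx = np.mgrid[0:grid_h, 0:grid_w]
        grid = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)

        for t in range(n_iter):
            x = data[rng.integers(0, n)]                        # random sample
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))   # best-matching unit
            # Decaying learning rate and neighborhood radius.
            lr = lr0 * np.exp(-t / n_iter)
            sigma = sigma0 * np.exp(-t / n_iter)
            # Neighborhood is measured on the 2-D grid, not in input space:
            # this is what distinguishes SOM from plain k-means.
            g_dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-g_dist2 / (2 * sigma ** 2))             # Gaussian neighborhood
            weights += lr * h[:, None] * (x - weights)
        return weights.reshape(grid_h, grid_w, d)

On standardized data, neighboring grid cells end up holding similar weight vectors, which is the "topology preservation" the comments refer to; with the neighborhood term h removed, the loop degenerates to online k-means.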

3 Answers

18

I think you are on to something by noting the influence of what the machine learning community currently touts as the 'best' algorithms for dimensionality reduction. While t-SNE has shown its efficacy in competitions such as the Merck Viz Challenge, I personally have had success implementing SOMs for both feature extraction and binary classification. While there are certainly some who dismiss SOMs with no justification beyond the algorithm's age (check out this discussion), a number of articles published within the last few years have implemented SOMs and achieved positive results (see Mortazavi et al., 2013; Frenkel et al., 2013, for instance). A Google Scholar search will reveal that SOMs are still utilized in a number of application domains.

As a general rule, however, the best algorithm for a particular task is exactly that: the best algorithm for that particular task. Where a random forest may work well for one binary classification task, it may perform horribly on another. The same applies to clustering, regression, and optimization tasks. This phenomenon is tied to the No Free Lunch theorem, but that is a topic for another discussion. In sum, if SOM works best for you on a particular task, that is the algorithm you should use for that task, regardless of what's popular.
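
The answer doesn't say how its SOM-based feature extraction and classification were set up, so the following is only a plausible sketch, not the author's actual pipeline. It assumes the third-party MiniSom package (pip install minisom) alongside scikit-learn, and uses the deliberately crude choice of representing each sample by the grid coordinates of its best-matching unit before fitting a logistic regression:

    import numpy as np
    from minisom import MiniSom
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # SOMs are distance-based: standardize
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # An 8x8 map; sigma and learning_rate here are arbitrary illustrative values.
    som = MiniSom(8, 8, X.shape[1], sigma=1.5, learning_rate=0.5, random_seed=0)
    som.train_random(X_tr, 5000)

    def bmu_features(X):
        # Represent each sample by its best-matching unit's grid coordinates.
        return np.array([som.winner(x) for x in X], dtype=float)

    clf = LogisticRegression(max_iter=1000).fit(bmu_features(X_tr), y_tr)
    print("test accuracy:", clf.score(bmu_features(X_te), y_te))

Richer features are possible (e.g., quantization distances to every unit), but even this toy setup shows the division of labor: the SOM does unsupervised structure discovery, and the classifier then works in the resulting low-dimensional space.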

Dirigo
5

I have done research comparing SOMs with t-SNE and other methods, and have also proposed an improvement on SOM that takes it to a new level of efficiency. Please check it out via the links below and let me know your feedback. I would love to get some idea of what people think about it, and whether it is worth publishing in Python for people to use.

IEEE link to paper: http://ieeexplore.ieee.org/document/6178802/

MATLAB implementation: https://www.mathworks.com/matlabcentral/fileexchange/35538-cluster-reinforcement--cr--phase

Thanks for your feedback.

Narine Hall
2

My subjective view is that SOMs are less well known and perceived as less 'sexy' than many other methods, but they are still highly relevant for certain classes of problems. They might well make a significant contribution if they were more widely used. In particular, they are invaluable in the early stages of exploratory data science for getting a feel for the 'landscape' or 'topology' of multivariate data.

The development of libraries such as Somoclu, and research such as that by Guénaël Cabanes (among many others), shows that SOMs are still relevant.
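
As a sense of what that exploratory "landscape" use looks like, here is a minimal sketch using Somoclu's Python bindings (assuming the package is installed, e.g. via pip install somoclu). The data and parameter choices are made up purely for illustration; consult Somoclu's documentation for the actual defaults:

    import numpy as np
    import somoclu

    # Toy data: two well-separated blobs, so the map has a 'landscape' to show.
    rng = np.random.default_rng(0)
    data = np.vstack([rng.normal(0.0, 1.0, (100, 5)),
                      rng.normal(4.0, 1.0, (100, 5))]).astype(np.float32)

    som = somoclu.Somoclu(n_columns=20, n_rows=15)   # a 20x15 map
    som.train(data=data, epochs=10)

    # The U-matrix shows distances between neighboring map units:
    # 'valleys' correspond to clusters, 'ridges' to boundaries between them.
    som.view_umatrix(bestmatches=True)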

Matt Wenham