10

I don't understand why reduction in dimension is important. What is the benefit of taking some data and reducing their dimension?

whuber
  • 281,159
  • 54
  • 637
  • 1,101
  • 3
    The tone of the question does not invite constructive answers. Please consider rewording your question. – Sasha Dec 09 '11 at 19:03
  • 2
    The point may be to reduce the volume of data needed to store certain information at the expense of a slight loss of accuracy (e.g. JPEG image compression). – Sasha Dec 09 '11 at 19:04
  • 2
    Thank you for your comments, @Sasha. It's a reasonable question, so I made a minor edit to avoid the impression of bluntness (surely unintended) conveyed by the original wording. – whuber Jan 24 '12 at 21:10
  • See https://stats.stackexchange.com/questions/177102/what-is-the-intuition-behind-svd/179042#179042 for an example! – kjetil b halvorsen Dec 21 '17 at 01:09
  • You do SVD for topic modelling that is NOT probabilistic. For topic modelling that is probabilistic use LDA. If you are NOT doing topic modelling then use PCA. – Brad May 18 '18 at 12:09

3 Answers

18

Singular value decomposition (SVD) is not the same as reducing the dimensionality of the data. It is a method of decomposing a matrix into a product of other matrices, and it has lots of wonderful properties which I won't go into here. For more on SVD, see the Wikipedia page.
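
To make this concrete, here is a minimal NumPy sketch (the matrix is made up, not taken from the answer) showing the shape of the decomposition and that it reconstructs the original matrix exactly:

    import numpy as np

    # A small made-up data matrix (5 observations, 3 variables).
    A = np.array([[2.0, 0.0, 1.0],
                  [1.0, 3.0, 0.0],
                  [0.0, 1.0, 4.0],
                  [3.0, 2.0, 1.0],
                  [1.0, 1.0, 1.0]])

    # Thin SVD: A = U @ diag(s) @ Vt, with orthonormal columns in U and V
    # and non-negative singular values s sorted from largest to smallest.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    print(U.shape, s.shape, Vt.shape)           # (5, 3) (3,) (3, 3)
    print(np.allclose(A, U @ np.diag(s) @ Vt))  # True: exact reconstruction

Because the singular values are sorted, dropping the smallest ones is the natural way to turn the decomposition into a dimension-reduction (or compression) tool.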

Reducing the dimensionality of your data is sometimes very useful. It may be that you have a lot more variables than observations; this is not uncommon in genomic work. It may be that we have several variables that are very highly correlated, e.g., when they are heavily influenced by a small number of underlying factors, and we wish to recover some approximation to the underlying factors. Dimensionality-reducing techniques such as principal component analysis, multidimensional scaling, and canonical variate analysis give us insights into the relationships between observations and/or variables that we might not be able to get any other way.
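
As a rough illustration of the "few underlying factors" case, here is a sketch that assumes scikit-learn is available and uses simulated data (not data from any real study): ten highly correlated variables are generated from two latent factors, and two principal components recover essentially all of the variance.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)

    # Simulate 200 observations of 10 variables that are all driven by
    # just 2 underlying factors, plus a little noise.
    factors = rng.normal(size=(200, 2))    # latent factors
    loadings = rng.normal(size=(2, 10))    # how each variable depends on them
    X = factors @ loadings + 0.1 * rng.normal(size=(200, 10))

    # PCA (computed internally via the SVD of the centered data) shows that
    # two components capture almost all of the variance.
    pca = PCA(n_components=2).fit(X)
    scores = pca.transform(X)              # 200 x 2 reduced representation

    print(pca.explained_variance_ratio_.sum())  # close to 1.0
    print(scores.shape)                         # (200, 2)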

A concrete example: some years ago I was analyzing an employee satisfaction survey that had over 100 questions on it. Well, no manager is ever going to be able to look at 100+ questions' worth of answers, even summarized, and do more than guess at what it all means, because who can tell how the answers are related and what is driving them, really? I performed a factor analysis on the data, for which I had over 10,000 observations, and came up with five very clear and readily interpretable factors which could be used to develop manager-specific scores (one for each factor) that would summarize the entirety of the 100+ question survey. A much better solution than the Excel spreadsheet dump that had been the prior method of reporting results!
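
For anyone who wants to try something along those lines, here is a very rough sketch using scikit-learn's FactorAnalysis on placeholder data; the random survey matrix, the varimax rotation, and the choice of five factors are illustrative assumptions on my part, not the actual pipeline from the story above.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(1)

    # Hypothetical stand-in for the survey: 10,000 respondents x 100 questions
    # answered on a 1-5 scale (the real data are of course not reproduced here).
    responses = rng.integers(1, 6, size=(10_000, 100)).astype(float)

    # Fit a 5-factor model; a varimax rotation (available in newer scikit-learn
    # versions) often makes the factors easier to interpret.
    fa = FactorAnalysis(n_components=5, rotation="varimax", random_state=0)
    factor_scores = fa.fit_transform(responses)   # shape (10000, 5)

    # Per-manager scores would then simply be averages of these factor scores
    # over each manager's respondents.
    print(factor_scores.shape)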

jbowman
  • 31,550
  • 8
  • 54
  • 107
5

Regarding the second part of your question, the benefits of dimensionality reduction for a data set may include:

  • reduce the storage space needed (the sketch after this list illustrates the saving)
  • speed up computation (for example in machine learning algorithms): fewer dimensions mean less computing, and fewer dimensions can also allow the use of algorithms that are unsuitable for a large number of dimensions
  • remove redundant features, for example there is no point in storing a terrain's area in both square meters and square miles (perhaps the data gathering was flawed)
  • reduce the data to 2D or 3D so that we can plot and visualize it, perhaps observe patterns, and gain insights
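
As a minimal sketch of the storage point (using NumPy, on a simulated matrix that is approximately low rank, which is the favourable case): keeping only the top k singular values and vectors stores k(m + n + 1) numbers instead of m * n, at the cost of a small approximation error.

    import numpy as np

    rng = np.random.default_rng(2)
    m, n, k = 1000, 500, 20

    # A matrix that is roughly rank-k plus a little noise, as real data often is.
    A = (rng.normal(size=(m, k)) @ rng.normal(size=(k, n))
         + 0.01 * rng.normal(size=(m, n)))

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Keep only the k largest singular values and the matching vectors.
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    full_storage = m * n                # numbers stored for the raw matrix
    reduced_storage = k * (m + n + 1)   # numbers stored for U_k, s_k, Vt_k
    rel_error = np.linalg.norm(A - A_k) / np.linalg.norm(A)

    print(full_storage, reduced_storage)  # 500000 vs 30020
    print(rel_error)                      # small: most of the signal is kept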

Beyond that, and beyond PCA, the SVD has many applications in signal processing, NLP, and many other fields.

clyfe
  • 790
  • 7
  • 8
2

Take a look at this answer of mine. The singular value decomposition is a key component of principal components analysis, which is a very useful and very powerful data analysis technique.
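
A small sketch of that connection, assuming scikit-learn and simulated data: the principal axes that PCA reports coincide (up to sign) with the right singular vectors of the column-centered data, and the explained variances are the squared singular values divided by n - 1.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(3)
    X = rng.normal(size=(100, 6))

    # PCA via scikit-learn.
    pca = PCA().fit(X)

    # PCA "by hand": SVD of the column-centered data matrix.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

    # The principal axes agree with the right singular vectors up to sign...
    print(np.allclose(np.abs(pca.components_), np.abs(Vt)))              # True

    # ...and the explained variances are s**2 / (n_samples - 1).
    print(np.allclose(pca.explained_variance_, s**2 / (X.shape[0] - 1)))  # True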

It is often used in facial recognition algorithms, and I make frequent use of it in my day job as a hedge fund analyst.

Chris Taylor
  • 3,432
  • 1
  • 25
  • 29