Generate data for dimension reduction algorithm testing

Question

I would like to generate synthetic data for testing dimension reduction algorithms. Specifically, I want a matrix with, for example, 1000 rows (points) and 50 columns (features) whose intrinsic dimension is only 10, so that dimension reduction should recover 10 features. How can I generate such a matrix? I would prefer Python code, but any advice is welcome.

I found that I can multiply a random matrix (1000 rows, 10 columns) by the transposed first matrix from an SVD, truncated to 10 rows and 50 columns, but I do not understand why this works.
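To make the idea concrete, here is a minimal sketch of the multiplication described above, assuming NumPy is available. The names `Z`, `W`, and `X` are illustrative only. Every row of `X` is a linear combination of the 10 rows of `W`, so `X` has 50 columns but rank at most 10 (and rank exactly 10 with probability 1 for Gaussian entries).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for reproducibility
n_samples, n_features, rank = 1000, 50, 10

# Latent coordinates: each point really lives in a 10-dimensional space.
Z = rng.standard_normal((n_samples, rank))

# Mixing matrix: maps the 10 latent coordinates into 50 observed features.
W = rng.standard_normal((rank, n_features))

# Observed data: 1000 x 50, but every row lies in the row space of W.
X = Z @ W

print(X.shape)
print(np.linalg.matrix_rank(X))
```

In this sketch any full-rank 10 x 50 matrix can play the role of `W`; nothing here requires an SVD.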

might be related: http://stats.stackexchange.com/questions/220654/toy-example-dataset-for-testing-pca-implementation — jeff, Jul 06 '16 at 20:01
This is elementary linear algebra, as explained at http://stats.stackexchange.com/questions/60622, for instance. You don't need SVD. — whuber, Jul 06 '16 at 20:37
Could you please explain why SVD is used in the official documentation of scikit-learn (scikit-learn.org), the Python machine learning library? The example is at http://scikit-learn.org/stable/auto_examples/decomposition/plot_pca_vs_fa_model_selection.html#example-decomposition-plot-pca-vs-fa-model-selection-py, at the beginning of the code, in a short section called "Create the data". I think the links you proposed address a somewhat different problem. — Andrew_123, Jul 07 '16 at 11:04
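One possible reading of the SVD step, offered as a hedged sketch and not as a verbatim copy of the scikit-learn example: taking the left singular vectors of a random square matrix is a convenient way to obtain orthonormal directions. The names `basis`, `Z`, and `X` below are assumptions made for illustration. With an orthonormal basis, the embedding into 50 dimensions preserves lengths, so the latent variances carry over unchanged, which makes the planted structure easier to verify.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for reproducibility
n_samples, n_features, rank = 1000, 50, 10

# Left singular vectors of a random square matrix form an orthonormal basis.
U, _, _ = np.linalg.svd(rng.standard_normal((n_features, n_features)))

# Keep the first 10 directions: a 50 x 10 matrix with orthonormal columns.
basis = U[:, :rank]

# Latent coordinates, then embed them in 50 dimensions (10 x 50 after transpose).
Z = rng.standard_normal((n_samples, rank))
X = Z @ basis.T

# Orthonormal columns: basis.T @ basis is the 10 x 10 identity.
print(np.allclose(basis.T @ basis, np.eye(rank)))
print(np.linalg.matrix_rank(X))
```

The rank of `X` would be 10 with any full-rank mixing matrix; the orthonormal choice only adds the property that row norms of `X` equal row norms of `Z`.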


0 Answers