I would like to generate synthetic data for testing dimension reduction algorithms. Specifically, I would like a matrix with, for example, 1000 rows (= points) and 50 columns (= features), whose intrinsic dimension is only 10, so that dimension reduction should recover just 10 features. How can I generate such a matrix? I prefer Python code, but any advice helps.

I found out that I can multiply a random matrix (with 1000 rows and 10 columns) by the transpose of the first matrix from an SVD (giving 10 rows and 50 columns), but I do not understand why this works.
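To make the construction concrete, here is a minimal sketch of how I understand it, using NumPy only. The seed and the variable names (`basis`, `latent`, `rank`) are my own choices for illustration and do not come from any particular library example. Each row of `X` is a linear combination of the same 10 rows of `basis`, so all 1000 points lie in a 10-dimensional subspace of the 50-dimensional feature space.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n_samples, n_features, rank = 1000, 50, 10

# SVD of a random square matrix yields an orthogonal matrix U (50 x 50).
U, _, _ = np.linalg.svd(rng.standard_normal((n_features, n_features)))

# Keep the first 10 columns and transpose: 10 orthonormal rows in R^50.
basis = U[:, :rank].T                              # shape (10, 50)

# 10 "true" coordinates per point.
latent = rng.standard_normal((n_samples, rank))    # shape (1000, 10)

# Embed the 10-dimensional points into 50 dimensions.
X = latent @ basis                                 # shape (1000, 50)

print(X.shape, np.linalg.matrix_rank(X))
```

In this sketch the SVD only serves to produce orthonormal directions, which keeps the scale of `latent` unchanged after embedding. Whether orthonormality matters for a given test is an assumption to check against the algorithm being tested.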

gung - Reinstate Monica
Andrew_123
  • might be related: http://stats.stackexchange.com/questions/220654/toy-example-dataset-for-testing-pca-implementation – jeff Jul 06 '16 at 20:01
  • This is elementary linear algebra, as explained at http://stats.stackexchange.com/questions/60622, for instance. You don't need SVD. – whuber Jul 06 '16 at 20:37
  • Could you please explain why an SVD is used in the official scikit-learn documentation (scikit-learn is a Python library for machine learning)? The example is at http://scikit-learn.org/stable/auto_examples/decomposition/plot_pca_vs_fa_model_selection.html#example-decomposition-plot-pca-vs-fa-model-selection-py, at the beginning of the code, in a short section called "Create the data". I think the links you proposed address a somewhat different problem. – Andrew_123 Jul 07 '16 at 11:04
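For readers following the comments, the point that no SVD is needed can be sketched as follows. This is an illustration under my own assumptions, not code from the linked pages: the mixing matrix `W`, the seed, and the `noise_scale` value are hypothetical. Any full-rank 10 x 50 matrix maps 10-dimensional points into a 10-dimensional subspace of the 50 features, and a small amount of added noise makes the data full rank while leaving a clear gap in the singular values after the tenth one.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility
n_samples, n_features, rank = 1000, 50, 10

# Any full-rank 10 x 50 mixing matrix works; no SVD is involved here.
W = rng.standard_normal((rank, n_features))
Z = rng.standard_normal((n_samples, rank))      # latent 10-dimensional points
X_clean = Z @ W                                 # exactly rank 10

# Optional: small isotropic noise (noise_scale is a hypothetical value).
noise_scale = 0.01
X_noisy = X_clean + noise_scale * rng.standard_normal((n_samples, n_features))

# Singular values of the centered data, as PCA would see them.
s = np.linalg.svd(X_noisy - X_noisy.mean(axis=0), compute_uv=False)
explained = s**2 / np.sum(s**2)

print(np.linalg.matrix_rank(X_clean))
print(explained[:rank].sum())
```

The difference from an SVD-based construction is that the rows of `W` are not orthonormal, so the embedded features are rescaled and correlated versions of the latent coordinates. The subspace dimension is still 10.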

0 Answers