I have an apparently simple question but, quite surprisingly, I can't find an answer online. The question is: is it possible to run PCA on proportions? Since I am not specifically trained in statistics or mathematics, it would be great if someone could give a simple answer alongside a mathematically grounded one. Thank you.
-
This is much discussed in the literature on _compositional data analysis_, on which new books appear almost yearly. (About half the experts in this field are based in Catalonia, curiously.) I am no expert but expect that the constraint that the proportions sum to 1 is key to any naive application to the original data. – Nick Cox Feb 27 '20 at 16:10
-
Thank you very much for this hint! Knowing the term "compositional data analysis" helps me refine my online search. I have already found (e.g. [here](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.387.1787&rep=rep1&type=pdf)) that "the correlation structure of compositional data is strongly biased", so PCA on the raw proportions is not appropriate and some kind of data transformation should be used. I should add that my original data are not proportions: I converted them to proportions because the absolute values would be misleading, since they come from populations of very different sizes. – kk68 Feb 27 '20 at 16:39
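One way to see where the bias mentioned above comes from, as a sketch that assumes only that each row $x = (x_1, \dots, x_m)$ sums to 1 (the symbols are introduced here for illustration): the covariance of any part with a constant is zero, so

$$
\sum_{j=1}^{m} \operatorname{Cov}(x_i, x_j) = \operatorname{Cov}\Big(x_i, \sum_{j=1}^{m} x_j\Big) = \operatorname{Cov}(x_i, 1) = 0
\quad\Longrightarrow\quad
\operatorname{Var}(x_i) = -\sum_{j \neq i} \operatorname{Cov}(x_i, x_j).
$$

Every part with nonzero variance must therefore be negatively correlated with at least one other part, whatever the underlying process, and a PCA of the raw proportions inherits that forced structure.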
-
Do your proportions add to 1? By the way, taking logarithms is often a better way to compare variables on quite different scales, naturally so long as all variables are positive. – Nick Cox Feb 27 '20 at 16:43
-
We should back up, because it is evident you have some underlying problem and that these "proportions" might not be an appropriate way to solve it. I would suggest replacing your question with one that describes your original problem. – whuber Feb 27 '20 at 16:44
-
Thank you. Yes @Nick, I have a matrix with n rows (cases) and m columns (variables), and each row adds up to 1. – kk68 Feb 27 '20 at 16:56
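For a matrix of this shape, one transformation commonly used in compositional data analysis is the centered log-ratio (clr), followed by ordinary PCA. The following is only a minimal sketch, not a recommendation for this particular data set: the function names `clr` and `pca`, the `pseudocount` parameter, and the simulated Dirichlet data are all assumptions made for illustration, and it requires every proportion to be strictly positive (zeros need separate handling).

```python
import numpy as np

def clr(X, pseudocount=0.0):
    """Centered log-ratio transform of an n x m matrix of positive parts.

    `pseudocount` is a hypothetical convenience for zero entries; with the
    default 0.0 every entry of X must be strictly positive.
    """
    X = np.asarray(X, dtype=float) + pseudocount
    X = X / X.sum(axis=1, keepdims=True)      # re-close rows so they sum to 1
    log_x = np.log(X)
    # Subtract each row's mean log, i.e. the log of its geometric mean.
    return log_x - log_x.mean(axis=1, keepdims=True)

def pca(Z, n_components=2):
    """Plain PCA via SVD of the column-centered matrix."""
    Zc = Z - Z.mean(axis=0)
    U, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)     # share of variance per component
    return scores, Vt[:n_components], explained[:n_components]

# Simulated example: 20 cases, 4 parts, each row sums to 1 (illustrative data only).
rng = np.random.default_rng(0)
X = rng.dirichlet(alpha=[2.0, 3.0, 5.0, 1.0], size=20)

Z = clr(X)
scores, loadings, explained = pca(Z, n_components=2)
```

Each row of the clr-transformed matrix sums to zero, so it has at most m - 1 independent dimensions; the scores can then be plotted or inspected like those of any other PCA.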
-
Thanks @whuber I'll follow your advice and open another more general question. I hope you can help me! Thanks – kk68 Feb 27 '20 at 16:57
-
I replaced the question with another which describes my problem. The new question is [here](https://stats.stackexchange.com/questions/451670/how-to-analyze-a-small-matrix-to-discover-associations-between-cases-and-variabl). Thank you – kk68 Feb 27 '20 at 17:18