I've already read the answers to this question, but they were written so long ago that I wonder if some progress has been made on the subject since then.

I would like to know whether there is a more objective way to choose between the correlation and the covariance matrix for PCA than «careful thought and some experience» or «similar scales».

Any help would be appreciated.

Edit: I asked this because there might be some measure, with a corresponding scale of interpretation, that indicates when to use a covariance or a correlation matrix. Statistics offers examples of this, such as Bayes factors and p-values, or at least an interpretative comparison like the one used with AIC or BIC, although these measures serve different objectives from the one I have in mind.

An old man in the sea.
  • PCA was invented in 1901. I think expecting progress on it over the course of the last two years (the most recent answers in that question) is unreasonable, and doesn't warrant a new question on this site. – Firebug May 27 '17 at 11:14
  • +1 to what Firebug wrote. I am tempted to vote to close as a duplicate of the thread you linked to. – amoeba May 27 '17 at 11:15
  • Applying "careful thought and some experience" is advice that will never go out of date. Indeed, although this advice might *sound* subjective, what could be more objective than careful consideration and application of all relevant information to a problem? – whuber May 27 '17 at 12:49
  • @Firebug the most recent answers are just 'isomorphic' to the oldest ones, from 2010. So expecting progress in 7 years doesn't sound so unreasonable to me. Even in two years much progress can be made, and PCA having been invented in 1901 is not relevant to the topic at hand, unless you explain why... An answer to a question can be found after one year. If I had asked for a proof of Fermat's Last Theorem in 1993, you would have said none was available, but if I had asked again in 1994 you would have directed me to Wiles's proof. And I really doubt this would be more difficult than Fermat's. – An old man in the sea. May 27 '17 at 18:52
  • @amoeba why not just say that no advancements have been made instead? – An old man in the sea. May 27 '17 at 18:53
  • The problem with your Q isn't that there have been no advancements (I have no idea about that), but that it's unclear and unmotivated. It's unclear what application of PCA you have in mind, and without that it's impossible to say what is a more or less appropriate course of action. And it's unmotivated because one can similarly repeat any other answered question from this forum and ask whether there perhaps have been any updates; IMHO one should have better grounds to revisit an existing thread. – amoeba May 27 '17 at 18:59
  • @whuber I agree with your first sentence. I asked this because there might be some measure, with a corresponding scale of interpretation, as seems to exist for Bayes factors and p-values, or at least an interpretative comparison like the one used with AIC or BIC... – An old man in the sea. May 27 '17 at 18:59
  • @amoeba would my edit increase the motivation of the question? Or does the question need an explicit model with respective data? – An old man in the sea. May 27 '17 at 19:05
  • What application do you have in mind? – amoeba May 27 '17 at 19:10
  • @amoeba this came to mind while I was doing an exercise with data on some companies. Some of the variables were sales (on the order of 10^2 billion) and profits (on the order of 10^1 billion). The solution stated that since the scales were 'very different' we should use the correlation matrix. But why are they 'so different'? This difference in magnitude doesn't seem to justify doing a PCA with the correlation matrix... – An old man in the sea. May 27 '17 at 19:35
  • Okay, so that's the data, but what would be the goal of doing PCA? – amoeba May 27 '17 at 22:38
  • @amoeba I'm not sure I understand your question. Isn't the goal of a PCA to summarize the sample variation by a few linear combinations of the original variables, and if possible to give some interpretation to these principal components? Or are you asking what the goal of this PCA is, as if it were just one step in a bigger analysis/project? – An old man in the sea. May 27 '17 at 22:52
  • Yes, I was asking if PCA was a step in a bigger analysis pipeline. If not, then PCA is an *exploratory* analysis. There cannot be any "objective" indicators in exploratory analysis; one is exploring the data and one can do it however one pleases. Everything that you mentioned in the post -- p-values, Bayes factors, AIC/BIC -- is a confirmatory tool. – amoeba May 28 '17 at 10:13
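
The sales/profits situation discussed in the comments can be sketched with synthetic data. This is only an illustration under stated assumptions: the means, spreads, seed, and the helper `pca_loadings` are made up here and are not the exercise's data or method. It compares the first principal component obtained from the covariance matrix with the one obtained from the correlation matrix, and then changes the unit of profits to show which of the two reacts.

```python
import numpy as np

# Synthetic stand-in for the exercise data (all numbers are assumptions).
rng = np.random.default_rng(0)
n = 200
sales = 300 + 60 * rng.standard_normal(n)                         # hundreds of billions
profits = 30 + 0.05 * (sales - 300) + 8 * rng.standard_normal(n)  # tens of billions
X = np.column_stack([sales, profits])


def pca_loadings(matrix):
    """Return (variance shares, eigenvectors) sorted by decreasing eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(matrix)  # eigh expects a symmetric matrix
    order = np.argsort(eigvals)[::-1]
    return eigvals[order] / eigvals.sum(), eigvecs[:, order]


cov_share, cov_vecs = pca_loadings(np.cov(X, rowvar=False))
corr_share, corr_vecs = pca_loadings(np.corrcoef(X, rowvar=False))

print("covariance PC1 loadings: ", np.round(cov_vecs[:, 0], 3))
print("correlation PC1 loadings:", np.round(corr_vecs[:, 0], 3))

# Expressing profits in a different unit (e.g. billions -> millions) changes
# the covariance-based PCA but leaves the correlation matrix untouched.
X_rescaled = X * np.array([1.0, 1000.0])
_, cov_vecs_rescaled = pca_loadings(np.cov(X_rescaled, rowvar=False))
print("covariance PC1 loadings after rescaling:", np.round(cov_vecs_rescaled[:, 0], 3))
```

In this sketch the covariance-based first component follows whichever variable has the larger variance in the chosen units, while the correlation-based one does not depend on units at all. With only two correlated variables, the correlation-based first component weights both equally, so the choice comes down to whether the units are meaningful for the analysis, which is the judgment call the comments describe.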

0 Answers