
If the input sample correlation matrix consists of low correlations (say below 0.3), do not perform factor analysis. There is not much to model!

I read this in a statistics textbook, and I don't understand what the "input sample correlation matrix" means.
How should it be explained?

WhiteGirl
  • The idea behind factor analysis is that when two (or more; I take two to keep it simple) variables are strongly correlated, this correlation is caused by the fact that both depend on the same underlying variable. So the correlation arises from some common underlying 'factor'. But when you have no correlation, it's not worth looking for a 'factor' that causes it, is it? (A small simulation illustrating this appears after these comments.) –  Jul 28 '17 at 13:16
  • Your title does not seem to correspond to the question you ask about the highlighted text. Which one do you need help with? – mdewey Jul 28 '17 at 13:18
  • See pt. 4 https://stats.stackexchange.com/a/198684/3277 – ttnphns Jul 28 '17 at 13:53
  • @ttnphns, thanks for your good answer. [This post](https://stats.stackexchange.com/questions/290589/what-kind-of-outlier-should-be-removed-from-factor-analysis/290609?noredirect=1#comment561343_290609) said there is no need to care about outliers. – WhiteGirl Jul 28 '17 at 14:29
  • @ttnphns, is `MSA > 0.5` equivalent to `correlation > 0.3`? – WhiteGirl Jul 28 '17 at 15:38
  • White, see https://stats.stackexchange.com/a/229267/3277 – ttnphns Jul 28 '17 at 15:44
  • @ttnphns, is Bartlett's sphericity test necessary before FA? – WhiteGirl Jul 29 '17 at 07:02
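
As an aside, here is a minimal simulation (my own sketch, not from the thread) of the "common factor" idea described in the first comment above: two observed variables that both depend on one latent factor end up correlated, while a variable with no shared factor does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

factor = rng.normal(size=n)                         # unobserved common factor
x1 = 0.8 * factor + rng.normal(scale=0.6, size=n)   # both manifest variables
x2 = 0.7 * factor + rng.normal(scale=0.7, size=n)   # load on the same factor
x3 = rng.normal(size=n)                             # no shared factor

print(np.corrcoef(x1, x2)[0, 1])  # substantial correlation (about 0.57 in expectation)
print(np.corrcoef(x1, x3)[0, 1])  # near zero: nothing for FA to model
```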

2 Answers


The FA would be based on the correlation matrix computed from your sample; this is what you "feed" into the analysis. As for the other part of your question, low correlation coefficients mean that your manifest variables are not related facets of an underlying concept, i.e. they do not share underlying latent variables. You could use other data-reduction methods, such as multidimensional scaling or clustering, to explore your data.
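
To make that concrete, here is a minimal sketch (my own, assuming your data sit in a numeric pandas DataFrame `df`; the DataFrame name and the helper function are hypothetical) that computes the sample correlation matrix FA would be fed and applies the 0.3 rule of thumb from the quoted textbook:

```python
import numpy as np
import pandas as pd

def worth_trying_fa(df: pd.DataFrame, threshold: float = 0.3) -> bool:
    """Compute the sample correlation matrix and apply the 0.3 rule of thumb."""
    corr = df.corr()                                     # the matrix FA is "fed"
    off_diag = corr.mask(np.eye(len(corr), dtype=bool))  # hide the 1s on the diagonal
    max_abs = off_diag.abs().max().max()                 # largest |r| between variables
    print(corr.round(2))
    print(f"largest off-diagonal |r| = {max_abs:.2f}")
    return max_abs >= threshold
```

If this returns `False`, the alternatives mentioned above (multidimensional scaling or clustering) may be a better fit than FA.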


It's a rule of thumb, and from the wording ("say below 0.3") you can see that 0.3 is not a hard threshold. They just threw a round number out there.

For instance, say you have 40 observations and the correlation coefficient is 0.3; then the significance test gives you a p-value of about 0.06. So for small samples, 0.3 is very weak evidence of correlation, which might have shown up by chance even though there is no actual correlation.
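
As a quick check on those numbers (my own sketch, not part of the original answer), the standard t-test for a Pearson correlation, with t = r·sqrt((n − 2)/(1 − r²)) on n − 2 degrees of freedom, reproduces the quoted p-value:

```python
from math import sqrt
from scipy import stats

r, n = 0.3, 40
t = r * sqrt((n - 2) / (1 - r**2))      # t-statistic for H0: rho = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)    # two-sided p-value
print(f"t = {t:.2f}, p = {p:.3f}")      # roughly t = 1.94, p = 0.060
```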

Aksakal
  • Should I remove the items whose correlation is < 0.3? – WhiteGirl Jul 28 '17 at 14:17
  • "Removing" can be tricky, as if you deal with multiple variables, the covariance matrix can become not positive definite, which may cause other issues depending on your modeling approach. – Aksakal Jul 28 '17 at 15:10
  • I've seen lots of FA examples in textbooks, and none of them check `correlation > 0.3` before FA. What's the reason? – WhiteGirl Jul 28 '17 at 15:39
  • As I wrote, it's just a rule of thumb; you don't have to follow it. Generally, textbooks pick examples where the model can be built, i.e. the correlations are high, etc. I rarely see textbooks discuss cases where the model can't be built. As your source says, if the correlations are low to nonexistent, there's very little that can be done with essentially correlation-based methods such as FA. – Aksakal Jul 28 '17 at 15:42