I'm running factor analysis on a survey data set. However many factors I extract, the % variance explained by each factor is very low: no factor ever explains more than 9% of the variance. In basic terms, this means that it's not possible to construct a factor, or principal component for that matter, that accounts for or "explains" very much of the variability in the data.

Beyond that, what implications might we draw about the data? Does this suggest very high variances and/or covariances in the data? Anything else?

(I'm not expecting that there is a single thing we can infer about the data given uniformly weak factors, but I'm looking for ideas so I can then explore the data.)

Dr. Beeblebrox

1 Answer

If you actually mean (as I suspect) 9% and not 0.09%, then that suggests about 20 variables.
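One way to read the "about 20 variables" remark, sketched here under the assumption that the analysis is run on the correlation matrix (standardized variables); the symbols $p$ and $\lambda_k$ are introduced for illustration and are not in the original answer. With $p$ standardized variables the total variance is $p$, so the share explained by the $k$-th component is its eigenvalue divided by $p$:

$$
\text{share}_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j} = \frac{\lambda_k}{p},
\qquad
\text{uncorrelated variables: } \lambda_k = 1 \;\Rightarrow\; \text{share}_k = \frac{1}{p}.
$$

For $p = 20$ the uncorrelated baseline is $1/20 = 5\%$, and a maximum share of 9% corresponds to $\lambda_1 = 0.09 \times 20 = 1.8$, which is not far above that baseline.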

The first thing I would look at is the correlation matrix.
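As a rough illustration of what to look for in the correlation matrix, the sketch below computes two summaries: the mean absolute off-diagonal correlation, and the largest eigenvalue divided by the number of variables (the share of variance of the first principal component). It uses only the Python standard library. The function names and the simulated data are hypothetical, and in practice a numerical library would normally do this work.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Build the p x p correlation matrix from a list of p columns."""
    p = len(columns)
    return [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(p)] for i in range(p)]


def mean_abs_offdiag(R):
    """Average absolute correlation over all distinct pairs of variables."""
    p = len(R)
    vals = [abs(R[i][j]) for i in range(p) for j in range(i + 1, p)]
    return sum(vals) / len(vals)


def top_eigen_share(R, iters=1000, seed=0):
    """Approximate (largest eigenvalue of R) / p using power iteration."""
    p = len(R)
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    lam = 0.0
    for _ in range(iters):
        w = [sum(R[i][j] * v[j] for j in range(p)) for i in range(p)]
        lam = math.sqrt(sum(x * x for x in w))  # ||R v|| with ||v|| = 1
        v = [x / lam for x in w]
    return lam / p


# Simulated example (not the survey data): 20 variables, 300 respondents.
rng = random.Random(42)
noise_only = [[rng.gauss(0, 1) for _ in range(300)] for _ in range(20)]
factor = [rng.gauss(0, 1) for _ in range(300)]
one_factor = [[0.7 * f + rng.gauss(0, 1) for f in factor] for _ in range(20)]

for label, cols in [("noise only", noise_only), ("one shared factor", one_factor)]:
    R = correlation_matrix(cols)
    print(label, round(mean_abs_offdiag(R), 3), round(top_eigen_share(R), 3))
```

If the off-diagonal correlations are nearly all close to zero, the first share will sit close to 1/p, and no choice of factor model can explain much variance.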

I would also check the variable type - surveys often have variables that are not continuous. This can cause problems for factor analysis - but there are ways to lessen that.
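A quick way to check variable types is to count the distinct values each item takes. The sketch below is a minimal illustration; the cut-off of 7 levels, the labels, and the column names are assumptions made for the example, not a standard rule.

```python
def classify_variable(values, max_ordinal_levels=7):
    """Label a column by how many distinct values it takes.

    The cut-off max_ordinal_levels is an arbitrary assumption for illustration.
    """
    levels = len(set(values))
    if levels <= 1:
        return "constant"
    if levels == 2:
        return "binary"
    if levels <= max_ordinal_levels:
        return "few levels (ordinal-like)"
    return "many levels (continuous-like)"


# Hypothetical survey columns, made up for this example.
survey = {
    "owns_car": [0, 1, 1, 0, 1, 0],
    "satisfaction_1to5": [1, 3, 5, 4, 2, 3],
    "income": [31.5, 42.0, 18.2, 77.9, 54.3, 25.0, 61.1, 39.8],
}

for name, column in survey.items():
    print(name, "->", classify_variable(column))
```

Items that come out as binary or ordinal-like are the ones for which ordinary (Pearson-based) factor analysis is most likely to cause problems.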

I would look at the distribution of each variable as well - if many are highly skewed, that can cause problems too.
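To screen for skewed items, one option is to compute the sample skewness of each variable and flag the large ones. This is a minimal sketch using the moment-based skewness coefficient; the threshold of 1.0 and the column names are assumptions for illustration only.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (values must not be constant)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def flag_skewed(columns, threshold=1.0):
    """Return {name: skewness} for columns whose |skewness| exceeds threshold.

    The default threshold is a rule-of-thumb assumption, not a formal test.
    """
    flagged = {}
    for name, column in columns.items():
        s = skewness(column)
        if abs(s) > threshold:
            flagged[name] = s
    return flagged


# Hypothetical items, made up for this example.
items = {
    "symmetric_item": [1, 2, 3, 4, 5],
    "right_skewed_item": [1, 1, 1, 1, 10],
}
print(flag_skewed(items))
```

Comparing the flagged items across the four countries may show whether the country with weak factors has more heavily skewed responses than the others.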

Peter Flom
  • In addition, mixing variables with quite different meaning is likely to create blocks of very weak correlations, making it difficult for any factor to capture a large share of the variability. – Nick Cox Dec 12 '13 at 16:47
  • Thank you. You're right that the variables are not continuous. Could you elaborate on ways to lessen the problems that causes? That cannot be the only cause, though, because we ran the same survey in 4 countries. In 3 of the 4 countries, the maximum % variance explained by a factor is more like 30%. Only in one country do we see the uniformly weak factors I asked about. – Dr. Beeblebrox Dec 12 '13 at 16:47
  • It depends on what the variables are. Many people do "regular" FA on ordinal variables, although it may cause problems. See [Joreskog & Moustaki](http://www.ssicentral.com/lisrel/techdocs/orfiml.pdf). Also see [this thread](http://stats.stackexchange.com/questions/11899/factor-analysis-on-mixed-continuous-ordinal-nominal-data) here on CrossValidated. – Peter Flom Dec 12 '13 at 16:51
  • @PeterFlom Any sense of what could be unique about this country's data, given that the problem doesn't arise in the other three countries? You mention skewness. Also, I've loaded the 20x20 correlation matrix, but I'm not sure what to look for. – Dr. Beeblebrox Dec 12 '13 at 17:08