Let's imagine a factor analysis on 20 different variables.

X1, X2 and X3 all load onto Factor 1.

X1 and X2 are highly correlated.

Is there a way to know whether the relationship between X1 and X2 comes from those variables themselves or is due to the shared latent factor? Or a way to figure out how much is due to one or the other?

Edit: Would it make sense to do a mediation analysis, to see if the relationship between X1 and X2 is mediated by Factor 1? Or to look at the correlation between X1 and X2 after partialling out Factor 1 (to see the X1-X2 partial relationship)?

ttnphns
Dave
  • @ttnphns I'm not sure how that answers my question. I may not have explained it well. I am curious whether I can tell how much of the shared variance between X1 and X2 can be explained by the latent factor they have in common. – Dave Aug 25 '21 at 22:43

1 Answer

By definition, a loading is the amount of variability shared by variables due to the common factor. 100% of the common variance of the pair of variables is explained by the common factor(s).

By definition, common factor analysis aims to explain 100% of the correlations by the common latent factor(s). FA assumes that pairwise partial correlations are negligible: either they are zero, or they are small and constitute those "rubbish" common factors which we do not bother to model.
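To make the "negligible partial correlations" assumption concrete, here is a minimal sketch using the standard first-order partial correlation formula, treating Factor 1 as a standardized variable whose correlations with X1 and X2 are their loadings. The loadings (0.8, 0.7) and the "observed" correlation of 0.70 are made-up numbers for illustration, and the function name is hypothetical. Note that in practice the true factor values are not observed, so a partial correlation computed from estimated factor scores would only approximate this.

```python
import math

def partial_corr_given_factor(r12, a1, a2):
    """Correlation of X1 and X2 after partialling out one standardized factor F.

    a1 and a2 are the correlations (loadings) of X1 and X2 with F.
    """
    return (r12 - a1 * a2) / math.sqrt((1 - a1**2) * (1 - a2**2))

a1, a2 = 0.8, 0.7  # hypothetical loadings of X1 and X2 on Factor 1

# Case 1: the one-factor model holds exactly, so r12 equals a1 * a2
# and nothing is left once the factor is partialled out.
r12_model = a1 * a2
partial_model = partial_corr_given_factor(r12_model, a1, a2)

# Case 2: the observed correlation is larger than the factor implies,
# so a nonzero pairwise partial correlation remains.
r12_observed = 0.70
partial_extra = partial_corr_given_factor(r12_observed, a1, a2)

print(partial_model, round(partial_extra, 3))
```

In the second case the leftover partial correlation is exactly the kind of pair-specific association that FA assumes away rather than models.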

Check "Population noise added" here and read this.

Factor analysis output displays the correlations reproduced by the factors. You can always compare them with the input correlations. Ideally, the two matrices should be close or almost identical. If they are not, consider extracting more factors (going down the road of overfitting). But do not expect FA to explain (model) pairwise partial associations for you. FA assumes they are negligible enough not to be modelled, and if they aren't small, then the data are not well suited to FA.
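As an illustration of that comparison, here is a minimal sketch that rebuilds the off-diagonal correlations from a loading matrix of orthogonal factors and lists the residuals against the observed correlations. The loadings and the observed matrix are invented for the example (in real work they come from your FA output), and the helper names are hypothetical. Only off-diagonal entries are compared, because the diagonal of the reproduced matrix holds communalities rather than 1s.

```python
# Hypothetical loadings of 4 variables on 2 orthogonal factors.
loadings = [
    [0.8, 0.1],  # X1
    [0.7, 0.2],  # X2
    [0.6, 0.1],  # X3
    [0.1, 0.7],  # X4
]

# Hypothetical observed correlations; X1-X2 is deliberately set higher
# than the factors alone would imply.
observed = [
    [1.00, 0.75, 0.49, 0.15],
    [0.75, 1.00, 0.44, 0.21],
    [0.49, 0.44, 1.00, 0.13],
    [0.15, 0.21, 0.13, 1.00],
]

def reproduced_corr(loadings):
    """Sum of products of loadings over the same factors, for every pair."""
    n = len(loadings)
    return [
        [sum(a * b for a, b in zip(loadings[i], loadings[j])) for j in range(n)]
        for i in range(n)
    ]

def offdiag_residuals(observed, loadings):
    """Observed minus reproduced correlation for each pair i < j."""
    rep = reproduced_corr(loadings)
    n = len(observed)
    return {
        (i, j): observed[i][j] - rep[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    }

residuals = offdiag_residuals(observed, loadings)
largest_pair = max(residuals, key=lambda p: abs(residuals[p]))

for pair, res in sorted(residuals.items()):
    print(pair, round(res, 3))
```

A single large residual like the X1-X2 one here is what a pair-specific association looks like from the FA side: it is left unexplained, not attributed to the factor.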

Distinction between "shared variance" and "common variance". Factor analysis explains the shared variance in pairs of variables by the common variance with which the factor invests them. The variables are denied a private (partial) share; instead, the load from the factor is what makes them covary, by the amount by which they are loaded.

Partial, i.e. intrinsically two-party, correlations are almost nonsense theoretically. Take 100 correlating variables. Do you think there truly exist, in the population, 100(100-1)/2 special factors to constitute/regulate their relationships? I think not. Rather, there exists a small number of common factors (say, up to 10) which force the 100 items to correlate. And to the extent that they don't correlate with r=1, they vary individually for the remainder. If, on top of all that, they still uniquely covary in pairs, then this could be only for a few pairs and only weakly. That is a reasonable view embodied in FA.

ttnphns
  • That's very clear, thank you! One follow-up question: what about, say, X3 and X4, which load onto different factors? Is it sensible to examine their correlation? Or is that correlation accounted for by the correlation between their two factors? – Dave Aug 26 '21 at 17:41
  • All variables in the analysis are loaded by all the factors extracted. The question then is "how much?". According to the fundamental _factor theorem_, the reproduced correlation for any pair of variables (in a good factor solution this correlation is close to the observed one) is the sum of the products of their loadings on the same factors. [See](https://stats.stackexchange.com/a/94104/3277). – ttnphns Aug 26 '21 at 18:11
  • The factors themselves are not correlated (under the default setting of FA). They are orthogonal "regressional" predictors of the manifest variables. – ttnphns Aug 26 '21 at 18:12
  • Dave, given your main question, it is important that you clearly understand the difference between FA and PCA. If you don't yet, I recommend reading https://stats.stackexchange.com/a/288646/3277 and then the whole thread there. – ttnphns Aug 26 '21 at 18:19
  • Thanks again for the helpful information! I'll just note that in my analysis the factors are allowed to correlate with an oblique rotation. (A textbook I consulted said that if correlations are > .30 I should use oblique rather than orthogonal rotation.) – Dave Sep 01 '21 at 21:59
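For the X3 and X4 follow-up, here is a minimal sketch of how the factor theorem quoted in the comments extends to correlated factors: with an oblique rotation, the reproduced correlation is the sum over all pairs of factors of pattern loading times factor correlation times pattern loading, which reduces to the plain sum of products when the factors are uncorrelated. The pattern loadings and the factor correlation of 0.4 are assumed values for illustration only, and the function name is hypothetical.

```python
def reproduced_corr_pair(pattern, phi, i, j):
    """Reproduced correlation between variables i and j.

    pattern: rows of pattern loadings, one row per variable.
    phi: factor correlation matrix (identity for orthogonal factors).
    """
    m = len(phi)
    return sum(
        pattern[i][k] * phi[k][l] * pattern[j][l]
        for k in range(m)
        for l in range(m)
    )

# Hypothetical pattern loadings: X1-X3 on Factor 1, X4 on Factor 2.
pattern = [
    [0.8, 0.0],  # X1
    [0.7, 0.0],  # X2
    [0.6, 0.0],  # X3
    [0.0, 0.7],  # X4
]

phi_orthogonal = [[1.0, 0.0], [0.0, 1.0]]
phi_oblique = [[1.0, 0.4], [0.4, 1.0]]  # assumed factor correlation of 0.4

# With uncorrelated factors, X3 and X4 share no factor, so the model
# reproduces no correlation between them.
r34_orthogonal = reproduced_corr_pair(pattern, phi_orthogonal, 2, 3)

# With correlated factors, the X3-X4 correlation flows through the
# correlation between Factor 1 and Factor 2: 0.6 * 0.4 * 0.7.
r34_oblique = reproduced_corr_pair(pattern, phi_oblique, 2, 3)

print(r34_orthogonal, round(r34_oblique, 3))
```

So under an oblique solution, a correlation between variables that load on different factors can still be accounted for by the model, via the correlation between those factors.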