
Some methods of factor extraction (e.g. principal component analysis, PCA) are based on all variance in the data, while other methods (like principal axis factoring, PAF) are based on (or perhaps target) only common variance.

  1. How is this common variance defined mathematically?
  2. How is it estimated empirically?

I thought maybe it is the variance of a variable within the space spanned by all of the variables. Then it could be estimated by regressing each of the variables on the other variables and looking at the fitted values. But that does not seem correct.

Richard Hardy
  • Can you give an example of a method that uses common variance? – Łukasz Deryło Apr 04 '19 at 14:40
  • @ŁukaszDeryło, e.g. principal axis factoring, as in the answer of Mark White in the thread ["What is the difference between PCA and PAF method in factor analysis? "](https://stats.stackexchange.com/questions/280746/what-is-the-difference-between-pca-and-paf-method-in-factor-analysis). And here instead of *uses* I could have said something like *is ideologically based on*, since we only obtain common variance in the process of factor extraction, not from raw data. – Richard Hardy Apr 04 '19 at 14:42
  • PAF uses the common factor model, as mentioned in the thread you linked to. There is nothing about common variance there... or maybe common variance is the variance of a common factor... – Łukasz Deryło Apr 04 '19 at 14:47
  • @ŁukaszDeryło, oh, right, it says *communality* there, not *common variance*. I have seen *common variance* in a similar context in other places and thought perhaps these notions are interchangeable. I am not 100% sure. The term *common variance* is mentioned in the thread ["PCA vs PAF for exploratory factor analysis"](https://stats.stackexchange.com/questions/307130/pca-vs-paf-for-exploratory-factor-analysis), for example. – Richard Hardy Apr 04 '19 at 14:50

1 Answer


1.

According to Mulaik (2009, pp. 133–134), given a factor model \begin{aligned} Y_1&=\lambda_{11}\xi_1+\dots+\lambda_{1r}\xi_r+\Psi_1\varepsilon_1 \\ Y_2&=\lambda_{21}\xi_1+\dots+\lambda_{2r}\xi_r+\Psi_2\varepsilon_2 \\ &\dots \\ Y_n&=\lambda_{n1}\xi_1+\dots+\lambda_{nr}\xi_r+\Psi_n\varepsilon_n \\ \end{aligned} where $\text{Var}(\xi_i)=1 \ \forall \ i$ and $\text{Var}(\varepsilon_j)=1 \ \forall \ j$,
the common variance (a.k.a. communality) of variable $Y_j$ is $$ \text{Var}(\lambda_{j1}\xi_1+\dots+\lambda_{jr}\xi_r), $$ that is, the variance of the part of $Y_j$ that is explained by the factors $\xi_1$ to $\xi_r$. If the factors are uncorrelated, the common variance reduces to $\sum_{i=1}^r\lambda_{ji}^2$.
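As a quick numerical check of the definition (with arbitrary illustrative loadings, not taken from Mulaik), one can simulate a single variable $Y_j$ from two uncorrelated unit-variance factors and verify that its sample variance is close to $\sum_i \lambda_{ji}^2 + \Psi_j^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values (assumptions, not from Mulaik):
lam = np.array([0.7, 0.4])   # loadings lambda_j1, lambda_j2
psi = 0.5                    # unique-variance coefficient Psi_j

n = 200_000
xi = rng.standard_normal((n, 2))   # uncorrelated factors, Var = 1
eps = rng.standard_normal(n)       # unique factor, Var = 1
y = xi @ lam + psi * eps

common_var = np.sum(lam**2)        # communality: 0.7^2 + 0.4^2 = 0.65
total_var = common_var + psi**2    # total variance: 0.65 + 0.25 = 0.90

print(common_var, total_var, np.var(y))
```

The empirical variance of `y` matches `total_var`, of which only the `common_var` part is attributable to the factors.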

2.

According to Mulaik (2009, p. 184), the $R^2$ from regressing $Y_j$ on all the other $Y$s ($Y_{-j}$), i.e. the squared multiple correlation, is a lower bound for the communality of $Y_j$. My impression from reading Mulaik (2009) Chapter 8 is that this can be used as an initial estimate of the communality (e.g. equation 8.51 on p. 196).
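A small sketch of the lower-bound property, using an illustrative one-factor model (the loadings are arbitrary assumptions). The squared multiple correlation of $Y_j$ on the remaining variables can be read off the inverse of the correlation matrix via the standard identity $\text{SMC}_j = 1 - 1/(R^{-1})_{jj}$, and each SMC falls below the corresponding communality $\lambda_j^2$:

```python
import numpy as np

# Illustrative one-factor model (loadings are assumptions for the example).
lam = np.array([0.8, 0.7, 0.6])   # factor loadings
h2 = lam**2                       # communalities: [0.64, 0.49, 0.36]

# Model-implied correlation matrix: lam @ lam.T off the diagonal, 1 on it.
R = np.outer(lam, lam)
np.fill_diagonal(R, 1.0)

# SMC_j = 1 - 1 / (R^{-1})_{jj}: the R^2 of Y_j regressed on the other Y's.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

print(np.round(h2, 3))    # communalities
print(np.round(smc, 3))   # each SMC is below the matching communality
```

This is only the initial estimate; the iterative PAF procedure then refines the communalities from these starting values.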

References

Mulaik, S. A. (2009). *Foundations of Factor Analysis* (2nd ed.). Chapman & Hall/CRC.

Richard Hardy