
I'm pretty new to factor analysis, but I am able to go through the basic process in R: running the initial analysis, applying the varimax rotation, and getting the scores for each observation. In my specific case, I am reducing 16 features to 3 factors.

However, if I wanted to look at a new observation (i.e., one that was not part of the initial analysis) with the same 16 features, how could I get its values in terms of the 3 factors obtained in the earlier analysis? TIA

Haris
  • To compute factor/component scores for old or for new data, we use the factor score coefficient matrix, which is a function of the loadings. Overview: https://stats.stackexchange.com/q/126885/3277 – ttnphns Oct 27 '21 at 22:10

1 Answer


Yes, you can get the factor values for a new observation. You need to use the loadings matrix. Consider the following example using the food dataset (it has 5 features, which rely on 2 factors):

> food <- read.csv("https://userpage.fu-berlin.de/soga/300/30100_data_sets/food-texture.csv", row.names = "X")
> food.fa <- factanal(food, factors = 2)
> food.fa$loadings

Loadings:
         Factor1 Factor2
Oil      -0.816         
Density   0.919         
Crispy   -0.745   0.635 
Fracture  0.645  -0.573 
Hardness          0.764 

               Factor1 Factor2
SS loadings      2.490   1.316
Proportion Var   0.498   0.263
Cumulative Var   0.498   0.761

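Now let x <- rep(1, 5) be our new sample. To get the factor values, we multiply it by the loadings matrix (x is assumed to be on the same standardized scale that factanal works on internally, since it fits the correlation matrix):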

> x <- rep(1,5)
> x %*% (food.fa$loadings)
       Factor1   Factor2
[1,] 0.1023468 0.8296464
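
If the new observation is in raw units, a product with the loadings will not be on the same scale as the scores that factanal itself reports. Below is a minimal sketch of the regression method: standardize the new row with the training means and standard deviations, then multiply by the score coefficient matrix solve(R, L), where R is the training correlation matrix and L the loadings. The data here are synthetic and made up purely for illustration (they are not the food data), score_new is a hypothetical helper name, and the sketch assumes the default varimax (orthogonal) rotation.

```r
# Synthetic illustration data: 6 variables on very different scales,
# driven by 2 latent factors. Not the food-texture data.
set.seed(1)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
train <- data.frame(
  a = 10  + 2.0  * f1 + rnorm(n, sd = 1.0),
  b = 500 + 50   * f1 + rnorm(n, sd = 30),
  c = 0.1 + 0.02 * f1 + rnorm(n, sd = 0.01),
  d = 3   + 1.0  * f2 + rnorm(n, sd = 0.6),
  e = 200 + 40   * f2 + rnorm(n, sd = 25),
  g = 7   + 0.5  * f2 + rnorm(n, sd = 0.3)
)

fit <- factanal(train, factors = 2, scores = "regression")

# Hypothetical helper: regression-method factor scores for new rows.
# Assumes an orthogonal rotation (factanal's default varimax).
score_new <- function(fit, train, newdata) {
  L    <- unclass(fit$loadings)            # p x k loadings
  vars <- rownames(L)
  R    <- cor(train[, vars])               # correlation matrix of the training data
  W    <- solve(R, L)                      # p x k score coefficient matrix
  # Standardize new rows with the TRAINING means and standard deviations
  z <- scale(as.matrix(newdata)[, vars, drop = FALSE],
             center = colMeans(train[, vars]),
             scale  = apply(train[, vars], 2, sd))
  z %*% W
}

# Scoring training rows this way should reproduce fit$scores
score_new(fit, train, train[1:3, ])
head(fit$scores, 3)
```

For a genuinely new observation, pass it as a one-row data frame or matrix with the same column names as the training data.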
Spätzle
  • Although your example gives sensible answers, I am getting nonsensical ones in my case. I would appreciate your feedback based on the output below. – Haris Oct 19 '21 at 10:06
  • Which output are you referring to? – Spätzle Oct 19 '21 at 10:11
  • The example works, but I am getting loadings that make no sense, perhaps due to the last command. I would appreciate your feedback based on this abbreviated code. >fit$loadings Loadings: Factor1 Factor2 Factor3 ampl 0.232 0.906 amplSD -0.174 0.936 ... Factor1 Factor2 Factor3 SS loadings 3.970 2.686 1.963 Proportion Var 0.248 0.168 0.123... > ex.features [1] 6.94e-02 6.89 ... > ex.features %*% (fit$loadings) Factor1 Factor2 Factor3 [1,] 5850.236 2475.226 24.49456 (values too high!) – Haris Oct 19 '21 at 10:18
  • I cannot read this. Please add it as an edit to your post and format it as code. – Spätzle Oct 19 '21 at 10:59
  • I cannot format code+Rstudio output. Code is in {}, output is everything else: {ex.features} [1] 6.948084e-02 6.890209e-02 3.152680e+02 4.285756e+02 1.000853e-01 1.027555e-01 7.253998e+00 5.539933e+00 [9] 1.099967e+03 1.882548e+03 2.248258e+02 6.636951e+01 1.933375e+01 3.684684e+00 2.458555e+03 1.879186e+03 {fit$loadings} Loadings: Factor1 Factor2 Factor3 ampl 0.232 0.906 amplSD -0.174 0.936 ... (Note: 16 features, 3 factors) {ex.features %*% (fit$loadings)} Factor1 Factor2 Factor3 [1,] 5850.236 2475.226 24.49456 – Haris Oct 21 '21 at 12:12
  • Hi, can you help me please? – Haris Nov 02 '21 at 19:33