
I plan to run a series of exploratory factor analysis (EFA) models to investigate the factor structure of a scale in development using the R package psych. My N > 300 and each manifest variable (i.e., indicator/item) uses a 5-point Likert-type response option. Some items were reverse-coded (to my chagrin) but handled accordingly; there are no missing data. The structure of the data is as follows:

str(dat)
'data.frame':   315 obs. of  33 variables:

Given the scale's domain content (social science/higher ed), I presumed the data were ordinal, but I calculated both the Pearson and polychoric correlations (because science!) first:

library(psych)
library(GPArotation)
#library(corrplot)

# Pearson and polychoric correlation matrices, kept side by side for comparison
corr_list = list(
    pearson = cor(dat),
    poly = polychoric(dat)$rho)

Running the polychoric correlations generates a warning stating that a correction for continuity was applied and that 526 cells were adjusted (see here). I understand that polychoric correlations are estimated from a table of response proportions and that a correction is needed when some cells of that table are empty. My first question concerns this:

  • Are the polychoric correlation coefficients stable (p. 21) given that 526 cells were adjusted? This was a bit perturbing considering dat only has 315 obs -- OR is the solution adequate because the adjusted cells only account for about 5% of the overall data structure (315 rows * 33 variables = 10,395 total cells)?
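One way to put the 526 in context is to count empty cells where they would actually occur: in the 5 x 5 cross-tabulation of each item pair, rather than in the rows-by-columns data matrix. The sketch below is base R on simulated data. `sim` is a hypothetical stand-in for `dat`, and `count_empty` is an illustrative helper, not a psych function. It assumes the warning refers to zero-frequency cells in those pairwise tables, and it does not reproduce psych's internal counting.

```r
# Simulated stand-in for `dat`: 315 respondents, 33 five-point items (hypothetical data).
set.seed(1)
n_obs <- 315
n_items <- 33
sim <- as.data.frame(matrix(
  sample(1:5, n_obs * n_items, replace = TRUE,
         prob = c(0.05, 0.15, 0.30, 0.35, 0.15)),
  nrow = n_obs))

# Count the zero-frequency cells in the 5 x 5 cross-tabulation of two items.
count_empty <- function(x, y, levels = 1:5) {
  tab <- table(factor(x, levels = levels), factor(y, levels = levels))
  sum(tab == 0)
}

# Every unordered pair of items: 33 * 32 / 2 = 528 pairs.
item_pairs <- combn(n_items, 2)
empty_per_pair <- apply(item_pairs, 2, function(p) {
  count_empty(sim[[p[1]]], sim[[p[2]]])
})

n_pairs <- ncol(item_pairs)
total_cells <- n_pairs * 25          # 25 cells per pairwise table
total_empty <- sum(empty_per_pair)

cat("item pairs:", n_pairs, "\n")
cat("pairwise table cells:", total_cells, "\n")
cat("empty cells:", total_empty, "\n")
cat("share of cells empty:", round(total_empty / total_cells, 3), "\n")
cat("max empty cells in one pair:", max(empty_per_pair), "\n")
```

Replacing `sim` with `dat` would give the corresponding counts for the real items. Looking at `empty_per_pair` by item may also show whether the empty cells are spread evenly or concentrated in a few items with rarely used response options.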

Different estimation methods for EFAs are available in the psych package, but the function also has an arg that specifies the type of correlation to use (e.g., "cor" [Pearson], "poly", "mixed", etc.); the type of correlation in conjunction with the estimation method can have a drastic impact on the solutions provided, so I wanted to gain as much clarity about the correlation output as possible before moving forward. EFA follow-up on the horizon!
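To illustrate how the correlation type enters the model, here is a minimal sketch that fits the same two-factor model twice and compares the loadings. It assumes "the function" is `psych::fa()` and runs on simulated data: `sim`, `lambda`, and the cut points are hypothetical, and six items are used only to keep it fast. Minres estimation and varimax rotation are arbitrary choices for the illustration, not recommendations.

```r
library(psych)

# Hypothetical stand-in for `dat`: 6 five-point items driven by 2 latent factors.
set.seed(42)
n <- 315
f <- matrix(rnorm(n * 2), ncol = 2)
lambda <- rbind(c(.7, 0), c(.7, 0), c(.7, 0),
                c(0, .7), c(0, .7), c(0, .7))
latent <- f %*% t(lambda) + matrix(rnorm(n * 6, sd = 0.7), ncol = 6)

# Coarsen the continuous scores into 5 ordered categories (Likert-type items).
sim <- as.data.frame(apply(latent, 2, function(z) {
  cut(z, breaks = c(-Inf, -1.5, -0.5, 0.5, 1.5, Inf), labels = FALSE)
}))

# Same estimator and rotation; only the correlation type differs.
fit_pearson <- fa(sim, nfactors = 2, fm = "minres", rotate = "varimax", cor = "cor")
fit_poly    <- fa(sim, nfactors = 2, fm = "minres", rotate = "varimax", cor = "poly")

# Congruence coefficients between the two sets of loadings.
congruence <- factor.congruence(fit_pearson, fit_poly)
print(congruence)

# Loadings side by side for a direct look.
print(round(cbind(unclass(fit_pearson$loadings), unclass(fit_poly$loadings)), 2))
```

The same comparison on the real data would swap `sim` for `dat` and set `nfactors` to whatever the retention criteria suggest.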
