I'm fitting a fixed-effects model with plm and know that I'm dealing with multicollinearity between two of the independent variables. I'm working on identifying multicollinearity in models as a practice, and I identified the variable with alias(), then verified it with vif(). I was also able to use kappa() to show a very large condition number, which confirms the multicollinearity.
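For readers who want to try the same diagnostics, here is a minimal sketch in base R. It assumes a small made-up panel (the names toy, id, treat, treat_yr and lsdv are hypothetical, not the data from this post) and uses lm() with one dummy per unit as a stand-in for the within model. vif() is left out because it needs an additional package.

```r
# Hypothetical toy panel: 4 units observed in 2 periods (NOT the data from this post)
toy <- data.frame(
  id    = rep(1:4, each = 2),
  year  = rep(0:1, times = 4),
  treat = rep(c(1, 1, 0, 0), each = 2)  # constant within each id
)
toy$treat_yr <- toy$treat * toy$year

set.seed(1)
toy$y <- 8 + 0.5 * toy$year + 0.3 * toy$treat_yr + rnorm(nrow(toy), sd = 0.1)

# Dummy-variable regression: one dummy per id, with treat entered last
lsdv <- lm(y ~ year + treat_yr + factor(id) + treat, data = toy)

# alias() reports terms that are exact linear combinations of earlier terms
aliased <- alias(lsdv)$Complete
print(rownames(aliased))

# lm() returns NA for coefficients it could not estimate
print(names(which(is.na(coef(lsdv)))))

# The design matrix is rank deficient, so its condition number is
# typically reported as extremely large
X <- model.matrix(lsdv)
print(c(rank = lsdv$rank, columns = ncol(X)))
print(kappa(X))
```

In this toy setup treat is a linear combination of the intercept and the id dummies, which is why it shows up as aliased.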
My question is: why does plm() omit this collinear variable from the coefficients? There is no output clarifying why, and I couldn't find anything in the documentation. Stata automatically omits this variable, and I'm curious whether plm() does a check and then omits it.
The collinear variable is dfmfd98.
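A related check that may be useful here is whether a regressor varies at all within each unit once unit means are subtracted. The sketch below does that by hand in base R on a made-up panel. The names toy, id, treat, treat_yr and demean are hypothetical, and this is only an illustration of demeaning by group, not a description of what plm() does internally.

```r
# Hypothetical toy panel (NOT the data from this post)
toy <- data.frame(
  id    = rep(1:4, each = 2),
  year  = rep(0:1, times = 4),
  treat = rep(c(1, 1, 0, 0), each = 2)  # constant within each id
)
toy$treat_yr <- toy$treat * toy$year

# Subtract each unit's own mean from its observations
demean <- function(x, g) x - ave(x, g)

# Standard deviation of each regressor after demeaning by id;
# a value of 0 means the regressor has no within-unit variation
within_sd <- sapply(
  toy[c("year", "treat_yr", "treat")],
  function(x) sd(demean(x, toy$id))
)
print(within_sd)
```

In this toy data treat never changes within an id, so its demeaned values are all zero, while year and treat_yr keep some variation.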
Reproducible example :
dput :
data <-
structure(list(lexptot = c(8.28377505197124, 9.1595012302023,
8.14707583238833, 9.86330744180814, 8.21391453619232, 8.92372556833205,
7.77219149815994, 8.58202430280175, 8.34096828565733, 10.1133857229336,
8.56482997492403, 8.09468633074053, 8.27040804817704, 8.69834992618814,
8.03086333985764, 8.89644392254136, 8.20990433577082, 8.82621293136669,
7.79379981225575, 8.16139809188569, 8.25549748271241, 8.57464947213076,
8.2714431846277, 8.72374048671495, 7.98522888221012, 8.56460042433047,
8.22778847721461, 9.15431416391622, 8.25261818916933, 8.88033778695326
), year = c(0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
1L), dfmfdyr = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0,
0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0), dfmfd98 = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 0, 0, 0, 0), nh = c(11054L, 11054L, 11061L, 11061L,
11081L, 11081L, 11101L, 11101L, 12021L, 12021L, 12035L, 12035L,
12051L, 12051L, 12054L, 12054L, 12081L, 12081L, 12121L, 12121L,
13014L, 13014L, 13015L, 13015L, 13021L, 13021L, 13025L, 13025L,
13035L, 13035L)), .Names = c("lexptot", "year", "dfmfdyr", "dfmfd98",
"nh"), class = c("tbl_df", "data.frame"), row.names = c(NA, -30L
))
Regression code :
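library(plm)
# One-way (individual) fixed-effects model with nh as the panel index.
# The result is stored as fe_model so that it does not mask stats::lm().
fe_model <- plm(lexptot ~ year + dfmfdyr + dfmfd98 + nh, data = data,
                model = "within", index = "nh")
summary(fe_model)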
Output :
Oneway (individual) effect Within Model
Call:
plm(formula = lexptot ~ year + dfmfdyr + dfmfd98 + nh, data = data,
model = "within", index = "nh")
Balanced Panel: n=15, T=2, N=30
Residuals :
     Min.   1st Qu.    Median   3rd Qu.      Max.
-4.75e-01 -1.69e-01  4.44e-16  1.69e-01  4.75e-01
Coefficients :
        Estimate Std. Error t-value Pr(>|t|)
year     0.47552    0.23830  1.9955  0.06738 .
dfmfdyr  0.34635    0.29185  1.1867  0.25657
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Total Sum of Squares: 5.7882
Residual Sum of Squares: 1.8455
R-Squared : 0.68116
Adj. R-Squared : 0.29517
F-statistic: 13.8864 on 2 and 13 DF, p-value: 0.00059322