I have unbalanced panel data and want to fit a regression model with a firm's degree of internationalization as the dependent variable, measured as the ratio of foreign sales to total sales (fsts). One of my control variables, a dummy for the firm's industry (gind), is time-invariant. Because it does not vary over time within a firm, it is dropped automatically from my fixed effects model:

library(plm)
fixed <- plm(fsts ~ firm_size + rota + debt_to_assets + r_d_intensity + factor(gind),
             data = firm_ceo, index = c("gvkey", "fyear"), model = "within")
summary(fixed)

Oneway (individual) effect Within Model

Call:
plm(formula = fsts ~ firm_size + rota + debt_to_assets + r_d_intensity + 
    factor(gind), data = firm_ceo, model = "within", index = c("gvkey", 
    "fyear"))

Unbalanced Panel: n = 944, T = 1-7, N = 4433

Residuals:
       Min.     1st Qu.      Median     3rd Qu.        Max. 
-5.2679e-01 -1.1023e-02 -6.1760e-06  1.0698e-02  6.7683e-01 

Coefficients:
                  Estimate  Std. Error t-value Pr(>|t|)   
firm_size       0.00031128  0.00339678  0.0916 0.926989   
rota           -0.05715082  0.01738153 -3.2880 0.001019 **
debt_to_assets -0.00145714  0.00930919 -0.1565 0.875627   
r_d_intensity   0.00726665  0.03168240  0.2294 0.818603   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    11.145
Residual Sum of Squares: 11.093
R-Squared:      0.0046326
Adj. R-Squared: -0.26584
F-statistic: 4.05495 on 4 and 3485 DF, p-value: 0.0027785
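As far as I understand it, the reason `factor(gind)` does not appear in the output is the within transformation itself: every variable has its firm-level mean subtracted, so anything constant within a firm becomes all zeros. A minimal sketch on a small made-up panel (the data frame `toy`, its values, and the helper `demean` are hypothetical, not taken from `firm_ceo`):

```r
# Hypothetical toy panel: 3 firms, unbalanced, industry constant within each firm
toy <- data.frame(
  gvkey = c(1, 1, 1, 2, 2, 3, 3, 3),
  fyear = c(2010, 2011, 2012, 2010, 2011, 2010, 2011, 2012),
  rota  = c(0.10, 0.12, 0.08, 0.30, 0.25, 0.05, 0.07, 0.06),
  gind  = c(20, 20, 20, 35, 35, 20, 20, 20)
)

# Within transformation: subtract the firm-level mean from each observation
# (ave() returns the group mean repeated for every row of the group)
demean <- function(x, id) x - ave(x, id)

toy$rota_within <- demean(toy$rota, toy$gvkey)
toy$gind_within <- demean(toy$gind, toy$gvkey)

# rota keeps its within-firm variation; gind is identically zero after demeaning
print(toy[, c("gvkey", "fyear", "rota_within", "gind_within")])
```

The same logic applies to each industry dummy created by `factor(gind)`: after demeaning there is no variation left to estimate a coefficient from.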

I conducted a Hausman test, as suggested in "Panel data model estimation with dummy variables", and found that a fixed effects model is preferred over a random effects model (the compared models did not include the industry dummy).
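For reference, a sketch of how such a Hausman comparison can be run with `plm::phtest()`. The data here are simulated and purely hypothetical (the object `sim` and all parameter values are my assumptions); only the variable names mirror the ones in my data.

```r
library(plm)

set.seed(1)
# Hypothetical simulated panel: 50 firms, 5 years each
n <- 50; t <- 5
sim <- data.frame(gvkey = rep(1:n, each = t),
                  fyear = rep(2001:(2000 + t), times = n))
alpha    <- rnorm(n)[sim$gvkey]               # unobserved firm effect
sim$rota <- 0.8 * alpha + rnorm(n * t)        # regressor correlated with the firm effect
sim$fsts <- 0.5 * sim$rota + alpha + rnorm(n * t)

# Same formula estimated with the within and the random effects estimator
fe <- plm(fsts ~ rota, data = sim, index = c("gvkey", "fyear"), model = "within")
re <- plm(fsts ~ rota, data = sim, index = c("gvkey", "fyear"), model = "random")

# Hausman test: under H0 both estimators are consistent and RE is more efficient;
# a small p-value points towards the fixed effects model
ht <- phtest(fe, re)
print(ht)
```

Both models have to contain the same time-varying regressors, which is why the industry dummy was left out of the comparison.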

As noted, however, I cannot use a simple fixed effects model, because industry is very important to my research.

If I were to use the random effects model instead, would the interpretation of model fit and coefficients differ substantially? Alternatively, how could I use a fixed effects model and keep the time-invariant variables, specifically in R? The answers in the post "How to keep time invariant variables in a fixed effects model" unfortunately seem very specific to the interpretation of gender and do not help with the implementation in R.

[EDIT]

The answer seems to be to use a correlated random effects (CRE) model, which combines fixed and random effects. The model is also known as the within-between model, the Mundlak procedure, or the hybrid approach. For anyone having the same problem, I recommend this paper: https://www.researchgate.net/publication/336608555_On_Ignoring_the_Random_Effects_Assumption_in_Multilevel_Models_Review_Critique_and_Recommendations which comes with a short and intuitive explanatory video: https://www.youtube.com/watch?v=mnMB8MnBlqI
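A rough sketch of the Mundlak idea as I understand it, on simulated data (the object `sim`, the industry codes, and all coefficients are hypothetical): add the firm-level mean of each time-varying regressor to the model, which lets the time-invariant industry dummy stay in. To keep the example free of extra packages it uses pooled `lm()` rather than a random effects fit, so only the point estimates are meaningful here, not the standard errors.

```r
set.seed(42)
# Hypothetical unbalanced panel: 40 firms with 2 to 6 years each, up to 4 industries
n  <- 40
Ti <- sample(2:6, n, replace = TRUE)
sim <- data.frame(gvkey = rep(seq_len(n), times = Ti),
                  fyear = unlist(lapply(Ti, function(k) 2000 + seq_len(k))))
sim$gind <- factor(sample(c(10, 20, 30, 40), n, replace = TRUE)[sim$gvkey])
alpha    <- rnorm(n)[sim$gvkey]                 # unobserved firm effect
sim$rota <- 0.7 * alpha + rnorm(nrow(sim))      # correlated with the firm effect
sim$fsts <- 0.5 * sim$rota + 0.2 * (sim$gind == "20") + alpha + rnorm(nrow(sim))

# Mundlak step: firm-level (cluster) mean of the time-varying regressor,
# computed over the years actually observed, so unbalanced panels are handled
sim$rota_mean <- ave(sim$rota, sim$gvkey)

# CRE specification: regressor + its cluster mean + time-invariant industry dummy
cre  <- lm(fsts ~ rota + rota_mean + gind, data = sim)
# Reference: within (dummy variable) estimate of the same slope
lsdv <- lm(fsts ~ rota + factor(gvkey), data = sim)

print(coef(cre)[c("rota", "rota_mean")])
print(coef(lsdv)["rota"])
```

The coefficient on `rota` in the CRE specification reproduces the within estimate, while `gind` is no longer dropped. With my real data I assume the same formula (one `_mean` column per time-varying control) could be passed to `plm(..., model = "random")` to get a proper random effects fit, but I have not verified that.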

Unfortunately, I am having trouble coding this in R. If you have seen this before and can help, please check out my question on Stack Overflow on how to implement CRE for unbalanced panels in R: https://stackoverflow.com/questions/68040949/how-to-calculate-the-cluster-means-of-variables-in-a-correlated-random-effects-m

Rebekka
  • Welcome. Is `gvkey` a unique firm/company identifier? And, is `gind` a unique industry identifier? – Thomas Bilach Jun 15 '21 at 16:43
  • Thanks. Yes, `gvkey` is a unique firm identifier and `gind` is a unique industry identifier. There are 24 different industries. `fyear` is the respective year. There is only one observation per year per company. – Rebekka Jun 16 '21 at 08:10

0 Answers