I am trying to run a fixed-effects Poisson quasi-maximum likelihood estimator on three-dimensional (year, country, industry) unbalanced panel data.
The dependent variable is the number of patents, which is non-negative and non-integer. The patent counts are non-integer because some patents were registered in more than one country and were shared between those countries.
The main independent variable is deregulation, a dummy variable that equals 0 before the year deregulation was implemented in a country and 1 from the implementation year onward. I am trying to capture the effect of deregulation on patent activity. I also include some control variables (size and share).
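To make the definition of the dummy concrete, here is a minimal sketch of how it could be built in base R. The lookup table `impl` and its column `impl_year` are hypothetical names used only for illustration; the 1996 switch for Austria is read off the data excerpt in this post, not from any external source.

```r
# Hypothetical lookup table: the year deregulation took effect in each country.
# Only Austria is shown; 1996 is inferred from the excerpt in this post.
impl <- data.frame(country = "Austria", impl_year = 1996)

# A toy country-year panel for one country.
panel <- data.frame(country = "Austria", year = 1990:1999)
panel <- merge(panel, impl, by = "country")
panel <- panel[order(panel$year), ]

# 0 before the implementation year, 1 from that year onward.
panel$dereg <- as.integer(panel$year >= panel$impl_year)

print(panel[, c("country", "year", "dereg")])
```

In the real data the same comparison would be applied after merging one implementation year per country onto the full panel.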
The data look like this:
country  industry            year  pt     size_emp  size_val   dereg  share       id
Austria  Food and beverages  1990  NA     59.76742  2844243.5  0      0.10035470  1
Austria  Food and beverages  1991  2.023  61.84432  3121737.4  0      0.10254747  1
Austria  Food and beverages  1992  NA     61.50290  3724826.9  0      0.11448406  1
Austria  Food and beverages  1993  3.344  61.19843  3699648.1  0      0.12175012  1
Austria  Food and beverages  1994  6.000  61.83063  3808057.0  0      0.11251291  1
Austria  Food and beverages  1995  6.665  17.42631  1073797.4  0      0.11605032  1
Austria  Food and beverages  1996  5.020  16.52287  1020846.7  1      0.11730912  1
Austria  Food and beverages  1997  7.467  16.84073  811186.0   1      0.10929066  1
Austria  Food and beverages  1998  5.433  17.16993  837194.7   1      0.10477675  1
Austria  Food and beverages  1999  4.556  17.23248  795350.9   1      0.09516772  1
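As a quick sanity check on the excerpt above, the sketch below re-enters the ten rows shown (only the `year`, `pt`, and `dereg` columns) and counts missing and non-integer values of the dependent variable. The object and variable names (`ex`, `n_missing`, and so on) are made up for this illustration.

```r
# The ten Austria / Food and beverages rows from the excerpt, typed in by hand.
ex <- data.frame(
  year  = 1990:1999,
  pt    = c(NA, 2.023, NA, 3.344, 6.000, 6.665, 5.020, 7.467, 5.433, 4.556),
  dereg = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1)
)

# Rows with a missing dependent variable cannot contribute to the fit.
n_missing <- sum(is.na(ex$pt))
n_used    <- sum(!is.na(ex$pt))

# Observed values of pt that are not whole numbers (fractional patent counts).
n_noninteger <- sum(ex$pt %% 1 != 0, na.rm = TRUE)

cat("missing:", n_missing, " usable:", n_used, " non-integer:", n_noninteger, "\n")
```

Running the same counts on the full `pdata` would show how much of the unbalancedness comes from missing `pt` rather than from missing country-industry-year cells.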
I want to run a quasi-Poisson regression with year, country, and industry fixed effects. I ran this model:
library(fixest)  # assumed source of feglm(), judging by the output format
model <- feglm(pt ~ dereg + log(size_emp) + share | country + industry + year,
               data = pdata, cluster = c("country", "industry", "year"),
               family = quasipoisson)
The results are as follows:
GLM estimation, family = quasipoisson, Dep. Var.: pt
Observations: 4,248
Fixed-effects: Country: 14, Industry: 15, year: 24
Standard-errors: Three-way (Country & Industry & year)
              Estimate Std. Error  t value Pr(>|t|)
dereg         0.049263   0.068041 0.724017 0.481882
log(size_emp) 0.172708   0.070838 2.438100 0.029875 *
share         3.779300   0.832383 4.540400 0.000555 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Squared Cor.: 0.986039
My questions are:
- Is quasi-Poisson the right model in this case? Are the results valid when the dependent variable is non-integer?
- Are there any other critical issues with the model that I need to pay attention to?
Any help would be much appreciated.