
I want to estimate a conditional negative binomial model, which an economist might call a negative binomial model with individual fixed effects. I use Python and statsmodels, which has a conditional Poisson model but, as far as I know, no conditional negative binomial model. So I want to follow Allison and Waterman: estimate a conditional Poisson model and correct the standard errors for overdispersion as follows:

A relatively simple approach is to estimate the β coefficients under the fixed-effects Poisson model, but adjust the standard errors upward for overdispersion. A commonly-used adjustment is to multiply the standard errors by the square root of the ratio of the goodness-of-fit chi-square to the degrees of freedom. (Either Pearson’s chi-square or the deviance could be used.)
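
In symbols, my reading of that adjustment (notation mine: $\hat\mu_i$ is the fitted mean for observation $i$ and $\text{df}$ is the residual degrees of freedom) is

$$\chi^2_{\text{Pearson}} = \sum_i \frac{(y_i - \hat\mu_i)^2}{\hat\mu_i}, \qquad \widehat{\text{se}}_{\text{adj}}(\hat\beta_j) = \widehat{\text{se}}(\hat\beta_j)\,\sqrt{\frac{\chi^2_{\text{Pearson}}}{\text{df}}}.$$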

However, I can't find Pearson's $\chi^2$ in the output of statsmodels' ConditionalPoisson fit summary. A comment on a recent question suggests that I should estimate $\chi^2$ separately. However, I don't see a method on the fit results to estimate $\chi^2$, and the results' predict method (which I would use to calculate $\chi^2$ by hand) raises NotImplementedError.
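
For concreteness, this is the calculation I have in mind, written as a standalone helper (the function and its arguments are mine, not a statsmodels API; `mu_hat` is the vector of fitted means, which is exactly the piece I can't get out of ConditionalPoisson):

import numpy as np

def adjust_se_for_overdispersion(y, mu_hat, se, df_resid):
    """Scale Poisson standard errors by sqrt(Pearson chi-square / df)."""
    y = np.asarray(y, dtype=float)
    mu_hat = np.asarray(mu_hat, dtype=float)
    chi2 = np.sum((y - mu_hat) ** 2 / mu_hat)  # Pearson chi-square
    return np.asarray(se) * np.sqrt(chi2 / df_resid)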

How do I correct conditional Poisson standard errors in statsmodels for overdispersion?

Here's starter code:

import numpy as np
import pandas as pd
import statsmodels.discrete.conditional_models as cm

# balanced panel: 50 individuals (i), each observed for 200 periods (t)
df = pd.DataFrame(
    {
        'i': (np.arange(10000) // 200) + 1,
        't': (np.arange(10000) % 200) + 1
    }
)

np.random.seed(2001)
df['x'] = np.random.normal(loc=df['i'])
# overdispersed outcome: negative binomial counts whose mean depends on i and x
df['y'] = np.random.negative_binomial(np.exp((df['i'] + df['x'])/10), 0.1)

# keep only groups with within-group variation in y
df1 = df.groupby('i').filter(lambda x: x['y'].var() > 0)

# conditional (fixed-effects) Poisson: the group-specific intercepts are conditioned out
m1 = cm.ConditionalPoisson(
    endog=df1['y'],
    exog=df1['x'],
    groups=df1['i']
)

f1 = m1.fit(method='newton', cov_type='cluster')  # Newton's method, cluster-robust covariance

print(f1.summary())


                 Conditional Logit Model Regression Results                  
==============================================================================
Dep. Variable:                      y   No. Observations:                10000
Model:             ConditionalPoisson   No. groups:                         50
Log-Likelihood:           -1.1662e+09   Min group size:                    200
Method:                        newton   Max group size:                    200
Date:                Sat, 29 May 2021   Mean group size:                 200.0
Time:                        10:06:49                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
x              0.1003   6.95e-05   1443.904      0.000       0.100       0.100
==============================================================================
  • Use quasi-likelihood. R has a family function `quasipoisson` for that; Python may have something similar (see the sketch after these comments). Alternatively, use robust standard errors. See https://stats.stackexchange.com/questions/201903/how-to-deal-with-overdispersion-in-poisson-regression-quasi-likelihood-negativ – kjetil b halvorsen May 30 '21 at 01:21
  • 1
  • It looks like statsmodels does not allow overdispersion-corrected or heteroscedasticity-robust standard errors for the conditional Logit and Poisson models. Those are available for the "standard" models. GLM has both. Discrete models like Logit, Poisson, NegativeBinomial only have robust standard errors (HC, cluster, ...), but not a simple overdispersion correction. – Josef May 30 '21 at 20:49
  • @kjetilbhalvorsen Thanks for the link! I hoped for a way to modify SEs in statsmodels. I use `feglm` and `feglm.nb` from the `alpaca` package for fixed-effect models in R, but I want to switch to Python completely. – Richard Herron Jun 01 '21 at 18:22
  • @Josef, I'm OK without a canned solution. Is there a way to calculate $\chi^2$ for `ConditionalPoisson` results? If I can calculate $\chi^2$, I can adjust SEs following the Allison paper linked above. If you're the Josef P. from statsmodels, thanks for all the hard work! statsmodels gets me very close to a complete switch from R to Python! – Richard Herron Jun 01 '21 at 18:27
  • I have not looked at the literature for this. I'm not sure how chi2 is defined for this. The conditional model removes the individual-specific intercept, and that will not be available for prediction, i.e., how are fittedvalues and Pearson residuals defined for conditional models? https://github.com/statsmodels/statsmodels/issues/7469 – Josef Jun 01 '21 at 22:24
  • AFAICS, `feglm.nb` just fits a standard fixed effects model and is not a conditional model that conditions out the individual heterogeneity. The only difference to standard GLM/NegativeBinomial seems to be computational in that it can handle more fixed effects. – Josef Jun 02 '21 at 14:19
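
Following the quasi-Poisson suggestion above, here is a rough sketch of what I think is the closest statsmodels route: a plain Poisson GLM with individual dummies (dummy-variable fixed effects, not the conditional estimator Josef distinguishes above), with the dispersion estimated from Pearson's chi-square via scale='X2' so the standard errors are rescaled accordingly:

import statsmodels.api as sm

# dummy-variable fixed effects: one indicator per individual (drop one to avoid collinearity)
dummies = pd.get_dummies(df1['i'], prefix='i', drop_first=True).astype(float)
X = sm.add_constant(pd.concat([df1[['x']], dummies], axis=1))

# Poisson GLM; scale='X2' estimates the dispersion as Pearson chi-square / df_resid,
# which inflates the standard errors (quasi-Poisson style)
m2 = sm.GLM(df1['y'], X, family=sm.families.Poisson())
f2 = m2.fit(scale='X2')

print(f2.params['x'], f2.bse['x'])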

0 Answers