I'm trying to re-create an analysis that was done with the Stata command xtreg (I don't have the original code) using the R package plm, and I'm having trouble translating between the two. A minimal example of the model I'm trying to estimate is as follows:

$$ \text{Dependent}_{i,t} = \beta_0 + \gamma_i + \delta_t + \beta_1 \, \text{Dependent}_{i,t-1} + \beta_2 \, \text{Independent}_{i,t} + \epsilon_{i,t} $$

where $\gamma_i$ is a control for the cross-sectional factor (States), $\delta_t$ is a control for each time period (Year), $\text{Dependent}_{i,t}$ is a dependent variable in the form of a proportion, and $\text{Independent}_{i,t}$ is an attribute of a record that changes with both the cross-sectional and time component of the data.

I've attempted to estimate the coefficients for the model using R with the following (generalized) code:

library(plm)

model <- plm(Dependent_it ~ Independent_it + lag(Dependent_it, 1),
             data   = data,
             model  = "within",  # fixed effects model (the argument is `model`; "within" is also the default)
             effect = "twoways", # does the gamma_i and delta_t parts (I think)
             index  = c("State", "Year")
             )

I've tried to re-create it in Stata using the following (after running xtset State Year, yearly, of course):

xtreg Dependent Independent l.Dependent i.Year, fe

I get different results, so I must be doing something wrong with xtreg, plm, or both. Can someone tell me whether I've coded the above model correctly in either language?

Helix123
gwatson
    You should keep in mind that the within estimator requires a strict exogeneity assumption, which is necessarily violated if you include lagged outcomes as regressors, e.g. www.nber.org/WNE/lect_2_linpanel.pdf – Arne Jonas Warnke Aug 29 '13 at 12:51
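
A short sketch of why the comment above applies to this model. To keep it brief, it drops $\delta_t$ and the $\text{Independent}$ term, writes $y_{i,t}$ for $\text{Dependent}_{i,t}$, and assumes periods $t = 1, \dots, T$ with $y_{i,0}$ observed; these simplifications are assumptions for illustration, not part of the original post. Subtracting state-level means to remove $\gamma_i$ gives

$$ y_{i,t} - \bar{y}_i = \beta_1 \left( y_{i,t-1} - \bar{y}_{i,-1} \right) + \left( \epsilon_{i,t} - \bar{\epsilon}_i \right), \qquad \bar{y}_{i,-1} = \frac{1}{T} \sum_{s=1}^{T} y_{i,s-1}, \quad \bar{\epsilon}_i = \frac{1}{T} \sum_{s=1}^{T} \epsilon_{i,s} $$

The averaged error $\bar{\epsilon}_i$ contains $\epsilon_{i,t-1}$, which is itself a component of $y_{i,t-1}$, so the transformed regressor and the transformed error are correlated even when $\epsilon_{i,t}$ is serially uncorrelated. The resulting bias shrinks at rate $1/T$, so it matters most in short panels.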

1 Answer

Welcome to the site, @gwatson! You are right that effect = "twoways" sets up both "individual" and "year" effects.

I tested this with the Produc data from the R package plm and found that the main results are the same (see the code and output below). The only apparent difference is in the year effects, which comes down to the contrast coding: xtreg sets the first year as the reference, while plm directly estimates the effect for each year.

## R code
library(plm)
data("Produc", package = "plm")
# plm's argument for the estimator is `model`; "within" is also its default
zz <- plm(gsp ~ unemp + lag(gsp), data = Produc, index = c("state", "year"),
          model = "within", effect = "twoways")
summary(zz)

## plm output
Coefficients :
            Estimate  Std. Error  t-value  Pr(>|t|)    
unemp    -5.4525e+02  6.8611e+01  -7.9469 7.614e-15 ***
lag(gsp)  1.0125e+00  9.1789e-03 110.3029 < 2.2e-16 ***


## Stata code
use Produc, clear
xtset state year, yearly
xtreg gsp unemp l.gsp i.year, fe

## xtreg output
------------------------------------------------------------------------------
         gsp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       unemp |   -545.246   68.61136    -7.95   0.000    -679.9537   -410.5383
      gsp L1.|   1.012464   .0091789   110.30   0.000     .9944422    1.030485
-------------+----------------------------------------------------------------
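
The contrast point can also be checked without leaving R. The following is a minimal sketch, not part of the original test: it refits the same model by least squares with explicit state and year dummies (LSDV), which is how `i.year` enters the xtreg call. The names `pd`, `gsp_lag`, `d`, `fit_within`, and `fit_lsdv` are made up for illustration, and the hand-built lag assumes the rows of Produc are ordered by state and then year.

```r
# Sketch: two-way within estimator vs. LSDV with explicit dummies
library(plm)

data("Produc", package = "plm")

# Two-way within fit, with the panel index declared up front
pd <- pdata.frame(Produc, index = c("state", "year"))
fit_within <- plm(gsp ~ unemp + lag(gsp), data = pd,
                  model = "within", effect = "twoways")

# Build the lag by hand within each state (assumes rows sorted by state, year)
Produc$gsp_lag <- ave(Produc$gsp, Produc$state,
                      FUN = function(x) c(NA, head(x, -1)))

# Drop the first year of each state, which has no lagged value
d <- Produc[!is.na(Produc$gsp_lag), ]

# LSDV: lm() uses the first level of each factor as the reference category
fit_lsdv <- lm(gsp ~ unemp + gsp_lag + factor(state) + factor(year), data = d)

# Slope estimates side by side
cbind(within = coef(fit_within),
      lsdv   = coef(fit_lsdv)[c("unemp", "gsp_lag")])

# Year dummies, expressed relative to the first year kept in the sample
coef(fit_lsdv)[grep("^factor\\(year\\)", names(coef(fit_lsdv)))]
```

The two slope columns should agree up to numerical precision, while the year coefficients from `lm()` are reported as differences from the reference year, which is the parameterization the `i.year` terms use.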
Randel