Best practice for dynamic panel data estimation with multilevel structure in a $T \gg N$ setting

Question

We plan to estimate a dynamic panel model with both varying intercepts and varying slopes. We also want to include group-level predictors for the varying effects in second-stage regressions.

Our panel is strongly $T \gg N$: we have $N=20$ and $T=200$, where $N$ denotes the number of cross-sections and $T$ the number of measurements per cross-section, giving 4,000 observations in total.

The model we want to fit looks as follows:

$y_{i,t}=\alpha_i+\beta_i x_{i,t}+\delta_i y_{i,t-1}+\epsilon_{i,t}$, where $\epsilon_{i,t} \sim N(0, \sigma_\epsilon)$

with $\alpha_i = \bar\alpha+\psi_i$, where $\psi_{i} \sim N(0, \sigma_\psi)$,

$\delta_i = \bar\delta+\eta_i$, where $\eta_{i} \sim N(0, \sigma_\eta)$,

and

$\beta_i = \bar\beta+\gamma z_i+\omega_i$, where $\omega_{i} \sim N(0, \sigma_\omega)$.

So, we want to regress $y_{i,t}$ on $x_{i,t}$ and the lagged dependent variable $y_{i,t-1}$ with random effects on all regression parameters. Further, we model $\beta_i$ as a function of $z_i$, which is a group-level predictor.
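
Substituting the group-level equations into the observation-level equation gives the combined form below. It is only a restatement of the model above, written out to show where each parameter enters:

$y_{i,t} = \bar\alpha + \bar\beta x_{i,t} + \gamma z_i x_{i,t} + \bar\delta y_{i,t-1} + \psi_i + \omega_i x_{i,t} + \eta_i y_{i,t-1} + \epsilon_{i,t}$

In this form, $\gamma$ is the coefficient on the cross-level interaction $z_i x_{i,t}$, while $\psi_i$, $\omega_i$ and $\eta_i$ are the random intercept and the random slopes on $x_{i,t}$ and $y_{i,t-1}$, respectively.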

What is the best practice to deal with such a situation? Do we even need to worry about a dynamic panel bias with such a long $T$?

Please try to avoid domain-specific terminology. What exactly are $T$ and $N$? Could you describe the study design in more detail? — Robert Long, Jan 22 '20 at 11:06
Sorry, I didn't know that $T$ and $N$ are that domain specific. $N$ denotes the number of our cross-sections (e.g., groups, brands, individuals) and $T$ denotes the number of time points per cross-section. So we have 200 measures for each of our 20 cross-sections — Dirk Buttke, Jan 22 '20 at 12:56
It's always good to define the terms that you use. $N$ is very often total observations, and it is quite rare to see $T$ in a regression model at all. It would be good if you can also describe your study design in more detail (by editing the question rather than as a comment). I had not previously heard of dynamic regression so you might also want to explain what that is since models used in different domains have different names. — Robert Long, Jan 22 '20 at 13:13
Sorry, I've edited the initial question accordingly. I hope that our design has become clearer now. — Dirk Buttke, Jan 22 '20 at 19:39
It's becoming a little clearer. But what is a "cross section"? What exactly are you measuring? What exactly are the variables in your dataset, and how were they measured? And what is your research question? — Robert Long, Jan 22 '20 at 20:28
I'd start by estimating the 20 unit-level time series regressions (one per cross-section, with 200 observations each), and then looking at the scatter plot of the estimated beta coefficients against z. Depending on the signal-to-noise ratio of the individual time series, this should give you results very similar to estimating the hierarchical model you've written down by MLE. The dynamic panel bias arises in a fixed (small) $T$, increasing (large) $N$ setting; you have the opposite. You also allow the autoregressive coefficient to vary across units, which should alleviate it further (there is no dynamic panel bias in the individual regressions, which are the extreme case of that). — CloseToC, Jan 22 '20 at 22:09
I think you can do what you want with a mixed effects model, at least in principle, by creating the lagged variable as a covariate. I just need to think about how to model $\beta$ in the way you want — Robert Long, Jan 23 '20 at 08:44
This is a big topic in econometrics. See https://stats.stackexchange.com/questions/122741/whats-the-difference-between-time-series-econometrics-and-panel-data-econometri for some recommended readings. — Erik Ruzek, Jan 23 '20 at 15:24
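
The two-step approach suggested in the comments (one time series regression per cross-section, then a comparison of the estimated $\beta_i$ with $z_i$) can be sketched in Python on simulated data. This is a minimal illustration under stated assumptions, not a recommended estimator: all parameter values, the function names `simulate_panel`, `unit_ols` and `two_step`, and the use of plain OLS in both steps are hypothetical choices made for this example.

```python
import numpy as np


def simulate_panel(n_units=20, n_periods=200, burn_in=50, seed=0):
    """Simulate y_it = alpha_i + beta_i * x_it + delta_i * y_i,t-1 + eps_it.

    All numeric values below are assumptions chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n_units)                        # group-level predictor
    alpha = 1.0 + rng.normal(scale=0.5, size=n_units)   # varying intercepts
    delta = 0.5 + rng.normal(scale=0.1, size=n_units)   # varying AR coefficients
    delta = np.clip(delta, -0.9, 0.9)                   # keep each series stationary
    beta = 2.0 + 0.8 * z + rng.normal(scale=0.3, size=n_units)  # gamma = 0.8

    total = n_periods + burn_in
    x = rng.normal(size=(n_units, total))
    y = np.zeros((n_units, total))
    for t in range(1, total):
        eps = rng.normal(size=n_units)
        y[:, t] = alpha + beta * x[:, t] + delta * y[:, t - 1] + eps
    # Drop the burn-in so the start value y = 0 has little influence.
    return y[:, burn_in:], x[:, burn_in:], z


def unit_ols(y_i, x_i):
    """OLS of y_t on [1, x_t, y_{t-1}] for one unit; returns (alpha, beta, delta)."""
    design = np.column_stack([np.ones(len(y_i) - 1), x_i[1:], y_i[:-1]])
    coef, *_ = np.linalg.lstsq(design, y_i[1:], rcond=None)
    return coef


def two_step(y, x, z):
    """Step 1: per-unit OLS. Step 2: OLS of the estimated beta_i on [1, z_i]."""
    first = np.array([unit_ols(y_i, x_i) for y_i, x_i in zip(y, x)])
    design = np.column_stack([np.ones(len(z)), z])
    second, *_ = np.linalg.lstsq(design, first[:, 1], rcond=None)
    return first, second


y, x, z = simulate_panel()
first_stage, (beta_bar_hat, gamma_hat) = two_step(y, x, z)
print("mean estimated delta_i:", first_stage[:, 2].mean())
print("beta_bar_hat:", beta_bar_hat, "gamma_hat:", gamma_hat)
```

Because the true parameters are known in a simulation, the same script can be used to check how close the per-unit estimates of $\delta_i$ come to their true values at $T=200$, and to compare the two-step estimate of $\gamma$ with the estimate from a full hierarchical model. The second step here ignores the estimation error in the first-step $\hat\beta_i$, so its standard errors should not be taken at face value.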
