Questions tagged [panel-data]

Panel data refers to multi-dimensional data frequently involving measurements over time in econometrics. It is also called longitudinal data in biostatistics.

Panel data (also called longitudinal data) consist of data that are collected repeatedly on the same study units (e.g., firms or subjects). This type of data allows one to exploit both cross-sectional and time series information on the sampled subjects. This makes it possible to eliminate endogeneity problems due to unobserved factors which are invariant over time. Such fixed effects can be absorbed or differenced out (see fixed effects estimation). If such effects are of no concern, i.e. they are uncorrelated with the regressors, it is possible to improve on OLS in terms of efficiency by using the random effects estimator, which utilizes the between and within information in the data more effectively.
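As a rough illustration of how a time-invariant unobserved effect can be differenced out, here is a minimal sketch in Python with NumPy. The data are simulated, and the function names (`within_estimator`, `pooled_ols`) and all parameter values are assumptions made for this example, not part of any particular package. It covers only the single-regressor case.

```python
import numpy as np

def within_estimator(y, x, unit):
    """Slope from OLS on unit-demeaned data (single regressor, hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Map arbitrary unit labels to 0..n_units-1 so group means can be computed
    _, idx = np.unique(unit, return_inverse=True)
    counts = np.bincount(idx)
    # Subtract each unit's own mean: this removes anything constant within a unit
    y_dm = y - (np.bincount(idx, weights=y) / counts)[idx]
    x_dm = x - (np.bincount(idx, weights=x) / counts)[idx]
    return float(x_dm @ y_dm / (x_dm @ x_dm))

def pooled_ols(y, x):
    """Slope from pooled OLS with an intercept, ignoring the panel structure."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))

# Simulated panel (assumed values): 500 units, 5 periods, true slope 2.0
rng = np.random.default_rng(0)
n_units, n_periods, beta = 500, 5, 2.0
unit = np.repeat(np.arange(n_units), n_periods)
alpha = rng.normal(size=n_units)               # unobserved, time-invariant effect
x = alpha[unit] + rng.normal(size=unit.size)   # regressor correlated with alpha
y = beta * x + alpha[unit] + rng.normal(size=unit.size)

beta_pooled = pooled_ols(y, x)               # biased upward by the omitted alpha
beta_within = within_estimator(y, x, unit)   # close to the true slope
print(f"pooled OLS: {beta_pooled:.3f}, within: {beta_within:.3f}")
```

In this setup the pooled slope absorbs part of the unit effect because `x` is correlated with `alpha`, while demeaning by unit removes `alpha` entirely. If `x` were uncorrelated with `alpha`, both estimators would be consistent and a random effects estimator would be the more efficient choice.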

Many estimation techniques rely on so-called "small T, large N" asymptotics, i.e. many subjects or series that are observed for a relatively short time period. When the time dimension matters more and the model becomes dynamic (for example, it includes a lagged dependent variable), the standard panel estimators can become inconsistent. Methods for dealing with dynamic panel data have been developed by Anderson and Hsiao, and Arellano and Bond, among others.
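To make the dynamic panel issue concrete, the following sketch simulates a simple AR(1) panel with unit effects and compares the within (fixed effects) estimator with an instrumental variable estimator in the spirit of Anderson and Hsiao: first-difference the model, then instrument the lagged difference with the outcome lagged twice. The model, the helper names, and all parameter values are assumptions chosen for illustration; the sketch has no additional regressors and assumes serially uncorrelated errors.

```python
import numpy as np

def simulate_ar1_panel(n_units, n_periods, rho, rng, burn_in=50):
    """Simulate y_it = rho * y_i,t-1 + alpha_i + eps_it (hypothetical helper)."""
    alpha = rng.normal(size=n_units)
    y = alpha / (1.0 - rho)                  # start each unit at its own mean
    out = np.empty((n_units, n_periods))
    for t in range(burn_in + n_periods):
        y = rho * y + alpha + rng.normal(size=n_units)
        if t >= burn_in:
            out[:, t - burn_in] = y
    return out

def within_ar1(y):
    """Within estimator of rho: regress demeaned y_t on demeaned y_{t-1}."""
    y_lag, y_cur = y[:, :-1], y[:, 1:]
    y_lag_dm = y_lag - y_lag.mean(axis=1, keepdims=True)
    y_cur_dm = y_cur - y_cur.mean(axis=1, keepdims=True)
    return float((y_lag_dm * y_cur_dm).sum() / (y_lag_dm ** 2).sum())

def anderson_hsiao(y):
    """IV on first differences, using y_{t-2} as instrument for the lagged difference."""
    dy = np.diff(y, axis=1)     # dy[:, k] = y[:, k+1] - y[:, k]
    dy_cur = dy[:, 1:]          # difference at t, for t = 2..T-1
    dy_lag = dy[:, :-1]         # difference at t-1
    z = y[:, :-2]               # level at t-2, uncorrelated with the differenced error
    return float((z * dy_cur).sum() / (z * dy_lag).sum())

# Assumed values: many units, few periods, true rho = 0.5
rng = np.random.default_rng(1)
panel = simulate_ar1_panel(n_units=5000, n_periods=6, rho=0.5, rng=rng)

rho_within = within_ar1(panel)      # biased downward when T is small
rho_ah = anderson_hsiao(panel)      # roughly centred on the true value
print(f"within: {rho_within:.3f}, Anderson-Hsiao style IV: {rho_ah:.3f}")
```

With only six periods per unit, demeaning makes the lagged outcome correlated with the transformed error, so the within estimate falls well below the true value even with thousands of units. The differenced IV estimator avoids this at the cost of higher variance; Arellano and Bond extend the idea by using further lags as instruments in a GMM framework.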

Examples of longitudinal data sets include the Panel Study of Income Dynamics (PSID), the British Household Panel Survey (BHPS), and the National Longitudinal Survey (NLS).

For an extensive overview of panel econometric and statistical techniques see for instance:
Wooldridge, J. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). Cambridge, MA: MIT Press.

2058 questions
61
votes
5 answers

How exactly does a "random effects model" in econometrics relate to mixed models outside of econometrics?

I used to think that "random effects model" in econometrics corresponds to a "mixed model with random intercept" outside of econometrics, but now I am not sure. Does it? Econometrics uses terms like "fixed effects" and "random effects" somewhat…
amoeba
  • 93,463
  • 28
  • 275
  • 317
40
votes
4 answers

Standard error clustering in R (either manually or in plm)

I am trying to understand standard error "clustering" and how to execute in R (it is trivial in Stata). In R I have been unsuccessful using either plm or writing my own function. I'll use the diamonds data from the ggplot2 package. I can do fixed…
38
votes
4 answers

Difference between longitudinal design and time series

What is/are the difference(s) between a longitudinal design and a time series?
DrWho
  • 799
  • 4
  • 12
  • 23
38
votes
3 answers

What percentage of a population needs a test in order to estimate prevalence of a disease? Say, COVID-19

A group of us got to discussing what percentage of a population needs to be tested for COVID-19 in order to estimate the true prevalence of the disease. It got complicated, and we ended the night (over zoom) arguing about signal detection and…
36
votes
1 answer

How to interpret variance and correlation of random effects in a mixed-effects model?

I hope you all don't mind this question, but I need help interpreting output for a linear mixed effects model output I've been trying to learn to do in R. I am new to longitudinal data analysis and linear mixed effects regression. I have a model I…
Zeda
  • 461
  • 1
  • 5
  • 3
30
votes
1 answer

What is an acceptable value of the Calinski & Harabasz (CH) criterion?

I have done a data analysis trying to cluster longitudinal data using R and the kml package. My data consists of around 400 individual trajectories (as it is called in the paper). You can see my results in the following picture: After reading…
greg121
  • 399
  • 1
  • 3
  • 13
26
votes
2 answers

Specifying a difference in differences model with multiple time periods

When I estimate a difference in differences model with two time periods, the equivalent regression model would be a. $Y_{ist} = \alpha +\gamma_s*Treatment + \lambda d_t + \delta*(Treatment*d_t)+ \epsilon_{ist}$ where $Treatment$ is a dummy which…
23
votes
6 answers

What is the difference between pooled cross sectional data and panel data?

They seem so similar. Are they the same thing but just referred to as different names?
Kyle
  • 1,119
  • 6
  • 13
  • 22
22
votes
5 answers

What are differences between the terms "time series analysis" and "longitudinal data analysis"

When talking about longitudinal data, we may refer to data collected over time from the same subject / study unit repeatedly, thus there are correlations for the observations within the same subject, i.e., within-subject similarity. When talking…
askming
  • 577
  • 1
  • 4
  • 15
22
votes
1 answer

Can splines be used for prediction?

I cannot be specific about the nature of the data as it is proprietary, but suppose we have data like this: Each month, some people sign up for a service. Then, in each subsequent month, those people may upgrade the service, discontinue the service…
Peter Flom
  • 94,055
  • 35
  • 143
  • 276
20
votes
6 answers

Difference between panel data & mixed model

I would like to know the difference between panel data analysis & mixed model analysis. To my knowledge, both panel data & mixed models use fixed & random effects. If so, why do they have different names? Or are they synonymous? I've read the…
Beta
  • 5,784
  • 9
  • 33
  • 44
17
votes
1 answer

Do autocorrelated residual patterns remain even in models with appropriate correlation structures, & how to select the best models?

Context This question uses R, but is about general statistical issues. I'm analysing the effects of mortality factors (% mortality due to disease and parasitism) on moth population growth rate over time, where larval populations were sampled from 12…
17
votes
4 answers

Propensity score matching with panel data

I have a longitudinal data set of individuals and some of them were subject to a treatment and others were not. All individuals are in the sample from birth until age 18 and the treatment happens at some age in between that range. The age of the…
Andy
  • 18,070
  • 20
  • 77
  • 100
17
votes
5 answers

What's the difference between time-series econometrics and panel data econometrics?

This question may be very naive, but the way I'm taught econometrics I'm very confused if there's a difference between time-series and panel data method. Regarding time series, I've covered topics such as covariance stationary, AR, MA,…
Heisenberg
  • 4,239
  • 3
  • 23
  • 54
16
votes
2 answers

What are the differences between "Mixed Effects Modelling" and "Latent Growth Modelling"?

I'm decently familiar with mixed effects models (MEM), but a colleague recently asked me how it compares to latent growth models (LGM). I did a bit of googling, and it seems that LGM is a variant of structural equation modelling that is applied to…
Mike Lawrence
  • 12,691
  • 8
  • 40
  • 65