I'm looking for help with the notation for a regression equation in a repeated-measures model with nested data, $\eqref{eq:2}$, and with connecting that notation back to my model specification, Model.3, in R.
My starting point, i.e. the notation I am familiar with, is the one used in Wooldridge's Introductory Econometrics (2013). Wooldridge's general longitudinal model, for both random effects and fixed effects estimation, can be written as,
$$ y_{it} = \beta_{0}+\beta_{1}x_{it} + a_{i}+u_{it} \tag{1} \label{eq:1} $$ where individuals are indexed by $i = 1, 2, \dots, n$ and time is indexed by $t = 1, 2, \dots, T$. The error term has two parts: $a_i$, an unobserved individual-specific component that captures unobserved, time-constant factors, and $u_{it}$, the idiosyncratic error, which captures unobserved factors that change over time.
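To make the two-part error concrete, here is a sketch of what it implies under the usual random effects assumptions. The assumptions are mine and are not stated above: $a_i$ and $u_{it}$ are mean-zero and mutually uncorrelated, $u_{it}$ is serially uncorrelated, and $\sigma_a^2$ and $\sigma_u^2$ denote their variances. Writing the composite error as $v_{it}$,

$$ v_{it} = a_{i} + u_{it}, \qquad \operatorname{Var}(v_{it}) = \sigma_a^2 + \sigma_u^2, \qquad \operatorname{Corr}(v_{it}, v_{it'}) = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2} \quad \text{for } t \neq t'. $$

So two observations on the same individual are correlated only through the shared $a_i$.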
In R I've been estimating a random effects version of this model using the plm package. First, some required packages and some data,
# install.packages(c("plm", "lme4", "texreg", "mlmRev"), dependencies = TRUE)
data(egsingle, package = "mlmRev")
The data set egsingle is an unbalanced panel of 1721 school children, grouped in 60 schools, across five time points. For details see ?mlmRev::egsingle
Some light data management
dta <- egsingle
dta$Female <- with(dta, ifelse(female == 'Female', 1, 0))
Also, a snippet of the relevant data
dta[118:127, c('schoolid', 'childid', 'math', 'year', 'size', 'Female')]
#>     schoolid   childid   math year size Female
#> 118     2040 289970511 -1.830 -1.5  502      1
#> 119     2040 289970511 -1.185 -0.5  502      1
#> 120     2040 289970511  0.852  0.5  502      1
#> 121     2040 289970511  0.573  1.5  502      1
#> 122     2040 289970511  1.736  2.5  502      1
#> 123     2040 292772811 -3.144 -1.5  502      0
#> 124     2040 292772811 -2.097 -0.5  502      0
#> 125     2040 292772811 -0.316  0.5  502      0
#> 126     2040 293550291 -2.097 -1.5  502      0
#> 127     2040 293550291 -1.314 -0.5  502      0
Now, here's how I would specify the random effects model in R, ignoring schoolid, based on $\eqref{eq:1}$, using plm() and estimating with FGLS,
library(plm)
Model.1 <- plm(math ~ Female + size + year, dta, index = c("childid", "year"), model = "random")
# summary(Model.1)
However, as mentioned at the top, the data are also nested: childid is nested in schoolid. To write this regression equation I've simply extended $\eqref{eq:1}$ by adding a school subscript, $s$,
$$ y_{ist} = \beta_{0}+\beta_{1}x_{ist} + a_{i}+\nu_{s}+u_{ist} \tag{2} \label{eq:2} $$ Now $y$, $x$, and the idiosyncratic error, $u$, are extended with an $s$ dimension, and the combined error, which in $\eqref{eq:1}$ consists of two parts, is extended in $\eqref{eq:2}$ by a third term, $\nu_{s}$. This term captures the unobserved group/school-specific component. I am not confident that this specification is correct. I might be confused by the differences in jargon across the literature.
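To spell out how I read $\eqref{eq:2}$, here is a sketch of the error structure it would imply. This rests on assumptions I am adding, not on anything established above: $\nu_s$, $a_i$ and $u_{ist}$ are mean-zero and mutually uncorrelated, with variances $\sigma_\nu^2$, $\sigma_a^2$ and $\sigma_u^2$ (symbols introduced here for illustration).

$$ \operatorname{Var}(a_{i} + \nu_{s} + u_{ist}) = \sigma_a^2 + \sigma_\nu^2 + \sigma_u^2 $$

$$ \operatorname{Corr}(\text{same child},\ t \neq t') = \frac{\sigma_a^2 + \sigma_\nu^2}{\sigma_a^2 + \sigma_\nu^2 + \sigma_u^2}, \qquad \operatorname{Corr}(\text{different children, same school}) = \frac{\sigma_\nu^2}{\sigma_a^2 + \sigma_\nu^2 + \sigma_u^2} $$

If that reading is right, observations on children in different schools would be uncorrelated.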
Part 1: Is $\eqref{eq:2}$ a correct way to specify a regression equation for a repeated-measures random effects model with a nested structure? Is there any authoritative literature that uses similar notation?
This next part, Part 2, is no longer that relevant.
I have tried to find a way to estimate what I believe is $\eqref{eq:2}$ using plm, but I haven't succeeded. Part 2: Is it possible to estimate a repeated-measures random effects model with a nested structure using the plm package? Based on this question, I believe the answer is yes: it is possible to estimate a _repeated measures random effects model with a nested structure_ using the plm package; see the question linked above.
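For completeness, a minimal sketch of how such a nested specification might look in plm. It assumes a plm version that supports effect = "nested" together with a three-element index (individual, time, group); the object name Model.nested is hypothetical, and I have not checked how well the default variance-component estimator behaves on this unbalanced panel.

```r
library(plm)
data(egsingle, package = "mlmRev")

dta <- egsingle
dta$Female <- with(dta, ifelse(female == "Female", 1, 0))

# The third index element names the grouping variable (schools) in which
# the individuals (children) are nested. Assumption: effect = "nested"
# is available for model = "random" in the installed plm version.
Model.nested <- plm(math ~ Female + size + year, data = dta,
                    index = c("childid", "year", "schoolid"),
                    model = "random", effect = "nested")

summary(Model.nested)
```

If this runs, the summary should report separate variance components for the idiosyncratic, individual, and group effects.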
After studying this great answer by Robert Long, I have estimated a repeated-measures model, with childid nested in schoolid, using the lme4 package. Like this,
dta$year <- as.factor(dta$year)
require(lme4)
As the lme4 package relies on a likelihood framework, I begin by estimating a model similar to Model.1 above (for later comparison). Like this,
Model.2 <- lmer(math ~ Female + size + year + (1 | childid), dta)
Now, relying on Robert Long's answer I've specified the nested model like this,
Model.3 <- lmer(math ~ Female + size + year + (1 | schoolid/childid), dta)
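As I understand the lme4 formula syntax, (1 | schoolid/childid) is shorthand for a school intercept plus a child-within-school intercept. A minimal sketch to check that the shorthand and the written-out form give the same fit; the object names m_short and m_long are my own, and the block repeats the data preparation so it can run on its own.

```r
library(lme4)
data(egsingle, package = "mlmRev")

dta <- egsingle
dta$Female <- with(dta, ifelse(female == "Female", 1, 0))
dta$year <- as.factor(dta$year)

# Shorthand: childid nested in schoolid
m_short <- lmer(math ~ Female + size + year + (1 | schoolid/childid), dta)

# Written out: one random intercept per school, one per child-in-school
m_long <- lmer(math ~ Female + size + year +
                 (1 | schoolid) + (1 | schoolid:childid), dta)

# The two fits should agree up to optimizer tolerance
all.equal(as.numeric(logLik(m_short)), as.numeric(logLik(m_long)),
          tolerance = 1e-4)

# Variance components at the school, child, and residual level
VarCorr(m_short)
```

Since the number of childid groups equals the number of childid:schoolid groups in my data (1721 in both cases), each child appears in exactly one school, so the interaction term only matters for labelling here.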
Assuming Model.3 is correctly specified:
Part 3.a: What authoritative source do you recommend, preferably with notation similar to Wooldridge (2013), that presents and discusses the notation for the regression equation I am estimating in Model.3?
Part 3.b: Is $\eqref{eq:2}$ actually what I am estimating in Model.3?
Below are the actual estimation results from the three models,
# require(texreg)
texreg::screenreg(list(Model.1, Model.2, Model.3), digits = 3)
#> ==============================================================================
#>                                      Model 1        Model 2        Model 3
#> ------------------------------------------------------------------------------
#> (Intercept)                           -2.671 ***     -2.669 ***     -2.693 ***
#>                                      (0.085)        (0.086)        (0.152)
#> Female                                -0.025         -0.025          0.008
#>                                      (0.046)        (0.047)        (0.042)
#> size                                  -0.000 ***     -0.000 ***     -0.000
#>                                      (0.000)        (0.000)        (0.000)
#> year-1.5                               0.878 ***      0.876 ***      0.866 ***
#>                                      (0.059)        (0.059)        (0.059)
#> year-0.5                               1.882 ***      1.880 ***      1.870 ***
#>                                      (0.059)        (0.058)        (0.058)
#> year0.5                                2.575 ***      2.574 ***      2.562 ***
#>                                      (0.059)        (0.059)        (0.059)
#> year1.5                                3.149 ***      3.147 ***      3.133 ***
#>                                      (0.060)        (0.059)        (0.059)
#> year2.5                                3.956 ***      3.954 ***      3.939 ***
#>                                      (0.060)        (0.060)        (0.060)
#> ------------------------------------------------------------------------------
#> R^2                                    0.735
#> Adj. R^2                               0.735
#> Num. obs.                               7230           7230           7230
#> AIC                                               16855.629      16590.715
#> BIC                                               16924.489      16666.461
#> Log Likelihood                                    -8417.815      -8284.357
#> Num. groups: childid                                   1721
#> Var: childid (Intercept)                              0.857
#> Var: Residual                                         0.334          0.334
#> Num. groups: childid:schoolid                                         1721
#> Num. groups: schoolid                                                   60
#> Var: childid:schoolid (Intercept)                                    0.672
#> Var: schoolid (Intercept)                                            0.180
#> ==============================================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05
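To relate the variance components in the Model 3 column back to the error terms in $\eqref{eq:2}$, the variance shares can be worked out by hand. A minimal sketch using the rounded values from the table; the variable names are my own, and the mapping of the three components to $\nu_s$, $a_i$ and $u_{ist}$ is an assumption (it is exactly what Part 3.b asks about).

```r
# Variance components copied (rounded) from the Model 3 column
var_school   <- 0.180  # Var: schoolid (Intercept), assumed to match nu_s
var_child    <- 0.672  # Var: childid:schoolid (Intercept), assumed to match a_i
var_residual <- 0.334  # Var: Residual, assumed to match u_ist

var_total <- var_school + var_child + var_residual

# Share of the unexplained variance attributed to each level
share <- c(school = var_school, child = var_child,
           residual = var_residual) / var_total

# Implied correlations, assuming mutually uncorrelated components
corr_same_child  <- (var_school + var_child) / var_total  # same child, two years
corr_same_school <- var_school / var_total                # two children, same school

print(round(share, 3))
print(round(c(same_child = corr_same_child,
              same_school = corr_same_school), 3))
```

For comparison, the same-child correlation implied by Model 2 is 0.857 / (0.857 + 0.334), which is close to the Model 3 figure; the nested model splits the child-level variance into a school part and a child-within-school part.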
Wooldridge, Jeffrey M. (2013). Introductory Econometrics: A Modern Approach. 5th edition. South-Western College. ISBN: 9781285414645. URL: https://www.cengage.co.uk/books/9781111531041/