
Suppose I have the regression model $Y_i=T^{\top}_{i}\beta_0+e_{i}$ with $E(e_i|X_i)=0$, where the two regressors are $X_i$ and $E(D|X_{i})$, so that $T^{\top}_{i}=[X_i,\ E(D|X_{i})]$. Here $X_{i}$ is a discrete random variable with support $\{1,2,3\}$, $D$ is a dummy variable, and $E(D|X_{i})$ denotes the conditional expectation of $D$ given $X_i$. The data are a random sample on $(Y,X,D)$: $\{Y_i,X_i,D_i\}_{i=1}^{n}$. To estimate $\beta_0$, we first estimate the second regressor with a frequency estimator:

$\widehat{E}(D|X_i=k)=\frac{\sum_{i=1}^{n}\mathbf{1}(D_i=1, X_i=k)}{\sum_{i=1}^{n}\mathbf{1}(X_{i}=k)}$ for $k=1,2,3$.

In the second step, we estimate $\beta_0$ using generated regressor $\widehat{T}^{\top}_{i}=[X_i,\ \widehat{E}(D|X_{i})]$.

$\widehat{\beta}=(\frac{1}{n}\sum_{i=1}^{n}\widehat{T}_{i}\widehat{T}_{i}^{\top})^{-1}\frac{1}{n}\sum_{i=1}^{n}\widehat{T}_{i}Y_{i}$.
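For concreteness, the two-step construction above can be sketched in NumPy (my own illustration; the data-generating values $\beta_0=(1,2)$ and $P(D=1|X=k)=(0.1,0.8,0.3)$ are made up):

```python
import numpy as np

def two_step_beta(Y, X, D):
    """Feasible two-step estimator: frequency step, then OLS."""
    # Step 1: frequency estimator of E(D|X=k) for each support point k
    e_hat = np.array([D[X == k].mean() for k in (1, 2, 3)])
    # Step 2: OLS with the generated regressor T_hat = [X, E_hat(D|X)]
    T_hat = np.column_stack([X, e_hat[X - 1]])
    # beta_hat = (sum T_hat T_hat')^{-1} sum T_hat Y_i
    return np.linalg.solve(T_hat.T @ T_hat, T_hat.T @ Y)

# Simulated check with made-up values: beta_0 = (1, 2), P(D=1|X=k) = (0.1, 0.8, 0.3)
rng = np.random.default_rng(0)
n = 200_000
X = rng.integers(1, 4, n)
p = np.array([0.1, 0.8, 0.3])[X - 1]
D = (rng.random(n) < p).astype(float)
Y = 1.0 * X + 2.0 * p + rng.normal(size=n)
beta_hat = two_step_beta(Y, X, D)
```

With $n$ this large, `beta_hat` should land close to the true $\beta_0$, since the estimator is consistent.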

Consider an infeasible version that uses the true value of $E(D|X_i)$:

$\widetilde{\beta}=(\frac{1}{n}\sum_{i=1}^{n}T_{i}T_{i}^{\top})^{-1}\frac{1}{n}\sum_{i=1}^{n}T_{i}Y_{i}$.

Do we have:

$\sqrt{n}(\widehat{\beta}-\beta_{0})=\sqrt{n}(\widetilde{\beta}-\beta_0)+o_{p}(1)$?

Thanks!

T34driver

1 Answer


The claimed equation is not true. That is, $\sqrt{n}(\widehat{\beta}-\beta_0)$ and $\sqrt{n}(\widetilde{\beta}-\beta_0)$ are not asymptotically equivalent. To see this, note that

$\frac{1}{n}\sum_{i=1}^{n}\widehat{T}_{i}\widehat{T}_{i}^{\top}=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}(X_i=1)\begin{bmatrix}1&\widehat{E}(D|X_i=1)\\ \widehat{E}(D|X_i=1)&(\widehat{E}(D|X_i=1))^2\end{bmatrix}+\cdots+\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}(X_i=3)\begin{bmatrix}3^2&3\,\widehat{E}(D|X_i=3)\\ 3\,\widehat{E}(D|X_i=3)&(\widehat{E}(D|X_i=3))^2\end{bmatrix}.$

Note that $\widehat{E}(D|X_i=k)$ does not change with $i$ within the group $\{X_i=k\}$, so we have $\frac{1}{n}\sum_{i=1}^{n}\widehat{T}_{i}\widehat{T}_{i}^{\top}=\widehat{p}_{1}\begin{bmatrix}1&\widehat{E}(D|X_i=1)\\ \widehat{E}(D|X_i=1)&(\widehat{E}(D|X_i=1))^2\end{bmatrix}+\cdots+\widehat{p}_{3}\begin{bmatrix}3^2&3\,\widehat{E}(D|X_i=3)\\ 3\,\widehat{E}(D|X_i=3)&(\widehat{E}(D|X_i=3))^2\end{bmatrix}$,

where $\widehat{p}_k=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}(X_i=k)$. By the law of large numbers, Slutsky's theorem, and the law of total expectation, we know that $\frac{1}{n}\sum_{i=1}^{n}\widehat{T}_{i}\widehat{T}_{i}^{\top}=p_{1}\begin{bmatrix}1&E(D|X_i=1)\\ E(D|X_i=1)&(E(D|X_i=1))^2\end{bmatrix}+\cdots+p_{3}\begin{bmatrix}3^2&3\,E(D|X_i=3)\\ 3\,E(D|X_i=3)&(E(D|X_i=3))^2\end{bmatrix}+o_{p}(1)=E(T_{i}T_{i}^{\top})+o_{p}(1).$
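The grouping identity above (before taking limits) can be checked numerically; here is a small sketch with simulated data, where all names and data-generating values are my own:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
X = rng.integers(1, 4, n)
D = (rng.random(n) < np.array([0.1, 0.8, 0.3])[X - 1]).astype(float)

# Frequency estimates e_k = E_hat(D|X=k) and generated regressor T_hat
e_hat = np.array([D[X == k].mean() for k in (1, 2, 3)])
T_hat = np.column_stack([X, e_hat[X - 1]])
direct = T_hat.T @ T_hat / n  # (1/n) sum_i T_hat_i T_hat_i'

# Grouped version: sum_k p_hat_k * [[k^2, k*e_k], [k*e_k, e_k^2]]
grouped = sum(
    (X == k).mean() * np.array([[k**2, k * e_hat[k - 1]],
                                [k * e_hat[k - 1], e_hat[k - 1]**2]])
    for k in (1, 2, 3)
)
assert np.allclose(direct, grouped)  # exact identity, not just asymptotic
```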

Also note that $\frac{1}{n}\sum_{i=1}^{n}\widehat{T}_i y_i=\begin{bmatrix}\frac{1}{n}\sum_{i=1}^{n}X_iy_i\\ \frac{1}{n}\sum_{i=1}^{n}\widehat{E}(D|X_i)y_i \end{bmatrix}$, so it suffices to compare $\frac{1}{n}\sum_{i=1}^{n}\widehat{E}(D|X_i)y_i$ with $\frac{1}{n}\sum_{i=1}^{n}E(D|X_i)y_i$. These two are not asymptotically equivalent: writing $\overline{y}_{k}=\frac{1}{n}\sum_{i=1}^{n}y_{i}\mathbf{1}(X_i=k)$, we have $\frac{1}{n}\sum_{i=1}^{n}E(D|X_i)y_i=\sum_{k=1}^{3}\overline{y}_{k}E(D|X_i=k),$ while $\frac{1}{n}\sum_{i=1}^{n}\widehat{E}(D|X_i)y_i=\sum_{k=1}^{3}\overline{y}_{k}E(D|X_i=k)+\sum_{k=1}^{3}\overline{y}_{k}\bigl(\widehat{E}(D|X_i=k)-E(D|X_i=k)\bigr).$

So $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\widehat{E}(D|X_i)y_i=\sqrt{n}\sum_{k=1}^{3}\overline{y}_{k}E(D|X_i=k)+\sum_{k=1}^{3}\overline{y}_{k}\sqrt{n}\bigl(\widehat{E}(D|X_i=k)-E(D|X_i=k)\bigr),$

where the second term does not converge in probability to zero: by the central limit theorem, $\sqrt{n}\bigl(\widehat{E}(D|X_i=k)-E(D|X_i=k)\bigr)$ converges in distribution to a nondegenerate normal, while $\overline{y}_{k}\xrightarrow{p}E(Y_i\mathbf{1}(X_i=k))$, which is nonzero in general. Hence the first-step estimation error contributes to the asymptotic distribution, and the feasible and infeasible estimators are not asymptotically equivalent.
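A quick Monte Carlo (my own sketch, with made-up parameter values) illustrates this: across replications, $\sqrt{n}(\widehat{\beta}-\widetilde{\beta})$ stabilizes at a nondegenerate distribution rather than collapsing to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
beta0 = np.array([1.0, 2.0])
p_true = np.array([0.1, 0.8, 0.3])  # E(D|X=k), chosen for illustration

def draw_gap(n):
    """Return sqrt(n) * (beta_hat - beta_tilde) for one sample of size n."""
    X = rng.integers(1, 4, n)
    p = p_true[X - 1]
    D = (rng.random(n) < p).astype(float)
    Y = beta0[0] * X + beta0[1] * p + rng.normal(size=n)
    e_hat = np.array([D[X == k].mean() for k in (1, 2, 3)])[X - 1]
    T_hat = np.column_stack([X, e_hat])  # generated regressor
    T = np.column_stack([X, p])          # infeasible true regressor
    b_hat = np.linalg.solve(T_hat.T @ T_hat, T_hat.T @ Y)
    b_til = np.linalg.solve(T.T @ T, T.T @ Y)
    return np.sqrt(n) * (b_hat - b_til)

gaps = np.array([draw_gap(20_000) for _ in range(200)])
print(gaps.std(axis=0))  # stays of order one, not shrinking toward zero
```

If the two estimators were asymptotically equivalent, the printed standard deviations would shrink toward zero as $n$ grows; instead they settle at a positive level, reflecting the first-step sampling noise in $\widehat{E}(D|X_i=k)$.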

T34driver