
After the very satisfying answer to "How to do regression with known correlations among the errors?", I take the question to my next point of interest:

What can you do when you have an errors-in-variables regression problem:

$$\begin{cases} Y_t = \beta_1x_t^* + \beta_0 + \varepsilon_t \\ Y_t = y_t^* +\varepsilon_t\\ X_t = x_t^* +\eta_t\\ \left(\varepsilon_1,\eta_1,\ldots,\varepsilon_n,\eta_n\right)\sim\mathcal{N}_{2n}\left(\mathbf{0},D \right) \\ 1\leq t \leq n \end{cases}$$

where you observe $(Y_t,X_t)$ and you know $D$, the covariance matrix of the errors?
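
For concreteness, a minimal simulation sketch of this data-generating process (the variable names, the numbers, and the i.i.d. block-diagonal $D$ are illustrative only, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
beta0, beta1 = 2.0, 1.5

# Illustrative per-observation error covariance: Var(eps)=1.0, Var(eta)=0.5,
# Cov(eps, eta)=0.2; stacking these 2x2 blocks on the diagonal gives D.
cov = np.array([[1.0, 0.2],
                [0.2, 0.5]])
eps, eta = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

x_star = rng.normal(0.0, 2.0, size=n)    # latent regressor x_t^*
y_star = beta0 + beta1 * x_star          # latent y_t^*
Y = y_star + eps                         # observed response Y_t
X = x_star + eta                         # observed regressor X_t
```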

I have been reading Wayne A. Fuller's Measurement Error Models; it's a bit cumbersome and only deals with special structures for $D$.

Is there any well-established estimator for $(\beta_0,\beta_1)$? Something like generalized least squares for the case where there are no errors in the variables.

Manuel
  • Can you explain the first two lines of math a little more? This would make more sense to me without the second line of math. Maybe it is a typo? – Bill Sep 12 '13 at 19:16
  • The idea is that you know that $y^*_t=\beta_1x^*_t+\beta_0$, but you only observe $(Y_t,X_t)$, which are observations of the real values but with error. Does this help? – Manuel Sep 12 '13 at 19:25
  • Yes. I understand. – Bill Sep 12 '13 at 21:39

1 Answer


If you know the covariance matrix of the measurement errors, one strategy is to estimate the model by OLS, calculate the bias, and then correct for the bias. Write (going to matrix notation): \begin{align} Y &= X^*\beta + \epsilon\\ &= X\beta + \left(X^*-X\right) \beta + \epsilon\\ &= X\beta - \eta \beta + \epsilon \end{align} So, the OLS estimator is: \begin{align} \hat{\beta}_{OLS} &= \beta + \left( X'X\right)^{-1}X'\left( -\eta \beta + \epsilon \right)\\ &= \beta - \left( X'X\right)^{-1}X'\eta \beta + \left( X'X\right)^{-1}X' \epsilon\\ &= \beta - \left( X'X\right)^{-1} X^{*\prime}\eta \beta + \left( X'X\right)^{-1} X^{*\prime} \epsilon - \left( X'X\right)^{-1} \eta'\eta \beta + \left( X'X\right)^{-1} \eta' \epsilon \end{align} The first term is $\beta$. The second and third terms go in probability to zero by the usual arguments, requiring only that $X^*$ be uncorrelated with both $\eta$ and $\epsilon$. For the last two terms, $\frac{1}{N}\eta'\eta$ and $\frac{1}{N}\eta'\epsilon$ converge in probability to $D_{\eta\eta}$ and $D_{\eta\epsilon}$. So, the OLS estimator goes to: \begin{align} \hat{\beta}_{OLS} &\xrightarrow{P} \beta - Q^{-1}D_{\eta\eta}\beta + Q^{-1}D_{\eta\epsilon}\\ &= \left( I-Q^{-1}D_{\eta\eta}\right)\beta + Q^{-1}D_{\eta\epsilon} \end{align} If it is not clear from context, $D_{\eta\eta}$ is the part of the matrix $D$ that pertains only to the $\eta$, and $D_{\eta\epsilon}$ is the part of the matrix $D$ that pertains only to the covariances between the $\eta$ and the $\epsilon$. Also, the matrix $Q$ is the probability limit of $\hat{Q}=\frac{1}{N}X'X$.
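
A quick numerical check of this limit (a sketch only, reusing the illustrative simulation from the question; with i.i.d. errors, $D_{\eta\eta}$ and $D_{\eta\epsilon}$ reduce to the per-observation moments, padded with zeros for the error-free intercept column):

```python
import numpy as np

# Continuing from the simulation sketch in the question (X, Y, cov, n, beta0, beta1).
Xmat = np.column_stack([np.ones(n), X])            # design matrix with intercept
beta_ols = np.linalg.solve(Xmat.T @ Xmat, Xmat.T @ Y)

Q_hat = Xmat.T @ Xmat / n                          # estimate of Q
# The intercept column has no measurement error, so its rows/columns are zero.
D_ee = np.array([[0.0, 0.0],
                 [0.0, cov[1, 1]]])                # D_{eta eta}, k x k
D_ex = np.array([0.0, cov[0, 1]])                  # D_{eta epsilon}, k x 1

beta_true = np.array([beta0, beta1])
limit = ((np.eye(2) - np.linalg.solve(Q_hat, D_ee)) @ beta_true
         + np.linalg.solve(Q_hat, D_ex))
print(beta_ols)   # biased OLS estimate
print(limit)      # predicted limit: the two should roughly agree for large n
```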

There are two sources of bias here. The term $\left( I-Q^{-1}D_{\eta\eta} \right)$ represents something like the classical attenuation bias: the true coefficient gets multiplied by (1 - expression), where the expression is like a noise-to-signal ratio. It has the variance of $\eta$ in its "numerator" and the variance of $x^*$ plus the variance of $\eta$ in its "denominator" --- the quotes because these are matrix operations, not scalar operations, so there are not really numerators and denominators. When only one element of $X$ has measurement error, this formula literally gives the classical attenuation bias formula; here it gives something akin to it. The second term is like a classical endogeneity bias arising from the correlation between $X$, the variable we are actually using on the RHS, and $\epsilon$---where the correlation arises through $\eta$.
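
To make the two pieces concrete, in the scalar case (one mismeasured regressor with mean-zero $x^*$ and no intercept, so $Q = \sigma^2_{x^*} + \sigma^2_\eta$) the limit above reduces to:

$$\hat{\beta}_{OLS} \xrightarrow{P} \underbrace{\frac{\sigma^2_{x^*}}{\sigma^2_{x^*}+\sigma^2_\eta}}_{\text{attenuation}}\,\beta \;+\; \underbrace{\frac{\sigma_{\eta\epsilon}}{\sigma^2_{x^*}+\sigma^2_\eta}}_{\text{endogeneity through }\eta}$$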

Finally, a consistent estimator is: \begin{align} \tilde{\beta} = \left(I-\hat{Q}^{-1}D_{\eta\eta} \right)^{-1}\left(\hat{\beta}_{OLS} - \hat{Q}^{-1}D_{\eta\epsilon} \right) \end{align} It is a pretty good bet that this estimator is not efficient, since I have done nothing to adjust for heteroskedasticity or serial correlation, which the assumptions given do not rule out. You could calculate the variance of the estimator by pushing through more algebra, but I am not interested in doing it.
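
A minimal sketch of this corrected estimator in code, using the sample $\hat{Q}$ and continuing the illustrative simulation above (`Xmat`, `Q_hat`, `D_ee`, `D_ex`, and `beta_ols` are the assumed names from the earlier sketches):

```python
import numpy as np

# Continuing from the earlier sketches (Xmat, Q_hat, D_ee, D_ex, beta_ols).
# beta_tilde = (I - Qhat^{-1} D_ee)^{-1} (beta_ols - Qhat^{-1} D_ex)
k = Xmat.shape[1]
A = np.eye(k) - np.linalg.solve(Q_hat, D_ee)
beta_tilde = np.linalg.solve(A, beta_ols - np.linalg.solve(Q_hat, D_ex))
print(beta_tilde)   # should be close to (beta0, beta1) for large n
```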

If anyone wants to check/correct my algebra, much obliged.

Bill
  • I will take a better look at this on Monday, but it looks very interesting. Thanks very much for answering. – Manuel Sep 14 '13 at 02:29
  • Do you know where I can find references to this type of calculation, or papers where it is used? Thanks again. It gives me a new perspective on what I am doing. – Manuel Sep 17 '13 at 12:52
  • @Manuel Sure, the _Handbook of Econometrics_, volume 5, chapter 59 is all about this stuff. – Bill Sep 17 '13 at 15:07
  • There is a problem with the deduction. You say that $\eta'\eta \to D_{\eta \eta}$, but $\eta \in \mathcal{R}^{n \times 1}$, so $\eta'\eta = \sum_{k=1}^{n} \eta_k^2$. The same happens with $\eta'\varepsilon$. Is there some way to correct these? I can't find this problem in the Handbook of Econometrics; either they assume too few hypotheses or too many. – Manuel Oct 30 '13 at 14:53
  • forgot to put @Bill – Manuel Oct 30 '13 at 15:07
  • @Manuel Are you sure? $X$ is $N$ by $k$. So, $\eta$ is also $N$ by $k$, no? $D_{\eta\eta}$ is $k$ by $k$. $\epsilon$ is $N$ by $1$. So, $D_{\eta\epsilon}$ is $k$ by $1$. Let me know if you disagree. – Bill Oct 31 '13 at 21:33
  • I agree with that. But then, what is $D_{\eta \eta}$? I first understood it was the covariance matrix of $\eta \in \mathcal{R}^{N \times k}$, but that should be in $\mathcal{R}^{kN \times kN}$, not in $\mathcal{R}^{k \times k}$. The same happens with $D_{\eta \varepsilon}$. – Manuel Nov 05 '13 at 15:43
  • @Manuel Hmmmm. $D_{\eta\eta}$ is the probability limit of $\frac{1}{N}\eta'\eta$. So, the (1,1) element is $V(\eta_1)$. This is the variance of the measurement error on the first column of $X$. The (1,2) element is $Cov(\eta_1,\eta_2)$, the covariance between the measurement error on the first column of $X$ and the measurement error on the second column of $X$. Etc. So, I would call that the covariance matrix of $\eta$. I am implicitly assuming that the measurement error has the same distribution for different observations. That assumption is not acceptable in your implementation? – Bill Nov 05 '13 at 15:50
  • No, unfortunately that's not acceptable. The measurement errors have different moments for different observations. In the case when $k=1$, $\left(\varepsilon_1,\eta_1,\ldots,\varepsilon_n,\eta_n\right)\sim\mathcal{N}_{2n}\left(\mathbf{0},D \right)$ with $D \in \mathcal{R}^{2n \times 2n}$, and there is no special structure for $D$ except that it has full rank. – Manuel Nov 05 '13 at 16:06
  • @Manuel Literally no special structure is a problem. The (1,1) element of my $D_{\eta\eta}$ matrix is $plim_{N \rightarrow \infty} \frac{1}{N} \sum_{i=1}^N \eta_{i1}^2$. There is a generalization of the usual law of large numbers which will get that average to go to $lim \frac{1}{N} \sum_i V(\eta_{i1})$, as long as that limit exists. The existence of that limit is a very weak assumption on your $D$, but something like that will be necessary. If you can make these assumptions (for all elements of $D_{\eta\eta}$) and calculate the relevant limits, then what I say above still goes through. – Bill Nov 05 '13 at 16:15
  • @Manuel Or, if you know the elements of your $D$, you can use them to calculate $E(\frac{1}{N} \eta'\eta)$ and use that in place of $D_{\eta\eta}$ above (a minimal sketch of this computation appears after these comments). Though, to prove consistency, I think you are still going to need a bound on the lim sup. – Bill Nov 05 '13 at 16:23
  • Thanks so much @Bill. I am going to work on this. Just one more question: is there a reason to use $Q^{-1}$ instead of $\left(\frac{X'X}{N}\right)^{-1}$? The latter seems reasonable and avoids the issue of calculating the probability limit. It would be \begin{align} \tilde{\beta} = \left(I-\left(\frac{X'X}{N}\right)^{-1}D_{\eta\eta} \right)^{-1}\left(\hat{\beta}_{OLS} - \left(\frac{X'X}{N}\right)^{-1}D_{\eta\epsilon} \right) \end{align} – Manuel Nov 05 '13 at 17:25
  • @Manuel No, it is typical to use the latter to estimate the former. – Bill Nov 05 '13 at 18:55
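
For completeness, a minimal sketch of the computation Bill describes in the last comments: with a fully general (but known) $2n \times 2n$ $D$ and $k=1$, take $E\left(\frac{1}{n}\eta'\eta\right)$, the average of the $\mathrm{Var}(\eta_t)$ entries of $D$, in place of $D_{\eta\eta}$, and similarly for $D_{\eta\epsilon}$. The function name is illustrative; the ordering of $D$ is the one in the question:

```python
import numpy as np

def average_error_moments(D, n):
    """Average measurement-error moments from a general 2n x 2n D.

    Assumes D is ordered as (eps_1, eta_1, ..., eps_n, eta_n), matching
    the question, with a single mismeasured regressor (k = 1).
    """
    eps_idx = np.arange(0, 2 * n, 2)           # positions of the eps_t in D
    eta_idx = np.arange(1, 2 * n, 2)           # positions of the eta_t in D
    d_eta_eta = D[eta_idx, eta_idx].mean()     # E[(1/n) eta'eta]
    d_eta_eps = D[eta_idx, eps_idx].mean()     # E[(1/n) eta'eps]
    return d_eta_eta, d_eta_eps
```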