I really need some help with reverse regressions. I'm trying to solve the following exercise:

a) Consider the following Conditional Expectation Function model: $Y = \alpha + \beta X + \epsilon$, where $E[Y|X] = \alpha + \beta X$.

Suppose that we have transformed the above equation in the following way: $X = -(\alpha /\beta) + (1/\beta)Y - (1/\beta)\epsilon $

Show that this equation does not satisfy the definition of the CEF model.

b) Consider now the best linear predictor equation: $Y = \alpha_L + \beta_L X + \epsilon_L$

Consider as well the reverse best linear predictor equation: $X = \alpha^*_L + \beta^*_L Y + \epsilon^*_L$.

Show that in the Best Linear Predictor Setting $\beta_L \ne 1/\beta^*_L$

c) Assume there is an instrument $Z$. Show that $\beta_L = 1/\beta^*_L$ in the IV setting.

I would be really thankful if someone could explain the idea behind this problem about reverse regression and how it relates to the IV approach.


1 Answer

I'm sorry: I don't understand what you mean by point c. I'll reply to points a and b.

a) For the transformed equation $X = -(\alpha /\beta) + (1/\beta)Y - (1/\beta)\epsilon$ to satisfy the definition of the CEF model, we would need
$$E[-(1/\beta)\epsilon_i \mid Y_i]= 0 \;\Rightarrow\; E[\epsilon_i \mid Y_i] = 0$$
(assuming $\beta \neq 0$).

Suppose this condition held, and define $W_i=\epsilon_i Y_i$. By the Law of Total Expectation,
$$E[W_i]=E_{Y_i}\big[E[W_i \mid Y_i]\big]=E\big[Y_i \cdot E[\epsilon_i \mid Y_i]\big]=E[Y_i \cdot 0]=0,$$
so that
$$Cov(\epsilon,Y)=E[\epsilon Y]-E[\epsilon]E[Y]=0-0=0.$$

But from the original model $Y = \alpha + \beta X + \epsilon$, where $E[\epsilon \mid X]=0$ implies $Cov(X,\epsilon)=0$, we have
$$Cov(\epsilon,Y)=\beta\,Cov(X,\epsilon)+Cov(\epsilon,\epsilon)=\beta \cdot 0+Var(\epsilon)=\sigma^2>0.$$

This contradiction shows that $E[\epsilon_i \mid Y_i]=0$ cannot hold, so the reverse equation does not satisfy the CEF model. To be precise, the condition would only be satisfied in the degenerate case $\sigma^2=0$, i.e. $\epsilon_i=0$ for all $i$: the deterministic setting.
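As a quick numerical check (a minimal sketch; the parameter values $\alpha=1$, $\beta=2$, $\sigma=1$ and the seed are my own illustrative choices, not part of the exercise), a simulation confirms that $Cov(\epsilon, Y)$ equals $Var(\epsilon)$ rather than $0$, so the reverse equation cannot be a CEF, and the slope of $X$ on $Y$ is not $1/\beta$:

```python
import numpy as np

rng = np.random.default_rng(0)          # illustrative seed
n = 1_000_000
alpha, beta, sigma = 1.0, 2.0, 1.0      # illustrative parameter values

x = rng.normal(0.0, 1.0, n)             # Var(X) = 1
eps = rng.normal(0.0, sigma, n)         # E[eps | X] = 0 by construction
y = alpha + beta * x + eps

# Cov(eps, Y) is about Var(eps) = sigma^2 = 1, not 0:
print(np.cov(eps, y)[0, 1])

# Hence the OLS slope of X on Y is NOT 1/beta = 0.5:
print(np.cov(x, y)[0, 1] / np.var(y))   # ~ beta*Var(X)/(beta^2*Var(X)+sigma^2) = 0.4
```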

b) Recall that $\beta_L=Cov(X,Y)/Var(X)$ and $\beta^*_L=Cov(X,Y)/Var(Y)$. Let's suppose $\beta_L=1/\beta^*_L$. Then we get
$$\frac{Cov(X,Y)}{Var(X)}=\frac{Var(Y)}{Cov(X,Y)} \;\Rightarrow\; Cov(X,Y)^2=Var(X)\,Var(Y).$$
Writing $Cov(X,Y)=Corr(X,Y)\,SD(X)\,SD(Y)$, this becomes
$$Corr(X,Y)^2\,Var(X)\,Var(Y)=Var(X)\,Var(Y) \;\Rightarrow\; Corr(X,Y)=+1 \text{ or } Corr(X,Y)=-1.$$
Again, the condition only holds in case the relationship between $X$ and $Y$ is deterministic.
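The same point can be seen numerically (again a sketch with an illustrative data-generating process of my own choosing): the product $\beta_L \cdot \beta^*_L$ equals $Corr(X,Y)^2$, which falls strictly below $1$ whenever the noise is non-degenerate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # noisy linear relationship

beta_L = np.cov(x, y)[0, 1] / np.var(x)        # slope of Y on X
beta_L_star = np.cov(x, y)[0, 1] / np.var(y)   # slope of X on Y

print(beta_L, 1 / beta_L_star)           # ~2.0 vs ~2.5: not equal
print(beta_L * beta_L_star)              # ~0.8 = Corr(X, Y)^2 < 1
print(np.corrcoef(x, y)[0, 1] ** 2)      # matches the product above
```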

In fact, Galton (1886) showed that if you normalize $X$ and $Y$ to have the same variance, the coefficients from regressing $Y$ on $X$ and $X$ on $Y$ coincide and equal the correlation coefficient (thus lying between $-1$ and $1$). In case of perfect association and equal variances of $X$ and $Y$, the coefficient would indeed be $+1$ or $-1$. This link may help: http://davegiles.blogspot.it/2014/11/reverse-regression-follow-up.html
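A quick sketch of Galton's observation (using the same illustrative simulated data as above): after standardizing both variables, both regression slopes equal the correlation coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Standardize both variables (Galton's setting: equal variances)
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()

print(np.cov(zx, zy)[0, 1] / np.var(zx))   # slope of zy on zx
print(np.cov(zx, zy)[0, 1] / np.var(zy))   # slope of zx on zy
print(np.corrcoef(x, y)[0, 1])             # both equal Corr(X, Y) ~ 0.894
```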

Also, you may be interested in these Q&As:

- What is the difference between linear regression on y with x and x with y?
- Effect of switching response and explanatory variable in simple linear regression

Galton, Francis (1886): “Regression Towards Mediocrity in Hereditary Stature,” The Journal of the Anthropological Institute of Great Britain and Ireland, 15, 246-263.