
I have a pretty basic question about conditional expectation that is stumping me.

Consider the real-valued random variables $Y$, $X$ and $e$, where $E[e] = 0$ and $X$ and $e$ are independent. Assuming a linear relationship, we can write the standard univariate regression equations, $$ Y = a + bX + e $$ $$ E[Y|X]=a + bX $$ But instead of this I need, $E[X|Y]$. So first we rearrange the equation, $$ X = (-a+Y -e)/b, $$ then take the conditional expectation, $$ E[X|Y]=(-a +Y-E[e|Y])/b. $$ (See also: What is the problem about Reverse Regression and how does the IV approach help to solve it?).

My question is: How do you simplify this further? As it stands, I can't tell what the slope of this line is since we have the variable $Y$ in two places: $Y-E[e|Y]$, instead of in one place to give an equation of the form $y=mx+b$.

I have tried this: calculate $E[e|Y]=E[eY]/E[Y]=E[e(a+bX+e)]/E[Y]=Var[e]/E[Y]$. But then I'm confused because this is a constant, when I expect that, since it's a conditional expectation, it should be a function of $Y$ and not a constant for all $Y$.

So, did I calculate $E[e|Y]$ correctly, in which case: $$ E[X|Y]=(1/b)Y-Var[e]/(bE[Y])-a/b $$ so that the slope is $m=1/b$ and the intercept is $-Var[e]/(bE[Y])-a/b$?

But if so, I have a very basic confusion about conditional expectation. Why is $E[e|Y]$ a constant but $E[X|Y]$ and $E[Y|X]$ are not? If I do the same calculation for $E[X|Y]$ I get, $E[X|Y]=E[XY]/E[Y]=E[X(a+bX+e)]/E[Y]=bVar[X]/E[Y]$, which is a constant. What mistakes am I making?

Any help is much appreciated!

user62421
  • I fear that you consider the identity $E[u|Y] = E[uY]/E[Y]$ as holding for any r.vs $u$, $Y$ which is not true. Try for example $u:= 1/Y$. – Yves Jun 18 '21 at 07:11

1 Answer


In general, the reverse regression will not actually describe $E[X|Y]$. For example if $X$ is binary 0/1 then $E[X|Y=y]=P(X=1|Y=y)$, and this is a nonlinear sigmoid function of $Y$.
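A minimal numeric sketch of this, under entirely hypothetical parameter choices ($X \sim \mathrm{Bernoulli}(0.5)$, $Y = X + e$ with $e \sim N(0,1)$): by Bayes' rule, $E[X|Y=y]=P(X=1|Y=y)$, and evaluating it at a few points shows equal steps in $y$ do not produce equal steps in $E[X|Y=y]$, i.e. the function is nonlinear.

```python
import math

def phi(z):
    # Standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def e_x_given_y(y, p=0.5, a=0.0, b=1.0, s=1.0):
    # Bayes' rule: E[X|Y=y] = P(X=1|Y=y) for binary X,
    # where Y = a + bX + e, e ~ N(0, s^2), X ~ Bernoulli(p)
    num = p * phi((y - a - b) / s)
    den = num + (1 - p) * phi((y - a) / s)
    return num / den

for y in (-2, 0, 2, 4):
    print(y, round(e_x_given_y(y), 3))
```

With these parameters the log-odds of $X=1$ given $Y=y$ are linear in $y$, so $E[X|Y=y]$ is exactly a logistic sigmoid, not a straight line.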

We can still ask what the least-squares line for regressing $X$ on $Y$ is. It won't be $E[X|Y]$ but it may be of interest.

The least squares reverse regression has slope $\mathrm{cov}[X,Y]/\mathrm{var}[Y]$ and the forward regression has slope $\mathrm{cov}[X,Y]/\mathrm{var}[X]$. So, if the forward OLS regression has slope $\beta$, the reverse OLS regression has slope $$\beta\frac{\mathrm{var}[X]}{\mathrm{var}[Y]}$$
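A quick simulation sketch of the slope relationship above (the values of $a$, $b$, and the variances are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for illustration
a, b = 1.0, 2.0
n = 200_000
X = rng.normal(0.0, 1.5, n)
e = rng.normal(0.0, 1.0, n)
Y = a + b * X + e

cov_xy = np.cov(X, Y)[0, 1]          # sample covariance (ddof=1 by default)
var_x = np.var(X, ddof=1)
var_y = np.var(Y, ddof=1)

beta_fwd = cov_xy / var_x            # OLS slope, regressing Y on X; estimates b
beta_rev = cov_xy / var_y            # OLS slope, regressing X on Y

print(beta_fwd)                      # close to b = 2
print(beta_rev)
print(beta_fwd * var_x / var_y)      # algebraically identical to beta_rev
```

The last two printed values agree exactly, since $\mathrm{cov}[X,Y]/\mathrm{var}[Y] = (\mathrm{cov}[X,Y]/\mathrm{var}[X])\cdot\mathrm{var}[X]/\mathrm{var}[Y]$ by construction; the simulation just makes the ratio concrete.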

Thomas Lumley