
It is assumed that:

1) $y=q+u$

where $q$ is productivity and $y$ is a test score that measures true productivity. $u$ is a normally distributed error term, independent of $q$, with zero mean and constant variance; $q$ is also assumed to be normally distributed with mean $\alpha$ and constant variance. The outcome of this is:

2) $E(q | y) = (1-\gamma)\alpha + \gamma y$
where $\gamma=\text{Var}(q)/(\text{Var}(q)+\text{Var}(u))$

How do you get Equation (2)? The equation can be expressed as a group effect and an individual effect.

[This is a model of statistical discrimination; see: Dennis J. Aigner and Glen G. Cain. Statistical theories of discrimination in labor markets. Industrial and Labor Relations Review, 30(2):175–187, January 1977. URL: http://ideas.repec.org/a/ilr/articl/v30y1977i2p175-187.html.]

Fusscreme

4 Answers


All you need to know is that the regression of $q$ on $y$ can be found by standardizing both variables, whereupon their correlation coefficient is the slope.

(In particular this result owes nothing to the assumptions that distributions are Normal; the independence of $q$ and $u$ is sufficient. Thus it will be most revealing to obtain it without recourse to any properties of Normal distributions.)


Preliminary Calculations

To standardize a variable, you subtract its expectation and divide by its standard deviation. We will therefore need to compute standard deviations, expectations, and a correlation coefficient.

Because $y=q+u$,

$$\mathbb{E}(y) = \mathbb{E}(q+u) = \mathbb{E}(q) + \mathbb{E}(u) = \alpha + 0 = \alpha,$$

taking care of computing the expectations.

Turn now to the standard deviations. Recall that it's simpler to work with their squares: the variances. For brevity, write $\sigma^2$ for the variance of $q$ and $\tau^2$ for the variance of $u$. Then

$$\text{Var}(y) = \text{Var}(q+u) = \text{Var}(q) + \text{Var}(u) + 2\text{Cov}(u,q) = \sigma^2 + \tau^2 + 0 = \sigma^2 + \tau^2.$$

Finally, the correlation is computed from the covariance:

$$\text{Cov}(y, q) = \text{Cov}(q+u, q) = \text{Cov}(q,q) + \text{Cov}(u,q) = \sigma^2.$$

(Both these calculations used the simplification $\text{Cov}(u,q)=0$ arising from the independence of $u$ and $q$.)

Therefore the standardized variables are $$\eta = (y-\alpha)/\sqrt{\sigma^2+\tau^2}$$ and $$\theta=(q-\alpha)/\sigma.$$

Moreover, the correlation is $$\rho=\sigma^2/\left(\sigma\sqrt{\sigma^2+\tau^2}\right) = \sigma / \sqrt{\sigma^2+\tau^2}.$$
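
These moment formulas are easy to check by simulation. Here is a minimal sketch, with $\alpha=2$, $\sigma^2=4$, $\tau^2=1$ chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
alpha, sigma2, tau2 = 2.0, 4.0, 1.0  # arbitrary illustrative values

q = rng.normal(alpha, np.sqrt(sigma2), size=n)  # productivity
u = rng.normal(0.0, np.sqrt(tau2), size=n)      # independent error
y = q + u                                       # observed test score

print(np.var(y))                # ~ sigma2 + tau2 = 5
print(np.cov(y, q)[0, 1])       # ~ sigma2 = 4
print(np.corrcoef(y, q)[0, 1])  # ~ sigma/sqrt(sigma2 + tau2) ≈ 0.894
```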


Solution

We have computed everything necessary to regress $q$ against $y$:

$$\mathbb{E}(\theta\ |\ \eta) = \rho\, \eta.$$

(This is a fact about geometry, really: see the "Conclusions" section at https://stats.stackexchange.com/a/71303 for the derivation, which--although it is illustrated there for Normal distributions--still does not require Normality to derive.)

Expanding, and once again exploiting linearity of expectation,

$$\frac{\mathbb{E}(q\ |\ y)-\alpha}{\sigma} = \mathbb{E}(\theta\ |\ \eta) = \rho\, \eta = \frac{\sigma}{\sqrt{\sigma^2+\tau^2}}\left(\frac{y-\alpha}{\sqrt{\sigma^2+\tau^2}}\right) = \frac{\sigma(y-\alpha)}{\sigma^2+\tau^2}.$$

It is the task of ordinary algebra to convert this back to an expression for $\mathbb{E}(q\ |\ y)$ in terms of $y$, because (insofar as $\mathbb{E}(q\ |\ y)$ is concerned) all variables now represent numbers:

$$\mathbb{E}(q\ |\ y) = \frac{\tau^2}{\sigma^2+\tau^2} \alpha + \frac{\sigma^2}{\sigma^2+\tau^2} y.$$

That is Equation (2). Casting an eye back over the calculations should relieve any mystery about where these coefficients came from or what they mean.
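
Equation (2) itself can also be confirmed by Monte Carlo: since $\mathbb{E}(q\ |\ y)$ is linear in $y$, an ordinary least-squares fit of simulated $q$ on $y$ should recover slope $\gamma=\sigma^2/(\sigma^2+\tau^2)$ and intercept $(1-\gamma)\alpha$. A minimal sketch, with the same arbitrary parameter values as above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
alpha, sigma2, tau2 = 2.0, 4.0, 1.0  # arbitrary illustrative values
gamma = sigma2 / (sigma2 + tau2)     # = 0.8

q = rng.normal(alpha, np.sqrt(sigma2), size=n)
y = q + rng.normal(0.0, np.sqrt(tau2), size=n)

# E(q | y) is linear in y, so OLS recovers its slope and intercept.
slope, intercept = np.polyfit(y, q, deg=1)
print(slope, gamma)                    # both ~ 0.8
print(intercept, (1 - gamma) * alpha)  # both ~ 0.4
```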

whuber
  • Very interesting, will have to work through the long post you linked to later. Just a quick question: Where does the linearity of the conditional expectation function come from? It really is sufficient that y is a linear function of independent random variables, regardless of how they are distributed? – CloseToC May 19 '14 at 20:35
  • @CloseToC A pretty thorough and general discussion of these relationships between correlation and regression appears at http://www.math.uah.edu/stat/sample/Covariance2.html. – whuber May 19 '14 at 21:21

The model implies that $y\mid q\sim\mathcal{N}(q,\sigma^2_u)$ and $q\sim\mathcal{N}(a,\sigma^2_q)$. By Bayes' rule:
$$p(q\mid y)\propto p(y\mid q,\sigma^2_u)\,p(q)$$
Ignoring constant factors (see here for a similar development):
$$\begin{align}p(q\mid y) & \propto \exp\left\{-\frac{(y-q)^2}{2\sigma^2_u}-\frac{(q-a)^2}{2\sigma^2_q}\right\}\\ &=\exp\left\{-\frac{1}{2}\left(\frac{y^2-2yq+q^2}{\sigma^2_u}+\frac{q^2-2qa+a^2}{\sigma^2_q}\right)\right\}\end{align}$$
Any term that does not include $q$ can be absorbed into the proportionality constant:
$$\begin{align} &\propto\exp\left\{-\frac{1}{2}\,\frac{-2\sigma^2_q yq+\sigma^2_q q^2+\sigma^2_u q^2-2\sigma^2_u qa}{\sigma^2_u\sigma^2_q}\right\}\\ &=\exp\left\{-\frac{1}{2}\,\frac{(\sigma^2_q+\sigma^2_u)q^2-2(\sigma^2_u a+\sigma^2_q y)q}{\sigma^2_u\sigma^2_q}\right\}\\ &=\exp\left\{-\frac{1}{2}\,\frac{q^2-2q\,\frac{\sigma^2_u a+\sigma^2_q y}{\sigma^2_q+\sigma^2_u}}{\frac{\sigma^2_q\sigma^2_u}{\sigma^2_q+\sigma^2_u}}\right\}\propto \exp\left\{-\frac{1}{2}\,\frac{\left(q-\frac{\sigma^2_u a+\sigma^2_q y}{\sigma^2_q+\sigma^2_u}\right)^2}{\frac{\sigma^2_q\sigma^2_u}{\sigma^2_q+\sigma^2_u}}\right\}\end{align}$$
Therefore:
$$E(q\mid y)=\frac{\sigma^2_u a+\sigma^2_q y}{\sigma^2_q+\sigma^2_u} =\left(1-\frac{\sigma^2_q}{\sigma^2_q+\sigma^2_u}\right)a+\frac{\sigma^2_q}{\sigma^2_q+\sigma^2_u}y$$
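
The completing-the-square step is easy to verify symbolically, e.g. with SymPy; the following sketch (symbol names are mine) checks that the original exponent and the completed square differ only by terms free of $q$:

```python
import sympy as sp

q, y, a = sp.symbols('q y a', real=True)
su2, sq2 = sp.symbols('su2 sq2', positive=True)  # sigma_u^2, sigma_q^2

# Exponent of the posterior kernel from Bayes' rule (constants dropped).
kernel = -sp.Rational(1, 2) * ((y - q)**2 / su2 + (q - a)**2 / sq2)

# Claimed completed square: Gaussian with this posterior mean and variance.
mu = (su2 * a + sq2 * y) / (sq2 + su2)
v = sq2 * su2 / (sq2 + su2)
completed = -(q - mu)**2 / (2 * v)

# The difference must not depend on q, so its q-derivative is zero.
print(sp.simplify(sp.diff(kernel - completed, q)))  # 0
```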

Sergio

Another way, the shortest one ;-)

In general, if $X$ and $Y$ have a bivariate normal distribution, then (Anderson, Theorem 2.5.1): $$E[X\mid Y]=E[X]+\frac{\text{Cov}(X,Y)}{V[Y]}(Y-E[Y])$$ and, when $E[X]=E[Y]$ (as in your model, where both equal $a$), this becomes $$E[X\mid Y]=\left(1-\frac{\text{Cov}(X,Y)}{V[Y]}\right)E[X]+\frac{\text{Cov}(X,Y)}{V[Y]}Y,$$ i.e. the well-known result that the expected value of $X$ given $Y$ is a weighted average of the mean of $X$ and $Y$.

In your model $E[q]=E[y]=a$, $V[y]=\sigma^2_q+\sigma^2_u$ and $\text{Cov}(y,q)=\sigma^2_q$ (see whuber's answer), so: $$E[q\mid y]=a+\frac{\sigma^2_q}{\sigma^2_q+\sigma^2_u}(y-a)= \left(1-\frac{\sigma^2_q}{\sigma^2_q+\sigma^2_u}\right)a+\frac{\sigma^2_q}{\sigma^2_q+\sigma^2_u}y$$
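
This closed form can also be checked by brute force: simulate the model, keep only draws with $y$ in a thin slice around some $y_0$, and compare the empirical mean of $q$ with the formula. A sketch with arbitrary parameter values:

```python
import numpy as np

rng = np.random.default_rng(2)
a, sq2, su2 = 2.0, 4.0, 1.0  # arbitrary illustrative values

q = rng.normal(a, np.sqrt(sq2), size=2_000_000)
y = q + rng.normal(0.0, np.sqrt(su2), size=2_000_000)

y0 = 3.0                                 # condition on y close to y0
near = np.abs(y - y0) < 0.05
print(q[near].mean())                    # empirical E[q | y ≈ y0]
print(a + sq2 / (sq2 + su2) * (y0 - a))  # closed form: 2 + 0.8*1 = 2.8
```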

Sergio
  • Yes, it's the shortest in *length* -- but only because it relies on material that has already been posted! I think you would find it a challenge to find any answer shorter than the first line of mine, which I believe captures the essence of the question (and is more general than the theorem you quote). – whuber May 19 '14 at 21:42
  • Sorry, I can't understand. Did anyone quote Anderson? My second answer is "it is a well-known result". I've suggested my shortest-with-smile second answer just because I don't know what Fusscreme is looking for. If he is going to quote Aigner and Cain in a paper but must expound their statement, then something like "see Anderson, 2003, Theorem 2.5.1" could be the best way. – Sergio May 19 '14 at 21:52
  • Indeed this is very useful for me. Thank you both and everyone else very much; your help is much appreciated. – Fusscreme May 22 '14 at 07:02

I think the following argument shows why; unfortunately it's a bit messy. Much more elegant derivations certainly exist, as the linear Gaussian case is the best-understood statistical model in existence.

Anyway, we have that:

  1. $U\sim\mathcal{N}(0,\sigma^2)$

  2. $Q\sim\mathcal{N}(\alpha,\beta^2)$

  3. $Y=Q+U$.

  4. $U$ and $Q$ are independent.

Because a linear function of normal random variables is itself normal, and because $U$ and $Q$ are independent, it follows that $Y\mid Q \sim \mathcal{N}(Q,\sigma^2)$.

We can now write down the probability density function of $Q$ conditional on $Y=y$. By Bayes' theorem that's:

$$\frac{(\text{pdf of } Q)\times(\text{pdf of } Y\mid Q)}{\text{pdf of } Y}.$$

I won't write this out because it's very messy with all the Gaussian densities.

$Q\mid Y$ will be a normal random variable, which means that its mode is its mean. Ignoring the denominator (the normalizing constant), we're left with:

$$\frac{1}{2\pi\sigma\beta}\,\exp\{-\text{Something}(q)\}$$

We find the mode of the posterior distribution by choosing $q$ so as to maximise the density. That's going to be the conditional expected value too, because the mode is the mean for a Gaussian. To do that we can ignore everything except $\text{Something}(q)$, because the rest isn't a function of $q$.

If you do the algebra, $$\text{Something}(q) = \frac{1}{2}\left(\frac{(y-q)^2}{\sigma^2} + \frac{(q-\alpha)^2}{\beta^2}\right).$$

If you differentiate with respect to $q$ and set the derivative to zero,

$$\frac{q-y}{\sigma^2}+\frac{q-\alpha}{\beta^2}=0,$$

then solving for $q$ gives:

$$q=\frac{\beta^2}{\beta^2+\sigma^2}\,y+\frac{\sigma^2}{\beta^2+\sigma^2}\,\alpha\,,$$ as required!
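
The differentiate-and-solve step can be double-checked symbolically, e.g. with SymPy (a sketch; the symbol names are mine):

```python
import sympy as sp

q, y, alpha = sp.symbols('q y alpha', real=True)
sigma2, beta2 = sp.symbols('sigma2 beta2', positive=True)  # sigma^2, beta^2

# Something(q): the q-dependent part of the negative log-posterior.
something = sp.Rational(1, 2) * ((y - q)**2 / sigma2 + (q - alpha)**2 / beta2)

# Differentiate, set to zero, solve for q: the posterior mode (= mean).
q_star = sp.solve(sp.Eq(sp.diff(something, q), 0), q)[0]
print(sp.simplify(q_star - (beta2*y + sigma2*alpha) / (beta2 + sigma2)))  # 0
```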

CloseToC