4

Could someone please clarify the part highlighted in red? Why the conditional density? I am having a hard time understanding why the statement is about a conditional density.

I don't understand why $e \sim N(0,\sigma^2)$ implies that $Y \mid X$ is Normal. I would like to have a proof .. not just words about the concept.

Thank you.

gioxc88
  • It's an attempt to apply the definition of "$e\sim N(0,\sigma^2)$". All that's needed is to plug $e = y-(x^\prime\beta+ \alpha)$ into the usual formula for the Normal density. Somehow $\alpha$ was omitted--but isn't it obvious where it belongs? – whuber Sep 18 '17 at 20:14
  • I don't understand why saying that $ e \sim N( 0, \sigma^2)$ implies that $Y \mid X $. I would like to have a proof .. not just words about the concept – gioxc88 Sep 19 '17 at 11:12
  • @HardCore What do you mean "$e \sim N(0, \sigma)$ implies that Y|X"? Are you misreading the conditional operator to mean that $Y$ depends on $X$? $Y,X$ are (in this presentation) jointly observed RVs, so a conditional density exists. Since $e$ is independent of $X$, $Y|X$ is a sequence of constants plus a normal RV, making $Y|X$ normal. – AdamO Sep 19 '17 at 12:36
  • In the main post I highlighted a sentence in red. That sentence says the same thing I wrote. You are right !!! I was gonna write $ e \sim N(0,\sigma^2)$ implies that $Y \mid X$ is normal. Sorry – gioxc88 Sep 19 '17 at 12:46
  • @HardCore Do you accept that if an RV $U$ has a normal $N(a, b)$ distribution then, $U + c$ ($c$ a constant) has a $N(a+c, b)$ distribution? This can be made rigorous by moment generating functions or convolutions, which you should have been exposed to prior to approaching regression modeling. – AdamO Sep 19 '17 at 12:49
  • Of course I accept it .. I can't understand how you can start from a definition about the unconditional density of $e$ and end with an implication about the conditional density of $Y \mid X$ – gioxc88 Sep 19 '17 at 12:51
  • @HardCore well if we condition on $X$, $\mathbf{x}^\prime \beta + \alpha$ is just a constant, and $Y$ is $e$ plus that constant. In the last display, they simply write the density as whuber says. – AdamO Sep 19 '17 at 12:59
  • Of course that is totally untrue. If you condition on $X$, the conditional expected value is a RV. Only once you know $X$ does the conditional expectation become known. I don't know your background, but in measure theory the first thing they teach you is that $E(Y \mid X)$ is a RV – gioxc88 Sep 19 '17 at 13:12
  • @HardCore But the regression model conditions on the observed (i.e. known) $X$'s. Moreover, please [**do not be rude**](https://stats.stackexchange.com/help/be-nice). AdamO is trying to help and you are being rude to him. – Tim Sep 19 '17 at 13:24
  • I am sorry Tim .. it's just my way of arguing .. It wasn't my intention!! Sorry if I gave that impression – gioxc88 Sep 19 '17 at 13:29

3 Answers

2

The simple linear regression model (let's focus on the single-predictor case for simplicity) describes the relationship of the dependent variable $Y$ with the independent variable $X$. It tells us what value of $Y$ we can expect when $X = x$, i.e. it models the conditional expectation

$$ E(Y|X) = \mu = \alpha + \beta X $$

where $Y|X \sim \mathcal{N}(\mu, \sigma^2)$, since $Y$ is $e + \mu$ with $e \sim \mathcal{N}(0, \sigma^2)$. The expectation is conditional by definition, because we are interested in the relationship between the variables. If we weren't, we would simply ask "what is the expected value of $Y$?" and wouldn't care about $X$. Finally, if $Y$ and $X$ were independent, then the model would simplify to

$$ E(Y|X) = \alpha + 0 \times X = \alpha $$

where $\alpha$ would be a single-value summary statistic that describes $Y$ and minimizes the squared error, i.e. the mean of $Y$.
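
For concreteness, here is a minimal simulation sketch (using numpy, with arbitrary illustrative values for $\alpha$, $\beta$ and $\sigma$): once we condition on a single value $X = x$, the simulated $Y$'s are just that constant plus the normal error, so they have mean $\alpha + \beta x$ and standard deviation $\sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, sigma = 1.0, 2.0, 0.5          # arbitrary illustrative parameters

x = 3.0                                      # condition on a single value of X
e = rng.normal(0.0, sigma, size=100_000)     # e ~ N(0, sigma^2)
y_given_x = alpha + beta * x + e             # Y | X = x is a constant plus e

print(y_given_x.mean(), alpha + beta * x)    # both close to 7.0: conditional mean
print(y_given_x.std(), sigma)                # both close to 0.5: same sigma as e

# With beta = 0 the conditional mean no longer depends on x:
# E(Y | X) = alpha, which is simply the mean of Y.
```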

Tim
  • **I don't understand why saying that $e \sim N(0,\sigma^2)$ implies that $Y \mid X$ is Normal. I would like to have a proof .. not just words about the concept** – gioxc88 Sep 19 '17 at 12:19
  • @HardCore this comes from the properties of expectations $E(X+c) = E(X) +c$. If you add a constant to the random variable you shift its mean by the constant. Same if you add random variables, $E(X+Y) =E(X)+E(Y)$, and if you [sum normally distributed random variables](https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables). You are simply shifting the mean from $0$ to $\mu$. – Tim Sep 19 '17 at 12:23
  • Yes of course I know that, but first you consider the unconditional and then the conditional. And again, why do you all keep saying that $X$ is fixed? $X$ is a random variable that has a joint distribution with $Y$. If I want to study the effect of, say, the weight $Y$ on the height $X$, when I draw a sample I observe the two RVs $X$ and $Y$ jointly. They have a joint density $f(x,y)$. They are both RVs. Why should $X$ be fixed in this context? – gioxc88 Sep 19 '17 at 12:26
  • Tim, I think we are talking about different things. Thank you very much for the answer though. – gioxc88 Sep 19 '17 at 12:31
  • @HardCore when using regression you *don't* estimate the joint distribution, you simplify this problem to estimating the conditional expectation by conditioning on the *observed* $X$'s. – Tim Sep 19 '17 at 12:36
  • Tim, the link you posted is about fixed effects. Fixed effects refer to model parameters, in this case $\beta$. The term fixed effect has nothing to do with the fact that $X$ is a RV, as it is here. I think you misunderstood the meaning of the term fixed effect. – gioxc88 Sep 19 '17 at 12:55
  • @HardCore check this thread: https://stats.stackexchange.com/questions/246047/independent-variable-random-variable – Tim Sep 19 '17 at 12:59
  • Thanks for the post, I am reading it, but again the distinction made in the post you linked between fixed and random regressors has nothing to do with the previous Wikipedia page you linked, where the subject is fixed vs random EFFECTS, not REGRESSORS .. they are two different things. I want to clarify this just because someone reading it may get confused. – gioxc88 Sep 19 '17 at 13:08
0

In regression modeling, $X$ (a matrix or vector of covariate values) is considered fixed, or given by design. Whether by randomized assignment or by a sampling strategy (even simple random sampling), this is often a reflection of how the data are collected. If we think of $X$ as a constant, then we can speak simply of the density of $Y$.

$Y$ is then a bunch of constants plus a normal term, making the $Y$s normally distributed with some mean and standard deviation. This is nice because we can use the observed variables ($Y$ and $X$) to do maximum likelihood and estimate $\beta$ even though we never observe $\epsilon$. If we knew $\epsilon$, regression would be simple arithmetic: subtract the residual error and solve a system of equations.
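
To illustrate that last point, here is a rough sketch (using numpy, with made-up values for $\beta$ and $\sigma$): treating $X$ as fixed, the maximum-likelihood fit under normal errors is just least squares, and it recovers $\beta$ and $\sigma$ without $\epsilon$ ever being observed.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.uniform(0.0, 10.0, n)])  # fixed design: intercept + one covariate
beta_true = np.array([1.0, 2.0])                              # illustrative values
sigma = 0.5

y = X @ beta_true + rng.normal(0.0, sigma, n)                 # epsilon itself is never observed separately

# With normal errors and X fixed, maximum likelihood for beta is ordinary least squares:
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
sigma_hat = np.sqrt(residuals @ residuals / n)                # ML estimate of sigma (divides by n)

print(beta_hat)     # close to [1.0, 2.0]
print(sigma_hat)    # close to 0.5
```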

AdamO
  • I can't see how $X$ can be considered fixed, as it is a random variable. In fact, the conditional expectation $E(Y \mid X)$ itself is a random variable. When you take a conditional expectation you are actually conditioning on the $\sigma$-field generated by $X$, $\sigma(X)$, and you couldn't do that if $X$ were fixed. Every time you draw a sample you observe a different value of $X$. Of course once the sample has been drawn $X$ is fixed, but in that case everything is fixed. – gioxc88 Sep 19 '17 at 12:22
  • @HardCore this is a basic tenet of probability theory: when you condition on something, it is no longer random. In a formula: $E[X|X] = X$. If I throw a coin, it is either heads or tails; the randomness comes about when I speak of repeatedly throwing the coin in the same fashion. – AdamO Sep 26 '17 at 19:58
0

After a while, I came back to this question, which had not received a formal answer so far.

I was finally able to give a formal proof of the statement. I will give it in the one-dimensional case; the extension to $k$ regressors is straightforward and does not change the argument below.

Let's begin by stating the hypotheses:

  1. $Y = \beta X + e $

  2. $e \sim N(0, \sigma^2)$

  3. $X$ and $e$ are independent

We want to prove that $f_{Y \mid X}(y \mid x)$ is the density of a $N(\beta x, \sigma^2)$ distribution.

We are not making any assumptions about the distribution of $X$, and I want to point out that $X$ is RANDOM here, unlike in the answers above where $X$ is treated as FIXED.

Consequently $X$ has a probability distribution. That being said, let's begin with the proof:

$f_{Y \mid X}(y \mid x) = \dfrac{f_{YX}(y,x)}{f_{X}(x)}$

$f_{YX}(y,x) = f_{eX}(y - \beta x,\, x)$

This equality holds because it is a simple bivariate transformation: $(e, X) \mapsto (Y, X) = (\beta X + e, X)$, whose Jacobian equals $1$.

The independence of $e$ and $X$ implies that

$f_{eX}(y - \beta x,\, x) = f_e(y - \beta x)\, f_X(x)$

Finally, since $e \sim N(0, \sigma^2)$

$f_{Y \mid X}(y \mid x) = \dfrac{f_e(y - \beta x)\, f_X(x)}{f_X(x)} = f_e(y - \beta x) = \dfrac{1}{\sigma \sqrt{2\pi}} \exp{\bigg(-\dfrac{1}{2} \Big(\dfrac{y - \beta x}{\sigma} \Big)^2 \bigg)}$

This concludes the proof, since the last equality implies that

$Y \mid X = x \;\sim\; N(\beta x, \sigma^2)$
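
As a quick numerical check of this result (a simulation sketch with numpy, arbitrary values for $\beta$ and $\sigma$, and a deliberately non-normal distribution for $X$, since nothing is assumed about $f_X$): the $Y$'s whose $X$ falls near a chosen $x$ behave like draws from $N(\beta x, \sigma^2)$.

```python
import numpy as np

rng = np.random.default_rng(2)
beta, sigma = 2.0, 0.5                     # illustrative values
n = 1_000_000

X = rng.exponential(scale=1.0, size=n)     # arbitrary, non-normal distribution for X
e = rng.normal(0.0, sigma, size=n)         # independent of X, e ~ N(0, sigma^2)
Y = beta * X + e

x0 = 1.5
near_x0 = np.abs(X - x0) < 0.01            # approximate conditioning on X = x0
print(Y[near_x0].mean(), beta * x0)        # both close to 3.0
print(Y[near_x0].std(), sigma)             # both close to 0.5
```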

gioxc88