I have seen the following stated in multiple sources: if the errors in a linear model ($y_i = \beta x_i + \epsilon_i$) follow $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$, then $x_i|\beta \sim \mathcal{N}(0, \sigma^2)$, the same distribution. Here is one source that states this, starting at around 2:20: https://www.youtube.com/watch?v=_-Gnu498s3o

If the error terms are Gaussian distributed, why does this imply that the independent and dependent variables are also Gaussian distributed?

user5965026
  • I've watched the video and I believe the creator of the video made a typo when they wrote the likelihood. He wrote $f(x_i | \beta, \sigma^2)$ but he should have written $f(y_i | \beta, \sigma^2)$. That is, $$y_i | \beta, \sigma^2 \sim \mathcal N (\beta x_i , \sigma^2)$$ – SOULed_Outt May 25 '20 at 06:39
  • Also, depending on how you'd like to treat $x_i$ you may want to write the likelihood as $f(y_i | x_i, \beta, \sigma^2)$. – SOULed_Outt May 25 '20 at 06:40
  • @SOULed_Outt Ah, maybe that's what he meant. I think the latter form you commented is the one that I see most often. Although, I think I usually see the semicolon usage $f(y_i | x_i ; \beta)$. Andrew Ng's CS229 notes use the semicolon notation to indicate that we're not conditioning on $\beta$. – user5965026 May 25 '20 at 06:43
  • Perhaps the notation is to make it clearer that you're treating the parameters as fixed values (i.e. not random variables). Then it would be better to say $$y_i | x_i; \beta, \sigma^2 \sim \mathcal N (\beta x_i, \sigma^2)$$ and $$f(y_i | x_i; \beta, \sigma^2)$$ – SOULed_Outt May 25 '20 at 08:06
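
To make the notation in these comments concrete, here is the likelihood contribution of a single observation written out under the model in the question. This is a standard normal-density expression consistent with the corrected $y_i | x_i; \beta, \sigma^2 \sim \mathcal N(\beta x_i, \sigma^2)$ above, not a formula taken from the video:

```latex
% Density of one observation under y_i = beta * x_i + eps_i, eps_i ~ N(0, sigma^2),
% treating x_i as fixed and (beta, sigma^2) as fixed parameters:
f(y_i \mid x_i; \beta, \sigma^2)
  = \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left( -\frac{(y_i - \beta x_i)^2}{2\sigma^2} \right)
```

Note that $x_i$ enters only through the mean $\beta x_i$; nothing here requires $x_i$ itself to be normally distributed.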

1 Answer

It doesn’t imply anything about the predictors (independent variables) or the response (dependent variable). It is a comment about the conditional distribution of $y$, conditioned on some specified value of $x$.

The idea is that you’re sliding a bell curve up and down the regression line:

[figure: normal density curves centered on the regression line at a series of $x$-values]

The regression line gives the expected value, but then you draw an observation from the conditional distribution of $y$ given that $x$-value. That’s where the error comes from.

Remember that this framework posits that the conditional distribution is $\mathcal N(\beta x_i, \sigma^2)$, which the fitted model estimates as $\mathcal N(\hat{y}_i, \hat{\sigma}^2)$.
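
A minimal simulation sketch of this point (the variable names, the uniform choice for $x$, and the specific constants are illustrative assumptions, not part of the answer): $x$ is drawn from a distinctly non-normal distribution, yet $y$ is normal conditionally on $x$, exactly as the sliding-bell-curve picture suggests.

```python
import numpy as np

rng = np.random.default_rng(0)
beta, sigma = 2.0, 1.0

# x is deliberately NOT Gaussian (uniform here); the normality assumption is only on the errors.
x = rng.uniform(-5.0, 5.0, size=100_000)
y = beta * x + rng.normal(0.0, sigma, size=x.size)  # eps_i ~ N(0, sigma^2)

# Conditional on x near a fixed value x0, y looks like N(beta * x0, sigma^2):
x0 = 3.0
near_x0 = np.abs(x - x0) < 0.05
print(y[near_x0].mean(), y[near_x0].std())  # approx. beta * x0 = 6.0 and sigma = 1.0

# Marginally, y inherits the non-normal spread of x:
print(y.std())  # approx. sqrt(beta**2 * (10**2 / 12) + sigma**2) ~ 5.86
```

Every conditional slice $y \mid x$ is exactly $\mathcal N(\beta x, \sigma^2)$, but the marginal distribution of $y$ is a uniform-plus-noise smear, not a bell curve.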

Dave
  • Sorry, I'm having a hard time visualizing "sliding a bell curve up and down the regression line." Is this bell curve parallel to the x axis or to the regression line? – user5965026 May 25 '20 at 05:52
  • https://blogs.sas.com/content/iml/files/2015/09/GLM_normal_identity.png I’m on my phone right now, but if you edit that into my post, I’ll accept the edit and you’ll get a rep point or two. Otherwise I’ll add it tomorrow. – Dave May 25 '20 at 05:53
  • Ah, I think I get it. So basically the idea is that assuming the errors are normally distributed with zero mean means that, on average, our $y_i$ will fall on the regression line. Is this regression line determined from population parameters or from sample parameters? Also, do you know why the video states $f(x_i|\beta)$ is normally distributed? I was really confused by that. – user5965026 May 25 '20 at 05:58
  • Wait, what is the mistake on the error term in my title? I wrote that it's Gaussian distributed with zero mean. Isn't that the correct assumption for MLE? – user5965026 May 25 '20 at 06:01
  • I meant to add the picture directly in the post, not just a link to it. I’ll address your other comments in the morning. – Dave May 25 '20 at 06:04
  • While that is certainly a helpful visualisation of what is happening, a more 'rigorous' answer (from a math/statistical point of view) can be found here: https://stats.stackexchange.com/questions/305908/likelihood-in-linear-regression – Fabian Werner May 25 '20 at 12:32
  • @user5965026 Let's address your two questions from last night. As other comments pointed out, I think Lambert made a mistake in the equation you quote. As far as what determines the regression line, I'll tell you the answer once you think about this next comment for a while, but you never get to know the population parameters. (Simulations are exceptions, but even then the machinery to fit the regression, the $\hat{\beta}=(X^TX)^{-1}X^Ty$ you've perhaps seen, does not get to know the simulation parameters you've specified.) So what determines the regression line, the population or the estimate? – Dave May 25 '20 at 17:24
  • Under the Gauss-Markov theorem, isn't $E[\hat{\beta}] = \beta$? $\beta$, the population parameter, isn't observed/known. – user5965026 May 25 '20 at 18:55
  • Gauss-Markov makes stronger claims than just unbiasedness, I’ll mention, but certainly it gives you an unbiased estimator of $\beta$. I do not follow your objection, however. Could you please clarify what you mean? – Dave May 25 '20 at 19:07
  • I think maybe I misunderstood your question. Were you asking me what determines $\beta$, the population parameter(s)? – user5965026 May 25 '20 at 19:20
  • I’m asking you what determines the regression line, now that you know we never get to know the population parameters. – Dave May 25 '20 at 19:23
  • Oh. The regression line is determined from the sample parameters. We approximate $\beta$ using $\hat{\beta}$, and one such (and most popular?) way to do so is to use OLS, giving us $\hat{y} = X\hat{\beta} = X(X^TX)^{-1}X^Ty$, i.e., the regression line? (See the numerical sketch after this thread.) – user5965026 May 25 '20 at 19:28
  • Exactly! The regression line is your predictions, which come from the fitted equation. I hope this answers your question. If not, please do move this to chat. – Dave May 25 '20 at 19:30
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/108462/discussion-between-user5965026-and-dave). – user5965026 May 25 '20 at 19:37
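
As a follow-up to the thread above, here is a numerical sketch of the point being made (the simulated data and constants are made up for illustration): the fitting machinery sees only the sample, and $\hat{\beta} = (X^TX)^{-1}X^Ty$ produces the regression line without ever being told the population $\beta$.

```python
import numpy as np

rng = np.random.default_rng(1)
beta_true = 2.0                 # population parameter: used to simulate, never to fit
x = rng.normal(size=500)
y = beta_true * x + rng.normal(scale=1.0, size=x.size)

X = x[:, None]                  # design matrix for y = beta * x (no intercept, as in the question)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # solves the normal equations (X^T X) beta = X^T y
y_hat = X @ beta_hat            # the regression line is the fitted values

print(beta_hat)                 # close to beta_true = 2.0, but computed from the sample alone
```

The estimate $\hat{\beta}$ is what determines the drawn line; the population $\beta$ only ever influences it indirectly, through the data.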