
I just want to clarify that in a GLM we assume that Y | X follows some distribution, not Y itself.

For example, in the classical simple linear model, we assume that Y | X is normally distributed, not that Y itself is normally distributed.

If that's the case, does that mean the marginal distribution of Y can be any distribution? Is it right that the marginal distribution of Y shouldn't affect anything, apart from leaving little data at some values?

Thanks!

confused
  • Short: **The assumption is on the distribution of $Y | X$**. Maybe a dup: https://stats.stackexchange.com/questions/374452/family-of-glm-represents-the-distribution-of-the-response-variable-or-residuals/374461#374461 – kjetil b halvorsen Dec 05 '19 at 15:23

1 Answer


Yes, all the distributional assumptions are about Y given X, also known as the residuals.
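A small simulation can make the distinction between the conditional and marginal distributions concrete. This is only a sketch under assumed values: the bimodal design for X, the coefficients 1 and 2, the unit error variance, and the helper names `simulate`, `ols_fit`, and `excess_kurtosis` are all illustrative choices, not anything from the question. It generates data where Y | X is normal, then compares the excess kurtosis (roughly 0 for normal data) of the marginal Y with that of the fitted residuals.

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis; approximately 0 for normally distributed data."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m4 = statistics.fmean([(v - mean) ** 4 for v in values])
    return m4 / m2 ** 2 - 3.0


def simulate(n=20000, seed=0):
    """Draw (x, y) with Y | X ~ Normal(1 + 2X, 1); all values are assumptions."""
    rng = random.Random(seed)
    # X is deliberately bimodal: clusters near 0 and near 10.
    x = [rng.choice([0.0, 10.0]) + rng.gauss(0, 0.5) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]
    return x, y


def ols_fit(x, y):
    """Closed-form least-squares intercept and slope for one predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


x, y = simulate()
intercept, slope = ols_fit(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

kurt_y = excess_kurtosis(y)              # marginal Y: far from normal
kurt_resid = excess_kurtosis(residuals)  # conditional part: close to normal

print(f"slope estimate:            {slope:.3f}")
print(f"excess kurtosis of Y:      {kurt_y:.3f}")
print(f"excess kurtosis of resid.: {kurt_resid:.3f}")
```

In this setup the marginal Y is strongly bimodal because X is, yet the slope is recovered and the residuals look normal, which is what the conditional assumption is about.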

IrishStat
  • Thanks for clarifying. And when they say $\varepsilon_i$ is normally distributed, that also makes the implicit assumption that there is constant variance, correct? If that implicit assumption were not made, then the pooled $\varepsilon_i$ might not be normally distributed when each $\varepsilon_i | X$ has a different variance, even if each $\varepsilon_i | X$ is normally distributed. – confused Dec 05 '19 at 15:09
  • The point that you are making is that the errors must have constant variance i.e. subgroups of the errors – IrishStat Dec 05 '19 at 15:16
  • The first half of your answer is correct but the second half is not. In a GLM the assumptions are *not* about the residuals: they are about the conditional responses. – whuber Dec 05 '19 at 15:28
  • If you don't have repeat measurements, isn't that the same as looking at the residuals classified, for example, by time sections? – IrishStat Dec 05 '19 at 16:02
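
The constant-variance point raised in the comments can be checked with a short sketch. The two error standard deviations (1 and 5), the equal group sizes, and the helper name `excess_kurtosis` are assumptions chosen for illustration. Each group of errors is normal on its own, but pooling groups with different variances gives a heavy-tailed mixture that is not normal.

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis; approximately 0 for normally distributed data."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m4 = statistics.fmean([(v - mean) ** 4 for v in values])
    return m4 / m2 ** 2 - 3.0


rng = random.Random(1)
n_per_group = 10000

# Assumed setup: errors are Normal(0, sd^2) within each group, with
# sd = 1 in the first group and sd = 5 in the second.
errors_low = [rng.gauss(0, 1.0) for _ in range(n_per_group)]
errors_high = [rng.gauss(0, 5.0) for _ in range(n_per_group)]
errors_pooled = errors_low + errors_high

kurt_low = excess_kurtosis(errors_low)        # close to 0
kurt_high = excess_kurtosis(errors_high)      # close to 0
kurt_pooled = excess_kurtosis(errors_pooled)  # clearly positive

print(f"group with sd = 1: {kurt_low:.3f}")
print(f"group with sd = 5: {kurt_high:.3f}")
print(f"pooled errors:     {kurt_pooled:.3f}")
```

Under these assumed values the pooled errors show clearly positive excess kurtosis even though each group is normal, so "the errors are normal" as a single pooled statement does lean on the variances being equal.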