
I've recently been studying linear regression. I've heard that the errors are supposed to follow a normal distribution because they represent noise. But I started wondering: what if the residuals of a linear regression follow some distribution other than the normal? My interpretation is that there is potential for improving the model's performance, because it implies there is some data that I haven't collected. That would mean the supposedly irreducible error can actually be reduced. But I'm not good at math, so I don't know whether this is right.

To restate: how should we interpret residuals of a linear regression that don't follow a normal distribution, and how can we make use of it if they follow some other distribution?

Steffen Moritz
Yoo Inhyeok
  • You should have a look at [this post](https://stats.stackexchange.com/questions/148803/how-does-linear-regression-use-the-normal-distribution/148812#148812) – kjetil b halvorsen Feb 09 '19 at 11:12
  • @kjetilbhalvorsen Basically, the normality assumption doesn't affect the estimation of the regression line or the performance of the regression. However, if the residuals don't follow a normal distribution, statistical inference such as calculating confidence intervals could be wrong, but I can use another approach such as bootstrapping. Am I right? – Yoo Inhyeok Feb 09 '19 at 13:21
  • Yes, basically that's right. You should still look out for leverage (very influential points), outliers and severely non-normal distributions. – kjetil b halvorsen Feb 09 '19 at 13:34
  • @kjetilbhalvorsen Thanks for the kind answers. However, they still don't quite address what I asked. I'm just wondering how we can interpret it if the residuals follow a specific distribution, and what that implies. – Yoo Inhyeok Feb 09 '19 at 13:39
  • Can you give some more details and context? What is your response variable? continuous? positive? otherwise? Do you suspect nonnormal distribution just because residuals do not look normal? or for some other reason? What is the goal of the analysis? ... – kjetil b halvorsen Feb 09 '19 at 13:41
  • If the data is from an exponential, or even sine wave, fitting a straight line will not give errors with a normal distribution. It is a good practice to visually inspect scatter plots of the data to determine if any obvious data transform or model function is needed. – James Phillips Feb 09 '19 at 14:27
  • @kjetilbhalvorsen There's no goal or purpose behind it. I'm just wondering. – Yoo Inhyeok Feb 12 '19 at 10:01
  • @JamesPhillips I have no data. I'm asking because I'm just curious. – Yoo Inhyeok Feb 12 '19 at 10:03
  • Well, but what it implies depends very much on context and goals! – kjetil b halvorsen Feb 12 '19 at 12:05
  • Other dups: https://stats.stackexchange.com/questions/148803/how-does-linear-regression-use-the-normal-distribution/148812#148812, https://stats.stackexchange.com/questions/100214/assumptions-of-linear-models-and-what-to-do-if-the-residuals-are-not-normally-di – kjetil b halvorsen Jun 19 '19 at 09:00
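
The bootstrap idea raised in the comments above can be sketched roughly as follows. This is a minimal illustration, not a recommended recipe: the data are synthetic (a straight line plus skewed, exponential noise centered at zero), and the helper names `ols_fit` and `bootstrap_slope_ci` are hypothetical. It uses a percentile bootstrap that resamples (x, y) pairs, which is only one of several bootstrap variants.

```python
import random
import statistics


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: slope is undefined
        slopes.append(ols_fit(xs, ys)[1])
    slopes.sort()
    low = slopes[int((alpha / 2) * len(slopes))]
    high = slopes[int((1 - alpha / 2) * len(slopes)) - 1]
    return low, high


# Synthetic data (an assumption for illustration): true slope 2,
# with skewed, non-normal noise shifted to have mean zero.
data_rng = random.Random(42)
x = [i / 10 for i in range(100)]
y = [1.0 + 2.0 * xi + (data_rng.expovariate(1.0) - 1.0) for xi in x]

intercept, slope = ols_fit(x, y)
ci_low, ci_high = bootstrap_slope_ci(x, y)
print(f"slope = {slope:.3f}, 95% bootstrap CI = ({ci_low:.3f}, {ci_high:.3f})")
```

The point estimate of the slope comes from ordinary least squares either way; only the interval is obtained without relying on normally distributed errors.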

1 Answer


It is true that standard linear models (of which regression is one type) assume normally distributed errors, at least for the usual inference such as t-tests and confidence intervals. But you can use generalized linear models to specify a distribution for the response other than the normal. There are a lot of questions here dealing with these models:

https://stats.stackexchange.com/questions/tagged/generalized-linear-model
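
As a rough illustration of what a generalized linear model does, here is a small sketch of a Poisson regression with a log link, fitted by Newton-Raphson in plain Python. The choice of the Poisson family, the synthetic count data, and the helper names `poisson_sample` and `fit_poisson_glm` are all assumptions made for this example; in practice one would normally use an established library rather than hand-written code.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def fit_poisson_glm(x, y, n_iter=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson on the Poisson likelihood."""
    # Start from the intercept-only fit.
    b0, b1 = math.log(max(sum(y) / len(y), 1e-8)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # 2x2 Fisher information matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Synthetic count data (an assumption for illustration):
# log of the mean is 0.5 + 0.8 * x.
rng = random.Random(1)
x = [2.0 * i / 299 for i in range(300)]
y = [poisson_sample(rng, math.exp(0.5 + 0.8 * xi)) for xi in x]

b0_hat, b1_hat = fit_poisson_glm(x, y)
print(f"intercept = {b0_hat:.3f}, slope = {b1_hat:.3f}")
```

Here the model states that the response, given the predictor, follows a Poisson distribution whose mean depends on the predictor through the log link, rather than assuming normal errors around a straight line.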

mkt