Suppose we want to fit a simple linear regression. Before doing so, we need to check the following assumptions (please correct me if I'm wrong):
- Linear relationship
- Normality of residuals
- Homoskedasticity of residuals
- No autocorrelation
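For concreteness, the kind of checks I have in mind can be sketched in plain Python as below. This is only a minimal illustration under my own assumptions: the helper names (`ols_fit`, `durbin_watson`, `jarque_bera`, `breusch_pagan_lm`) and the toy numbers are made up, and the Durbin-Watson, Jarque-Bera, and Breusch-Pagan (Koenker's studentized form) statistics are just common choices, not necessarily the tests one has to use. The linearity check is left out here.

```python
def ols_fit(x, y):
    """Fit y = a + b*x by least squares; return (a, b, residuals)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, resid


def durbin_watson(resid):
    """Durbin-Watson statistic; values near 2 suggest no lag-1 autocorrelation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den


def jarque_bera(resid):
    """Jarque-Bera statistic; roughly chi-square(2) if residuals are normal."""
    n = len(resid)
    m = sum(resid) / n
    m2 = sum((e - m) ** 2 for e in resid) / n
    m3 = sum((e - m) ** 3 for e in resid) / n
    m4 = sum((e - m) ** 4 for e in resid) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def breusch_pagan_lm(x, resid):
    """n * R^2 from regressing squared residuals on x; roughly chi-square(1)
    under homoskedasticity (Koenker's studentized form)."""
    e2 = [e * e for e in resid]
    _, _, aux_resid = ols_fit(x, e2)
    mean_e2 = sum(e2) / len(e2)
    sst = sum((v - mean_e2) ** 2 for v in e2)
    sse = sum(r * r for r in aux_resid)
    return len(e2) * (1.0 - sse / sst)


# Toy data, made up purely for illustration (not one of my data sets).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [5.4, 7.7, 11.5, 13.6, 17.4, 19.8, 23.3, 25.6, 29.5, 31.7]

a, b, resid = ols_fit(x, y)
print("intercept, slope:", a, b)
print("Durbin-Watson:", durbin_watson(resid))
print("Jarque-Bera:", jarque_bera(resid))
print("Breusch-Pagan LM:", breusch_pagan_lm(x, resid))
```

All three statistics are computed from the residuals of the same fit, which is why changing the model to fix one problem changes the other results as well.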
My question is: when the data violate several or even all of these assumptions, which one should be handled first?
I have several data sets, each with one independent variable (X) and one dependent variable (Y).
- My first data set violates the normality and no-autocorrelation assumptions. When I handled the autocorrelation using the Cochrane-Orcutt transformation, the residuals became normally distributed. In this case, did I only need to handle the autocorrelation issue?
- My second data set violates the no-autocorrelation assumption. I handled it with the same method, but then the residuals became non-normally distributed. What should I do?
- My third data set violates the linearity and normality assumptions. I tried to handle the normality issue first by transforming the original data to sqrt(X) and sqrt(Y) (I'm not sure this is the right thing to do), then ran the linearity test again, and the result indicated that the relationship was linear.
- My fourth data set violates the normality, homoskedasticity, and no-autocorrelation assumptions. Since a transformation applied to fix one issue will affect the results of the other assumption tests, which issue should be handled first to reach the right conclusion?
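To make clear what I mean by the Cochrane-Orcutt transformation, here is a minimal sketch of the procedure as I understand it, in plain Python. The function names, the stopping rule (`tol`, `max_iter`), and the toy data are my own assumptions for illustration, not the exact routine of any particular software. It assumes AR(1) errors: estimate rho from the residuals, quasi-difference X and Y, refit, and repeat until rho stops changing.

```python
def ols_fit(x, y):
    """Fit y = a + b*x by least squares; return (a, b, residuals)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, resid


def estimate_rho(resid):
    """Slope of the no-intercept regression of e_t on e_(t-1)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
    return num / den if den != 0 else 0.0  # guard against a perfect fit


def cochrane_orcutt(x, y, tol=1e-6, max_iter=50):
    """Iterative Cochrane-Orcutt estimate of (a, b, rho) under AR(1) errors."""
    a, b, resid = ols_fit(x, y)
    rho = 0.0
    for _ in range(max_iter):
        new_rho = estimate_rho(resid)
        # Quasi-difference both variables; the first observation is dropped.
        xs = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        a_star, b, _ = ols_fit(xs, ys)
        # The transformed intercept equals a * (1 - rho); undo that scaling.
        a = a_star / (1.0 - new_rho)
        # Recompute residuals on the original scale for the next rho update.
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return a, b, rho


# Toy series, made up for illustration: a linear trend plus errors that
# keep the same sign for three steps at a time (positively autocorrelated).
x = [float(t) for t in range(1, 25)]
u = [1.0 if ((t - 1) // 3) % 2 == 0 else -1.0 for t in range(1, 25)]
y = [2.0 + 3.0 * xi + ui for xi, ui in zip(x, u)]

a, b, rho = cochrane_orcutt(x, y)
print("intercept, slope, rho:", a, b, rho)
```

The normality and homoskedasticity tests would then be rerun on the residuals of the transformed regression, which is why the order of the fixes matters to me.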