
As far as I know, first I have to standardize the variables. Then I have to check whether they are normally distributed, and then whether there is multicollinearity. Then I run the regression and check whether the residuals are randomly distributed or not.

Is there anything I am missing or have wrong?

Peter Flom
ilhan
  • For more information, [search our site](http://stats.stackexchange.com/search?tab=votes&q=regression%20diagnostics): there are hundreds of answers addressing variations of this question. – whuber May 07 '13 at 21:29
  • @whuber, There were no helpful results. – ilhan May 07 '13 at 21:46
  • Right: you checked all 133 results in the last 17 minutes. What's the matter with the duplicate? What else are you looking for? – whuber May 07 '13 at 21:48
  • @whuber, no. I have checked only the 5–10 most relevant results; they were not closely related to my question, and the lower-ranked ones tend to be unrelated entirely. – ilhan May 07 '13 at 21:53
  • I went through them all (quickly) and found these may be worth looking at: http://stats.stackexchange.com/questions/28688, http://stats.stackexchange.com/questions/51046, http://stats.stackexchange.com/questions/57549, http://stats.stackexchange.com/questions/17673, http://stats.stackexchange.com/questions/41194, http://stats.stackexchange.com/questions/28688, and http://stats.stackexchange.com/questions/32600 (the dup). Plenty of others are relevant but need some imagination to apply; for instance, many of the questions on logistic regression contain useful information. – whuber May 08 '13 at 13:27

2 Answers


You do not have to standardize the variables; you do not have to check them for normality. You should check for collinearity. The residuals should be normally distributed and not related to the independent variables.
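To make the residual checks concrete, here is a minimal sketch with NumPy on simulated data (the data-generating model and all variable names are hypothetical, chosen only for illustration): fit by ordinary least squares, then inspect the residuals' skewness, excess kurtosis, and correlation with the predictor.

```python
import numpy as np

# Hypothetical simulated data: true model is linear with normal errors.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)

# Ordinary least squares via lstsq on a design matrix with an intercept.
A = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

# Crude normality diagnostics: skewness and excess kurtosis of the
# residuals should both be near 0 if the errors are roughly normal.
skew = np.mean(resid**3) / np.std(resid) ** 3
kurt = np.mean(resid**4) / np.std(resid) ** 4 - 3

# OLS residuals are orthogonal to the regressors by construction, so this
# correlation is essentially zero; plotting resid against x (or against
# fitted values) is the usual way to look for remaining structure.
corr = np.corrcoef(resid, x)[0, 1]
print(round(skew, 2), round(kurt, 2), round(corr, 6))
```

In practice you would look at a Q-Q plot and a residuals-vs-fitted plot rather than rely on summary numbers alone.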

Beyond that, there is a lot to do. There is the whole issue of model selection, for one, and you need to check for outliers. There is more, too.

Peter Flom

There's no need to make assumptions about the distribution of the predictors. If a predictor is heavily skewed towards higher values, though, you may want to transform it.
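As an illustration of the transformation point, a log transform of a simulated right-skewed predictor (hypothetical data, not from the question) pulls its sample skewness close to zero:

```python
import numpy as np

rng = np.random.default_rng(2)
# A heavily right-skewed predictor: lognormal draws.
x = rng.lognormal(mean=0.0, sigma=1.0, size=1000)

def skewness(v):
    """Sample skewness: third standardized moment."""
    v = v - v.mean()
    return np.mean(v**3) / np.std(v) ** 3

# log() undoes the lognormal shape, so the transformed values are
# approximately normal and the skewness drops toward 0.
x_log = np.log(x)
print(round(skewness(x), 2), round(skewness(x_log), 2))
```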

Watch out for multicollinearity (hint: variance inflation factors, VIF). Check that your residuals are approximately normally distributed; if they're not, transformations of the predictors (or the response) can be worth trying.
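A sketch of how VIFs can be computed by hand with NumPy (the `vif` helper and the simulated predictors are illustrative, not a standard API): each predictor is regressed on all the others, and VIF_j = 1 / (1 - R_j²).

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (predictors only,
    no intercept column). VIF_j = 1 / (1 - R_j^2), where R_j^2 comes
    from regressing predictor j on the remaining predictors."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # add an intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Hypothetical example: x3 is nearly a copy of x1, so both should show
# large VIFs, while the independent x2 stays near the minimum of 1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
print(vif(X))
```

A common rule of thumb treats VIF above 5 or 10 as a sign of troublesome collinearity, though (as noted in the comments) condition indexes are arguably a better diagnostic.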

EDIT: Removed first line of post (wrong info). Check the later answers for information regarding standardization of the predictors.

Eric Paulsson
  • What software standardizes the predictors for you? `R` does not, `SAS` does not and I am pretty sure `SPSS` does not. At least, not by default. Nor is it necessary to do so (although some people think it is a good idea). Also, condition indexes are a better method for collinearity than VIFs are. – Peter Flom May 07 '13 at 21:27
  • That is debatable. Some agree with you (including some very prominent people) but I don't. I think changes in the original units are usually easier to understand. But it's not an *assumption* of regression, in any case. – Peter Flom May 07 '13 at 21:41
  • As I remember it, it's a good idea to standardize the predictors in order to understand what they're doing for the response; the interpretation of the parameter estimates will be clearer. I thought they all standardized them by default; if that's wrong, I'll edit my post. – Eric Paulsson May 07 '13 at 21:42
  • It's no assumption at all, but he asked about it, and I believe it can be a good idea to standardize (especially if you're new to regression). In the end, it comes down to the nature of whatever you're modeling. – Eric Paulsson May 07 '13 at 21:52