
Suppose I fit a multiple regression in Stata with the `robust` option, that is, a regression with robust standard errors. I want to analyse and discuss the residuals.

  1. Residuals versus fitted values

Is it sufficient to simply observe a random and homogeneous distribution of the residuals around 0?

  2. Kernel density estimate of the residuals (do they follow a normal distribution?)

As I didn't use an OLS regression, I don't care if the residuals are not normal. True?

  3. Is there anything I forgot?
kjetil b halvorsen
firepod
  • What does multivariate mean here? A multivariate response, or just a multiple regression? What does "robust" mean here? Regression with robust (sandwich-Huber-White-Eicker-GameOfThrones) standard errors or some unspecified flavour of robust regression? Is `robust` the name of some program, package, function or command and if so in which language, as not everybody uses whatever you do? – Nick Cox Jun 07 '16 at 16:38
  • (I made up GameOfThrones. Alternatively, the meme starts here.) – Nick Cox Jun 07 '16 at 16:39
  • Notice that you tag "robust-standard-error" but are explicit that you are not using OLS. – Nick Cox Jun 07 '16 at 16:41
  • Hello, thank you for your answer. Multivariate = multiple regression, my bad. robust = Regression with robust standard error, as specified in the title, also the name of the option in Stata. I will make the sentence more clear – firepod Jun 07 '16 at 18:18
  • Regression with robust standard errors gives you exactly the same coefficients as without. So, it makes no difference to what is expected of residuals in broad terms. `robust` is **not** the name of a Stata command; it's an option. – Nick Cox Jun 07 '16 at 18:29
  • Thank you for your answer. However, I am not sure how your comment answers my points 1, 2 and 3. – firepod Jun 07 '16 at 18:47
  • Although the comment by @Nick does not answer your questions, it shows that existing threads do, such as http://stats.stackexchange.com/questions/32600 . – whuber Jun 07 '16 at 19:48
  • #1 Nothing is sufficient, but homogeneous scatter of residuals is in itself a good sign. What do you understand by "random" here? #2 You should care about normal distributions of residuals as much as you would without robust SEs: it's an ideal condition but not essential. #3 Everything else! Whether a model tracks systematic behaviour is about the top of any list. – Nick Cox Jun 07 '16 at 20:09
  • Hello, thank you for your answer, whuber. I already checked this thread, but it does not answer the question: what to do with residuals in the specific case of a regression with robust standard errors. – firepod Jun 07 '16 at 20:10
  • Thank you, Nick Cox. What I mean by random is that the residuals vs fitted plot does not show a trend in the residuals; they don't seem to be aligned on a "curve". For #2, normal distribution of the residuals is not an assumption of regression with robust standard errors. This regression just assumes that the errors are heteroskedastic (the variance of the conditional distribution of the errors given x depends on x). In fact, I am not even sure why we check the normal distribution of the residuals. – firepod Jun 07 '16 at 20:23
  • I disagree. Getting more honest SEs is one thing. It's not an absolute protection against improperly specified models. Put it this way: do you want to insist that you are indifferent to the distribution of the residuals and that the distribution of the residuals conveys absolutely no information about how good the fit is or how appropriate the specification is? Robust SEs $\neq$ robust regression. – Nick Cox Jun 08 '16 at 05:23
  • Not exactly indifferent, but this thread summarises my point: http://stats.stackexchange.com/questions/29731/regression-when-the-ols-residuals-are-not-normally-distributed – firepod Jun 08 '16 at 06:21
  • In Stata terms, `regress` and `regress, robust` give the same coefficient estimates and thus the same residuals. If the coefficients defined a poor summary of the systematic structure in the first case there is thus no medicine that fixes the problem in the second case. **If the mean structure $Y = Xb$ is about right in both cases**, then more honest SEs are a gain, but that is the really crucial condition. – Nick Cox Jun 08 '16 at 06:32
  • It's a pity that the question title still starts "robust regression" because two distinct ideas are still being conflated. (It's a bigger pity that economists hijacked the term robust and gave it a different flavour, thus introducing confusion in the literature.) – Nick Cox Jun 08 '16 at 06:34
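
The point made in the comments, that robust standard errors leave the coefficients and hence the residuals unchanged, can be illustrated outside Stata. What follows is only a minimal sketch in plain Python, not Stata's implementation: it uses a single predictor for brevity, simulated heteroskedastic data, and hypothetical helper names (`fit_line`, `slope_se_classical`, `slope_se_robust`). The robust formula is the usual sandwich estimator with an n/(n-2) small-sample factor, which is an assumption about the flavour of correction rather than something stated in the thread.

```python
import math
import random


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    The fit does not depend on which standard errors are requested later.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


def slope_se_classical(x, residuals):
    """Conventional SE of the slope, assuming constant error variance."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    s2 = sum(e * e for e in residuals) / (n - 2)
    return math.sqrt(s2 / sxx)


def slope_se_robust(x, residuals):
    """Sandwich (heteroskedasticity-robust) SE of the slope.

    Uses an n / (n - 2) small-sample factor; that choice is an assumption.
    """
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    meat = sum(((xi - x_mean) ** 2) * e * e for xi, e in zip(x, residuals))
    return math.sqrt(n / (n - 2) * meat / sxx ** 2)


# Simulated data whose error spread grows with x (heteroskedastic).
random.seed(1)
x = [random.uniform(1, 10) for _ in range(200)]
y = [1.0 + 2.0 * xi + random.gauss(0, 0.5 * xi) for xi in x]

# One fit, one set of residuals; only the SE calculation differs.
intercept, slope, residuals = fit_line(x, y)
se_classical = slope_se_classical(x, residuals)
se_robust = slope_se_robust(x, residuals)

print("slope:", slope)
print("classical SE:", se_classical)
print("robust SE:", se_robust)
```

Both standard errors are computed from the same residuals, so any residual plot (residuals versus fitted, kernel density) looks identical whichever option is chosen. If the mean structure is wrong, the robust calculation does not repair it.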
