Does this graph imply a violation of homoscedasticity?

Question

I assume that this graph doesn't support the assumption of homoscedasticity. Am I right? Does it make sense to carry out another test to be sure?

y-axis: Regression Standardized Residual, x-axis: Regression Standardized Predicted Value, dependent variable: Score of a questionnaire measuring rage attacks, n=156

Welcome to the site. For those of us who don't speak German (I think it's German) could you translate the axis label? I think I understand most of it, but not "geschatzer Wert" and "think I understand" is not really good. — Peter Flom, Dec 16 '19 at 13:56
Sure, it's in my text: y-axis: Regression Standardized Residual, x-axis: Regression Standardized Predicted Value. — AppleSeed, Dec 16 '19 at 14:05
You should have a look at [this](https://en.wikipedia.org/wiki/Breusch%E2%80%93Pagan_test) — MGP, Dec 16 '19 at 14:13
Does this answer your question? [Chart indicates homoscedasticity but Breusch-Pagan test p<.001] — Nick Cox, Dec 16 '19 at 14:27
There is presumably a constraint on the response, say no score is lower than zero. Hence, residuals must plot above some line that in spirit is residual = minimum observed $-$ fitted. (Standardization affects details only.) This alone inhibits or prohibits homoscedasticity. If there is an upper bound only, all the more reason to consider a generalized linear model with appropriate link rather than plain regression. For more, see the thread suggested just above as duplicate. — Nick Cox, Dec 16 '19 at 14:32
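The bound described in the comment above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated (not the questionnaire data), the floor `LOWER_BOUND` is taken to be zero, and a single predictor stands in for the full regression. It only shows that, when the response cannot fall below a floor, every raw residual sits on or above the line residual = floor $-$ fitted.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of a least-squares line for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


random.seed(0)
LOWER_BOUND = 0.0  # assumed floor of the score (hypothetical)

# Simulated predictor and a response that is cut off at the floor.
x = [random.uniform(0, 10) for _ in range(156)]
y = [max(LOWER_BOUND, 0.5 * xi - 1.0 + random.gauss(0, 2)) for xi in x]

intercept, slope = ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Vertical distance of each residual above the line residual = LOWER_BOUND - fitted.
gaps = [r - (LOWER_BOUND - f) for r, f in zip(residuals, fitted)]
on_floor = sum(1 for yi in y if yi == LOWER_BOUND)

print("smallest gap above the bounding line:", min(gaps))
print("observations sitting exactly on the floor:", on_floor)
```

Observations at the floor land on the bounding line, which produces the sharp diagonal lower edge often seen in residual plots for bounded scores.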
Thanks for your help! The Breusch-Pagan test shows a significance of 0.187, the modified Breusch-Pagan test p = 0.242, and the White test p = 0.587. So, can I assume homoscedasticity? But the problem that Nick Cox wrote about remains... so is it still advisable to run a generalized linear model instead of a multiple linear regression? — AppleSeed, Dec 16 '19 at 15:08
I don't think this is a duplicate because in the question marked as a potential duplicate the N was much larger and the Breusch Pagan test was statistically significant. — Peter Flom, Dec 17 '19 at 11:34

score 0 · Answer 1 · edited Dec 16 '19 at 14:33

Two essential assumptions of linear regression are that the errors have zero mean and that their variance is constant, i.e. it does not depend on the observation. Mathematically, the error vector $\varepsilon \in \mathbb{R}^{n}$ satisfies:

$$ \varepsilon \sim \mathcal{N} (0, \sigma^{2} I_{n}). $$

In your case, it seems that the estimated variance increases with the predicted value. There are two possible explanations: heteroscedasticity, or perhaps a sampling problem.
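One way to check this beyond the plot is a Breusch-Pagan-type test, as mentioned in the comments. The following is a minimal sketch, not the procedure any particular package uses: it assumes a single predictor, uses the studentized (Koenker) form of the statistic, $LM = n R^2$ from regressing the squared residuals on the predictor, and compares it with a chi-square distribution with 1 degree of freedom. The function names and the simulated data are hypothetical.

```python
import math
import random


def ols_fit(x, y):
    """Return (intercept, slope) of a least-squares line for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """Coefficient of determination of the least-squares line of y on x."""
    intercept, slope = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_koenker(x, y):
    """Studentized Breusch-Pagan test for one predictor: (LM, p-value)."""
    intercept, slope = ols_fit(x, y)
    sq_resid = [(yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression of squared residuals on x; guard against tiny negatives.
    lm = max(len(x) * r_squared(x, sq_resid), 0.0)
    # Upper tail of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


random.seed(1)
x = [random.uniform(1, 10) for _ in range(156)]
# Simulated responses: constant error spread vs. spread growing with x.
y_const = [2 + 0.5 * xi + random.gauss(0, 1) for xi in x]
y_fan = [2 + 0.5 * xi + random.gauss(0, 0.3 * xi) for xi in x]

lm_const, p_const = breusch_pagan_koenker(x, y_const)
lm_fan, p_fan = breusch_pagan_koenker(x, y_fan)
print("constant variance: LM = %.3f, p = %.4f" % (lm_const, p_const))
print("fan-shaped errors: LM = %.3f, p = %.4g" % (lm_fan, p_fan))
```

A large p-value means only that the test did not detect variance changing linearly with the predictor; it does not rule out other patterns, such as the edge created by a bounded response.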

I hope it helps.
