Questions tagged [heteroscedasticity]

Non-constant variance along some continuum in a random process.

Heteroscedasticity is the property of a random process whose variance is non-constant along some continuum. It most commonly arises in regression, where the error variance varies (typically increases) as a function of one or more predictors, but the term also applies to a time series whose variance changes over time. The Greek skedasis means "dispersion".

[Figures: random data showing heteroscedasticity (by Q9); heteroscedastic vs. homoscedastic residuals (by Protonk).]

Heteroscedasticity may be intrinsically interesting, as in this example from Wikipedia:

A classic example of heteroscedasticity is that of income versus expenditure on meals...A poorer person will spend a rather constant amount by always eating inexpensive food; a wealthier person may occasionally buy inexpensive food and at other times eat expensive meals. Those with higher incomes display a greater variability of food consumption. [Emphasis added.]

Heteroscedasticity may complicate predictive/explanatory modeling, as in this second example:

Imagine you are watching a rocket take off nearby and measuring the distance it has traveled once each second. In the first couple of seconds your measurements may be accurate to the nearest centimeter, say. However, 5 minutes later as the rocket recedes into space, the accuracy of your measurements may only be good to 100 m, because of the increased distance, atmospheric distortion and a variety of other factors. The data you collect would exhibit heteroscedasticity. [Emphasis added.]
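
The widening measurement error described in the rocket example can be mimicked with simulated data. The following is a minimal Python sketch, not taken from any of the questions on this page: the function names, the linear trend, and the noise model (standard deviation proportional to x) are all illustrative assumptions. It fits an ordinary least squares line and compares the residual spread in the lower and upper halves of x.

```python
import random
import statistics


def simulate_heteroscedastic(n=500, seed=0):
    """Simulate y = 2 + 3x + noise, where the noise SD grows with x (assumed model)."""
    rng = random.Random(seed)
    xs = [1.0 + 9.0 * i / (n - 1) for i in range(n)]  # evenly spaced on [1, 10]
    ys = [2.0 + 3.0 * x + rng.gauss(0.0, 0.5 * x) for x in xs]
    return xs, ys


def ols_fit(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_sd_by_half(xs, ys):
    """Residual standard deviation for the lower and upper halves of x.

    Assumes xs is already sorted in increasing order.
    """
    intercept, slope = ols_fit(xs, ys)
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    half = len(xs) // 2
    return statistics.stdev(resid[:half]), statistics.stdev(resid[half:])


xs, ys = simulate_heteroscedastic()
low_sd, high_sd = residual_sd_by_half(xs, ys)
print(f"residual SD, lower half of x: {low_sd:.2f}")
print(f"residual SD, upper half of x: {high_sd:.2f}")
```

Under these assumptions the second number should come out clearly larger than the first, which corresponds to the fan shape seen in a residuals-versus-fitted plot.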

1054 questions
59 votes, 2 answers

What does having "constant variance" in a linear regression model mean?

What does having "constant variance" in the error term mean? As I see it, we have data with one dependent variable and one independent variable. Constant variance is one of the assumptions of linear regression. I am wondering what homoscedasticity…
Mukul
51 votes, 7 answers

When conducting a t-test why would one prefer to assume (or test for) equal variances rather than always use a Welch approximation of the df?

It seems that when the assumption of homogeneity of variance is met, the results from a Welch-adjusted t-test and a standard t-test are approximately the same. Why not simply always use the Welch-adjusted t?
russellpierce
48 votes, 1 answer

Alternatives to one-way ANOVA for heteroskedastic data

I have data from 3 groups of algae biomass ($A$, $B$, $C$) which contain unequal sample sizes ($n_A=15$, $n_B=13$, $n_C=12$) and I would like to compare whether these groups are from the same population. One-way ANOVA would definitely be the way to go,…
Rick L.
36 votes, 5 answers

Why are there two spellings of "heteroskedastic" or "heteroscedastic"?

I frequently see both the spellings "heteroskedastic" and "heteroscedastic", and similarly for "homoscedastic" and "homoskedastic". There seems to be no difference in meaning between the "c" and the "k" variants, simply an orthographic difference…
Silverfish
35 votes, 5 answers

What are the dangers of violating the homoscedasticity assumption for linear regression?

As an example, consider the ChickWeight data set in R. The variance obviously grows over time, so if I use a simple linear regression like: m <- lm(weight ~ Time*Diet, data=ChickWeight) My questions: Which aspects of the model will be…
Dan M.
29 votes, 2 answers

How do you find weights for weighted least squares regression?

I am a bit lost in the process of WLS regression. I have been given a dataset and my task is to test whether there is heteroscedasticity, and if so I should run WLS regression. I have carried out the test and found evidence for heteroscedasticity, so I…
m3div0
27 votes, 4 answers

Best way to deal with heteroscedasticity?

I have a plot of the residuals of a linear model as a function of the fitted values, where the heteroscedasticity is very clear. However, I'm not sure how I should proceed now because as far as I understand this heteroscedasticity makes my linear…
TristanDM
25 votes, 1 answer

Sandwich estimator intuition

Wikipedia and the R sandwich package vignette give good information about the assumptions supporting OLS coefficient standard errors and the mathematical background of the sandwich estimators. I'm still not clear how the problem of residuals…
25 votes, 6 answers

Always Report Robust (White) Standard Errors?

It has been suggested by Angrist and Pischke that robust (i.e. robust to heteroskedasticity or unequal variances) standard errors be reported as a matter of course rather than testing for it. Two questions: What is the impact on the standard errors of…
25 votes, 1 answer

Why Levene test of equality of variances rather than F ratio?

SPSS uses the Levene test to evaluate homogeneity of variances in the independent group t-test procedure. Why is the Levene test better than a simple F ratio of the variances of the two groups?
Joel W.
24 votes, 3 answers

Regression modelling with unequal variance

I would like to fit a linear model (lm) where the residual variance clearly depends on the explanatory variable. The way I know to do this is by using glm with the Gamma family to model the variance, and then put its inverse into the weights…
Tal Galili
23 votes, 4 answers

Practically speaking, how do people handle ANOVA when the data doesn't quite meet assumptions?

This isn't a strictly stats question--I can read all the textbooks about ANOVA assumptions--I'm trying to figure out how actual working analysts handle data that doesn't quite meet the assumptions. I've gone through a lot of questions on this site…
Jas Max
22 votes, 2 answers

Transforming proportion data: when arcsin square root is not enough

Is there a (stronger?) alternative to the arcsin square root transformation for percentage/proportion data? In the data set I'm working on at the moment, marked heteroscedasticity remains after I apply this transformation, i.e. the plot of…
18 votes, 2 answers

How do I interpret this fitted vs residuals plot?

I don't really understand heteroscedasticity. I would like to know whether my model is appropriate or not according to this plot.
kanbhold
18 votes, 2 answers

How to run two-way ANOVA on data with neither normality nor equality of variance in R?

I am working on my master's thesis at the moment and planned on running the statistics with SigmaPlot. However, after spending some time with my data I came to the conclusion that SigmaPlot might not be fit for my problem (I may be mistaken) so I…
Sabine