I am familiar with Levene's test (Wikipedia link) for equality of variances, which is robust to non-normality, but assumes observations are independent. Today I was wondering what to do when you want to compare variances between two samples, but observations are not independent. For instance, in panel data, an individual is observed through time, and may be grouped with other individuals in a cluster, making it very likely that observations are not independent.

The heart of Levene's test is the spread measure $z_{ij} = |x_{ij} - \mu_{i}|$, the absolute deviation of observation $j$ in group $i$ from the group mean $\mu_{i}$. In practice $\mu_{i}$ is unknown, so an estimate $\hat{\mu}_i$ is used instead. The authors of this interesting review of Levene's test point out that instead of the sample mean (or median), one can use a more robust estimate. This got me thinking: if you could estimate $\mu_i$ in a way that controls for the issues that arise in panel data (e.g. observations correlated across time and across individuals/cross-sections), could you then substitute that estimate into Levene's test and proceed as usual? Is this in fact true?
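To make the role of $\hat{\mu}_i$ concrete, here is a minimal sketch of the statistic in plain Python. The function name `levene_w` and the `center` argument are my own illustrative choices, not taken from any library; `center` is simply whatever location estimate gets plugged in for $\hat{\mu}_i$. Note that this computes only the statistic $W$, a one-way ANOVA F ratio on the $z_{ij}$, and says nothing about its sampling distribution when observations are dependent.

```python
from statistics import mean, median


def levene_w(groups, center=median):
    """Levene-type statistic W for a list of groups (each a list of numbers).

    `center` is the plug-in location estimate for each group
    (hypothetical parameter name; mean, median, or any robust estimate).
    """
    # Spread measure z_ij = |x_ij - center_i|
    z = []
    for g in groups:
        c = center(g)
        z.append([abs(x - c) for x in g])

    k = len(z)                                   # number of groups
    n_total = sum(len(zi) for zi in z)           # total sample size N
    z_bar_i = [mean(zi) for zi in z]             # group means of z
    z_bar = sum(sum(zi) for zi in z) / n_total   # grand mean of z

    # One-way ANOVA sums of squares computed on the z_ij
    between = sum(len(zi) * (m - z_bar) ** 2 for zi, m in zip(z, z_bar_i))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_bar_i) for v in zi)

    return (n_total - k) / (k - 1) * between / within


# The second group is the first one doubled, so its spread is twice as large.
w = levene_w([[1, 2, 3], [2, 4, 6]], center=mean)
```

Swapping `center=mean` for `center=median` (or another robust estimate) changes only the $z_{ij}$; the rest of the computation is untouched.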

Then I read Ben Bolker's response to a somewhat similar question. If I understand him correctly, comparisons of variances amount to comparisons of conditional distributions, since problems like heteroskedasticity and autocorrelation are captured in the error term of a regression. He thus advises comparing the residuals of a regression. Is this the right path?

Or, is Levene's test robust to any issues brought up by panel data?

invictus
  • How you estimate $\mu_i$ won't help cure the problem of correlated data. Thus, it helps to separate two issues conceptually and practically: (1) developing a useful test statistic and (2) determining its sampling distribution. The usual distribution (assumed for independent data or nearly-independent residuals) won't apply to panel data. – whuber Feb 07 '18 at 22:16
  • @whuber I don't understand how better estimates of $\mu_i$ are irrelevant. See "Robust tests for the equality of variances for clustered data" (http://www.tandfonline.com/doi/abs/10.1080/00949650802641841). They propose replacing the ANOVA step in Levene's test with a regression that controls for correlation within clusters. – invictus Feb 07 '18 at 22:29
  • I didn't write "irrelevant." It could make some difference, probably tiny in most applications. But that is a minor issue: just about any robust estimate of the central location will work fine. The real issue is coping with the correlated residuals. – whuber Feb 07 '18 at 22:43
  • @whuber Thanks for clarifying. What if you ran Levene's test on the residuals? – invictus Feb 07 '18 at 22:57
  • I like Levene's test, and it works fine with residuals, but I wouldn't be able to use the usual approximations to the sampling distribution of the statistic, due to the correlation. I would therefore bootstrap it, paying attention to resampling in a way that reproduces the correlation. – whuber Feb 07 '18 at 22:59
  • @whuber I see. Can you give a description of such a bootstrap? I am a bit confused. Do you mean bootstrap the test statistic W? – invictus Feb 07 '18 at 23:02
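One possible reading of the bootstrap suggested in the comments, written as a sketch rather than a definitive procedure: resample whole clusters instead of single observations, so that the within-cluster correlation is carried into every resample. Everything here is an assumption for illustration: the names `levene_w`, `flatten`, and `cluster_bootstrap_levene`, the data layout (each group is a list of clusters, each cluster a list of residuals), and the null scheme, which centers each group and pools the clusters. Pooling treats centered clusters as exchangeable between groups under the null, which is stronger than equal variances alone, and clusters are assumed independent of one another.

```python
import random
from statistics import mean, median


def levene_w(groups, center=median):
    """Levene-type statistic W (ANOVA F ratio on absolute deviations)."""
    z = []
    for g in groups:
        c = center(g)
        z.append([abs(x - c) for x in g])
    k = len(z)
    n_total = sum(len(zi) for zi in z)
    z_bar_i = [mean(zi) for zi in z]
    z_bar = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (m - z_bar) ** 2 for zi, m in zip(z, z_bar_i))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_bar_i) for v in zi)
    return (n_total - k) / (k - 1) * between / within


def flatten(clusters):
    """Concatenate a list of clusters into one flat list of values."""
    return [x for c in clusters for x in c]


def cluster_bootstrap_levene(group_a, group_b, n_boot=2000, seed=0):
    """Return (W_obs, bootstrap p-value) for two groups of clusters.

    Assumed layout: group_a and group_b are lists of clusters,
    and each cluster is a list of residuals from one individual/cluster.
    """
    rng = random.Random(seed)
    w_obs = levene_w([flatten(group_a), flatten(group_b)])

    # Impose the null: remove each group's location, then pool the clusters.
    pooled = []
    for g in (group_a, group_b):
        m = median(flatten(g))
        pooled.extend([[x - m for x in c] for c in g])

    exceed = 0
    for _ in range(n_boot):
        # Draw whole clusters with replacement, keeping the original
        # number of clusters in each pseudo-group.
        a_star = [rng.choice(pooled) for _ in group_a]
        b_star = [rng.choice(pooled) for _ in group_b]
        w_star = levene_w([flatten(a_star), flatten(b_star)])
        exceed += w_star >= w_obs

    # Add-one correction keeps the p-value strictly positive.
    return w_obs, (exceed + 1) / (n_boot + 1)
```

Calling `cluster_bootstrap_levene(resid_a, resid_b)` on two lists of per-cluster residuals returns the observed $W$ and a p-value taken from the resampled $W^*$ values, so yes, under this reading it is the test statistic $W$ that gets bootstrapped. If serial correlation within a cluster matters beyond what whole-cluster resampling preserves, a different scheme (for example, a block bootstrap within clusters) might be needed.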

0 Answers