
I have 5 samples of different sizes (5000-15000 observations each) from 5 (presumably different) distributions. I need to perform something like ANOVA to test the hypothesis that these distributions have the same median and interquartile range. If I were interested only in the medians, I would use the Kruskal-Wallis test, but I also need to compare the interquartile ranges.

Is there a way to perform such a test, maybe with permutations?

P.S. I assume that the samples come from different distributions, but that these distributions have the same IQR and median, and that the differences start from higher-order statistics.

zlon
  • With such huge samples, you'll almost certainly reject using any reasonable test -- your ability to identify trivially small differences will be substantial. – Glen_b Aug 23 '19 at 07:42
  • Great point, @Glen_b! That is why it will be important to supplement the test results with confidence intervals for the differences in true medians (or IQRs). – Isabella Ghement Aug 23 '19 at 15:42
  • Indeed, but then why test at all? Doesn't an interval for the difference in medians (or whatever other statistic) convey the necessary information about the size of the difference? – Glen_b Aug 24 '19 at 02:22

1 Answer


Quantile regression allows you to model and test any quantile, including the three quartiles. Here is an example in R:

> library(quantreg)
> summary(rq(mpg~cyl+disp+hp,c(0.25,0.5,0.75),data=mtcars))

Call: rq(formula = mpg ~ cyl + disp + hp, tau = c(0.25, 0.5, 0.75), 
    data = mtcars)

tau: [1] 0.25

Coefficients:
            coefficients lower bd upper bd
(Intercept) 26.53473     26.32579 30.86801
cyl         -0.31763     -1.33117 -0.14118
disp        -0.02588     -0.02827 -0.00917
hp          -0.00672     -0.06947  0.00266

Call: rq(formula = mpg ~ cyl + disp + hp, tau = c(0.25, 0.5, 0.75), 
    data = mtcars)

tau: [1] 0.5

Coefficients:
            coefficients lower bd upper bd
(Intercept) 33.06030     27.49147 38.66943
cyl         -1.52147     -2.57978 -0.41709
disp        -0.01632     -0.03119  0.00197
hp          -0.00292     -0.03085  0.00132

Call: rq(formula = mpg ~ cyl + disp + hp, tau = c(0.25, 0.5, 0.75), 
    data = mtcars)

tau: [1] 0.75

Coefficients:
            coefficients lower bd upper bd
(Intercept) 41.03835     27.01372 47.45651
cyl         -2.26654     -4.27368  2.81358
disp        -0.01191     -0.04720  0.02987
hp          -0.01290     -0.04689  0.01637

You can also run an ANOVA, which here jointly tests whether the slopes are equal across the three quantiles:

> anova(rq(mpg~cyl+disp+hp,c(0.25,0.5,0.75),data=mtcars))

Quantile Regression Analysis of Deviance Table

Model: mpg ~ cyl + disp + hp
Joint Test of Equality of Slopes: tau in {  0.25 0.5 0.75  }

  Df Resid Df F value Pr(>F)
1  6       90  1.6521 0.1421
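
To connect this to the five-sample setup in the question, here is a minimal sketch using a group factor as the only regressor. The simulated data, the column names `y` and `group`, and the object names are all made up for illustration. The idea is that each group coefficient is that group's quantile shift relative to the reference group, so equal medians means the group coefficients are zero at tau = 0.5, and equal IQRs means the group coefficients are the same at tau = 0.25 and tau = 0.75. Treat it as a starting point under these assumptions rather than a vetted procedure.

```r
library(quantreg)

# Simulated stand-in for the five samples (sizes and distribution are made up).
set.seed(1)
sizes <- c(500, 600, 700, 800, 900)
dat <- data.frame(
  y     = rnorm(sum(sizes)),
  group = factor(rep(paste0("g", 1:5), times = sizes))
)

# Group-specific quartiles: the intercept is the quantile of the reference
# group, and each group coefficient is that group's shift relative to it.
fit25 <- rq(y ~ group, tau = 0.25, data = dat)
fit50 <- rq(y ~ group, tau = 0.50, data = dat)
fit75 <- rq(y ~ group, tau = 0.75, data = dat)

# Equal medians: compare the median fit with an intercept-only median fit,
# i.e. test that all group coefficients are zero at tau = 0.5.
fit50_null  <- rq(y ~ 1, tau = 0.50, data = dat)
median_test <- anova(fit50, fit50_null)

# Equal IQRs: the group shifts must match at tau = 0.25 and tau = 0.75,
# which is a joint test of equality of slopes across the two quartiles.
iqr_test <- anova(fit25, fit75)

print(median_test)
print(iqr_test)
```

With samples as large as those in the question, it is probably worth looking at the estimated group shifts and their intervals from `summary()` as well, since the comments above point out that even tiny differences will be flagged as significant.
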
user2974951
  • Does `anova.rq()` take into account the covariance structure of estimates from different quantiles? – JustGettinStarted Aug 23 '19 at 16:44
  • @JustGettinStarted I don't think so, I cannot find any information on it. – user2974951 Aug 24 '19 at 06:22
  • @user2974951 If you check out page 49 of Lingxin Hao's brief on QR (see link), you'll see that for comparing quantile estimates you need to take into account the cross-quantile covariance. `anova.rq()` should do this already, but I'm not positive that it does. https://nguyenvantien0405.files.wordpress.com/2014/03/quantile_regressiolingxin-hao.pdf – JustGettinStarted Aug 26 '19 at 16:38