Is there any easy-to-use software for Tukey median-polishing rows and columns with lots of missing values?
R has `medpolish` built in, and with `na.rm=TRUE` it can deal with some level of missingness:
    > a # some data
             [,1]     [,2]     [,3]     [,4]
    [1,] 32.45884 29.50403 38.54330 30.06207
    [2,] 27.92059 25.00838       NA 13.93309
    [3,] 37.91911 23.98091 36.00139 27.73731
    [4,] 29.20283 29.68059 18.41809 29.92471
    [5,]       NA 30.98312 23.55309 22.63105
    [6,] 24.96472 33.52443 24.85243 37.43364
The medpolish command is simple:
    > medpolish(a,na.rm=TRUE) # Pretty easy to use
    1 : 86.06071
    Final: 85.59585
    Median Polish Results (Dataset: "a")
    Overall: 29.01548
    Row Effects:
    [1]  2.2356134 -4.0668144  3.4436953 -0.1729532 -5.2644925  0.1729532
    Column Effects:
    [1]  1.2077470  0.4488938 -0.1978902 -1.1544723
    Residuals:
             [,1]     [,2]     [,3]      [,4]
    [1,]  0.00000 -2.19595   7.4901 -0.034543
    [2,]  1.76418 -0.38917       NA -9.861103
    [3,]  4.25219 -8.92715   3.7401 -3.567392
    [4,] -0.84743  0.38917 -10.2265  2.236662
    [5,]       NA  6.78324   0.0000  0.034543
    [6,] -5.43146  3.88711  -4.1381  9.399689
By the way, this is not particularly hard to do in a spreadsheet. Note, though, that you would normally iterate the row and column sweeps several times rather than stop after one pass; it is still quite doable.
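The alternating sweeps that one would iterate in a spreadsheet can be sketched in R as below. This is only an illustration, not the source of `medpolish`: the function name `polish_by_hand` and its arguments `max_sweeps` and `tol` are made up here, and the stopping rule (relative change in the sum of absolute residuals) is an assumption.

```r
# The 6 x 4 example matrix from above; NA marks a missing cell.
a <- matrix(c(32.45884, 29.50403, 38.54330, 30.06207,
              27.92059, 25.00838,       NA, 13.93309,
              37.91911, 23.98091, 36.00139, 27.73731,
              29.20283, 29.68059, 18.41809, 29.92471,
                    NA, 30.98312, 23.55309, 22.63105,
              24.96472, 33.52443, 24.85243, 37.43364),
            nrow = 6, byrow = TRUE)

# Hypothetical helper (not part of R): a hand-rolled median polish.
polish_by_hand <- function(x, max_sweeps = 10, tol = 0.01) {
  res     <- x                     # residuals start out as the raw data
  overall <- 0
  row_eff <- numeric(nrow(x))
  col_eff <- numeric(ncol(x))
  old_sum <- 0
  for (k in seq_len(max_sweeps)) {
    # Row sweep: move each row's median out of the residuals.
    row_med <- apply(res, 1, median, na.rm = TRUE)
    res     <- sweep(res, 1, row_med)
    row_eff <- row_eff + row_med
    # Move the median of the column effects into the overall term.
    shift   <- median(col_eff, na.rm = TRUE)
    col_eff <- col_eff - shift
    overall <- overall + shift
    # Column sweep: same idea, applied to columns.
    col_med <- apply(res, 2, median, na.rm = TRUE)
    res     <- sweep(res, 2, col_med)
    col_eff <- col_eff + col_med
    # Move the median of the row effects into the overall term.
    shift   <- median(row_eff, na.rm = TRUE)
    row_eff <- row_eff - shift
    overall <- overall + shift
    # Assumed stopping rule: residuals have (nearly) stopped changing.
    new_sum <- sum(abs(res), na.rm = TRUE)
    if (new_sum == 0 || abs(new_sum - old_sum) < tol * new_sum) break
    old_sum <- new_sum
  }
  list(overall = overall, row = row_eff, col = col_eff, residuals = res)
}

fit <- polish_by_hand(a)

# Sanity check: overall + row + column + residual rebuilds each observed cell.
rebuilt <- fit$overall + outer(fit$row, fit$col, "+") + fit$residuals
all.equal(rebuilt, a)
```

Each sweep only moves medians between the residuals and the effects, so the decomposition always adds back up to the data; missing cells simply stay `NA` in the residuals.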
However, if you have a really large amount of missingness, you may not be able to estimate effects for all rows and columns (for example, if a row or column is entirely missing).
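One way to spot such rows and columns before polishing is to count the observed cells in each. The sketch below is an assumption about how you might do that: `too_sparse` and its `min_obs` argument are hypothetical names, the matrix `b` is made up, and dropping the offending rows or columns is just one possible response.

```r
# Hypothetical helper: list rows and columns with fewer than
# `min_obs` observed (non-NA) cells.
too_sparse <- function(x, min_obs = 1) {
  observed <- !is.na(x)
  list(rows = which(rowSums(observed) < min_obs),
       cols = which(colSums(observed) < min_obs))
}

# Small made-up matrix whose second row is entirely missing.
b <- matrix(c( 1,  2,  3,
              NA, NA, NA,
               4, NA,  6),
            nrow = 3, byrow = TRUE)
gaps <- too_sparse(b)

# Drop the unusable rows/columns, then polish what is left.
keep_rows <- setdiff(seq_len(nrow(b)), gaps$rows)
keep_cols <- setdiff(seq_len(ncol(b)), gaps$cols)
fit <- medpolish(b[keep_rows, keep_cols, drop = FALSE],
                 na.rm = TRUE, trace.iter = FALSE)
```

Raising `min_obs` above 1 would also flag rows or columns whose median rests on very few values.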
Edit: as whuber notes below, a lot of missingness may also result in problems of bias or nonconvergence.
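To notice nonconvergence programmatically, one option is to catch the warning that `medpolish` raises when it reaches `maxiter` without meeting its stopping rule. The wrapper below is a sketch: `polish_checked` is a hypothetical name, and treating every warning as a sign of nonconvergence is a simplifying assumption.

```r
# The 6 x 4 example matrix from above; NA marks a missing cell.
a <- matrix(c(32.45884, 29.50403, 38.54330, 30.06207,
              27.92059, 25.00838,       NA, 13.93309,
              37.91911, 23.98091, 36.00139, 27.73731,
              29.20283, 29.68059, 18.41809, 29.92471,
                    NA, 30.98312, 23.55309, 22.63105,
              24.96472, 33.52443, 24.85243, 37.43364),
            nrow = 6, byrow = TRUE)

# Hypothetical wrapper: run medpolish and record whether it warned.
polish_checked <- function(x, ...) {
  converged <- TRUE
  fit <- withCallingHandlers(
    medpolish(x, na.rm = TRUE, trace.iter = FALSE, ...),
    warning = function(w) {
      converged <<- FALSE            # assumption: any warning means trouble
      invokeRestart("muffleWarning")
    }
  )
  list(fit = fit, converged = converged)
}

ok    <- polish_checked(a)               # default maxiter
short <- polish_checked(a, maxiter = 1)  # deliberately too few iterations
c(default = ok$converged, one_pass = short$converged)
```

If the flag comes back `FALSE`, raising `maxiter` is the obvious first thing to try; with heavy missingness the fit may still not settle.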

Glen_b
- Upvoted because I didn't know anything about median polishing and your example is clear enough to get at least a superficial idea of it! – Elvis Dec 19 '12 at 21:02
- @Elvis Thanks. I tend to think of it as a bit like a two-way main-effects ANOVA model, but for medians rather than means. There's good coverage of it in "*Understanding Robust and Exploratory Data Analysis*", Hoaglin, Mosteller and Tukey (eds); it's also in Mosteller and Tukey's "*Data Analysis and Regression*". There is also a description and an example [here](http://www.stats.ox.ac.uk/pub/MASS4/VR4stat.pdf) (starts on page 7). – Glen_b Dec 19 '12 at 21:39
- (+1) Median polish is used extensively throughout Tukey's *EDA* text. It is easily implemented even in a spreadsheet. With any appreciable amount of missingness it becomes problematic, being potentially biased and often not converging at all. – whuber Dec 20 '12 at 02:25