
I started reading about modern robust methods as an alternative to classic parametric techniques because I keep running into violations of normality and, at times, of other classic parametric test assumptions in my real-world data. For context: I come from an applied background (a researcher using stats to analyse data), and while I try to keep up with the "maths" behind the methods, my grasp of it is not what I would like it to be. So please bear with me in that respect.

In their paper "Modern robust statistical methods: an easy way to maximize the accuracy and power of your research", Erceg-Hurn and Mirosevich (2008; see here) argue, "research has shown that the use of trimming (and other modern procedures) results in substantial gains in terms of control of Type I error, power, and narrowing confidence intervals (...)." (p. 595).

Amongst other points, the authors state that "if data are normally distributed, the mean and the trimmed mean will be the same" and that "Modern robust methods are designed to perform well when classic assumptions are met, as well as when they are violated. Therefore, researchers have little to lose and much to gain by routinely using modern statistical methods instead of classical techniques" (p. 595). It seems logical to me that the mean and the trimmed mean will coincide if data are normally distributed, yet I still have the feeling that modern robust methods would be at a disadvantage when assumptions are met, since the standard error would, in my understanding, become larger as part of the data is disregarded (what is referred to as the "price" in this post); the sketch below tries to illustrate what I mean. But perhaps this is just my lack of understanding of how the stats behind the modern robust methods work.
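
To make my concern about the "price" concrete, here is the kind of quick check I have in mind (a minimal sketch in base R; the sample size, trimming proportion, and number of replications are arbitrary choices of mine):

```r
## Compare the sampling variability of the mean and the 20% trimmed mean
## when the data really are normal (i.e. when classic assumptions hold).
set.seed(123)

est <- replicate(10000, {
  x <- rnorm(50)                                  # normally distributed sample
  c(mean = mean(x), tmean = mean(x, trim = 0.2))  # classic vs trimmed estimate
})

apply(est, 1, sd)  # empirical standard error of each estimator
```

In runs like this I would expect the trimmed mean to show a somewhat larger empirical standard error than the mean, which is what I mean by a "price" when the assumptions are actually met, but I may be misreading what this implies for the robust tests themselves.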

Finally, the authors claim, "Because of the serious limitations of assumption tests noted earlier, researchers should not use assumption tests as a basis for deciding whether to use classic or modern statistical techniques." (p. 595).

So my questions are:

  1. Can modern robust methods be used as a standard, or do they come at a price when assumptions are met?
  2. If modern robust methods are such a great solution to deal with issues like violated assumptions of normality, homoscedasticity etc., why aren't they more established? I rarely read a paper which used modern robust methods. I appreciate that Erceg-Hurn and Mirosevich discuss potential reasons in their paper, but I would be interested in the users' view too.
  3. If the view is that modern robust methods do come at a cost when used with data that do not violate the assumptions, what is the best basis for deciding whether or not to use a modern robust method, given the authors' claim that researchers should not use assumption tests as a basis for such a decision?
  4. Are there any limits to the extent of violations acceptable (e.g. extreme skew) when using modern robust methods?
  5. Which modern robust methods can be recommended (perhaps particular packages in R; I have come across the WRS2 package, see the sketch after this list)? How can researchers who use stats for their analyses (rather than being statisticians) assess the value and "price" of robust methods (without running simulations etc.)? Are there perhaps any recommendations (preferably published somewhere) that are accepted by the wider community?
  6. In short, while using modern robust methods may seem tempting, I am concerned that using them without a thorough understanding (especially of their limits and potential issues) is no better than just sticking to the classic parametric techniques. What is the community's view on taking such a stance?
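
Regarding question 5, here is a minimal sketch of the kind of WRS2 call I have been looking at (the data are made up, and I am assuming the formula interface described in the package documentation; `yuen()` is Yuen's test on trimmed means, a robust analogue of the two-sample t test):

```r
# install.packages("WRS2")
library(WRS2)

## Made-up example data: two groups with a shifted location
set.seed(1)
dat <- data.frame(
  score = c(rnorm(30, mean = 10), rnorm(30, mean = 12)),
  group = rep(c("control", "treatment"), each = 30)
)

## Yuen's test comparing 20% trimmed means across the two groups
yuen(score ~ group, data = dat, tr = 0.2)
```

The same package appears to offer robust factorial designs as well (e.g. a trimmed-means two-way ANOVA), but my point stands: without a deeper statistical background it is hard to judge when such calls are appropriate and what they cost.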
  • This raises many interesting, important questions -- but too many in my view. It's far from evident that "modern robust methods" are a distinct category. Trimmed means are a very old idea. At worst, "modern robust methods" are those approved of, indeed in some cases invented, by Rand R. Wilcox in his texts, which are in my view sometimes useful, but often eccentric or perverse. I'd say that the biggest deal is not what you use for univariate summary or the simplest tests, but for regression-like (predictive) modelling, where there is little consensus on how to handle awkward distributions. – Nick Cox Jul 30 '21 at 10:58
  • I've skimmed through the paper you cite which is often contentious. For example, I would say that their objections to transformations are exaggerated or even incorrect. Faced with skewed distributions with difficult tails and outliers, working on logarithmic scale in particular is often the simplest and best way to move. – Nick Cox Jul 30 '21 at 11:00
  • Thanks for sharing your thoughts @NickCox. As an aside: I often found that transformations did not fully solve the issues with skew for my data (if they do in one group, they may not in another etc.). I did indeed come across the WRS2 package by Wilcox, which offers, for instance, a robust two-way ANOVA. It is difficult to identify suitable modern robust methods without in-depth statistical knowledge - that's also the reason for my question. Are there any approaches that are perceived as useful and supported by the wider community? – grey Jul 30 '21 at 11:35
  • Depends which community you're talking about but Wilcox's own inventions are not used widely so far as I can tell. Methods like those introduce a lot of arbitrary and ad hoc decisions. Departure from normality is just about the least important problem for many procedures, even t tests and ANOVA, but probably the most obsessed about among non-statisticians (and I am one, just an exception on this score). – Nick Cox Jul 30 '21 at 12:49
  • Rupert G. Miller's _Beyond ANOVA_, despite its 1986 date, has not been superseded as worthwhile guidance -- based on experience and analysis rather than on myth or misunderstanding -- on which departures you need to worry about and which you don't. – Nick Cox Jul 30 '21 at 12:50
  • Thanks for the pointer to Miller's book. It is indeed difficult to appreciate the importance of various assumptions (including normality) as a non-statistician. You will find sources on either side of the argument, some telling you that normality is not really that important, others that violations of the assumption result in errors and biases of all sorts... – grey Jul 30 '21 at 13:18
