In data analysis, one usually needs to verify that the data have property $X$ before applying method $Y$, which takes $X$ as a prerequisite. To illustrate, possible values of $(X, Y)$ include $(\text{homogeneity of variance and normality}, \text{$t$-test})$, $(\text{independence}, \text{ANOVA})$, and $(\text{stability}, \text{ARIMA-based inference})$.

You must answer the following question before proceeding: do the data deviate enough from the ideal condition where property $X$ perfectly holds to "forbid" use of method $Y$? As far as I know, this question is usually addressed by performing a hypothesis test. For example, a normality test is conducted, and if we cannot reject normality, we suppose the answer is "No" and readily apply a $t$-test.

It appears that this works, thanks to the robustness of the $t$-test. However, this answer points out that a hypothesis test answers a different question from the one we really care about, namely: is there convincing evidence of any deviation from property $X$? The answer to that is almost always "Yes" if your dataset is big enough.
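To make that last point concrete, here is a minimal simulation sketch. Everything in it is my own illustrative choice rather than something taken from the linked answer: the data come from a $t$ distribution with 10 degrees of freedom (only mildly non-normal), the normality test is a hand-rolled Jarque-Bera test using its asymptotic chi-square approximation, and the names `jarque_bera_pvalue` and `rejection_rate` are hypothetical helpers.

```python
import numpy as np


def jarque_bera_pvalue(x):
    """Asymptotic p-value of the Jarque-Bera normality test (chi-square, 2 df)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = (x - x.mean()) / x.std()        # standardize with the ddof=0 standard deviation
    skew = np.mean(z ** 3)              # sample skewness
    excess_kurt = np.mean(z ** 4) - 3.0  # sample excess kurtosis
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # The survival function of a chi-square with 2 df is exp(-x / 2).
    return float(np.exp(-jb / 2.0))


def rejection_rate(n, reps=200, alpha=0.05, df=10, seed=0):
    """Fraction of simulated samples of size n for which normality is rejected.

    Samples are drawn from a t distribution with `df` degrees of freedom,
    an assumed stand-in for "slightly non-normal" data.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        sample = rng.standard_t(df, size=n)
        if jarque_bera_pvalue(sample) < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # The departure from normality is identical in every row; only n changes.
    for n in (50, 500, 5000):
        print(f"n = {n:5d}  rejection rate = {rejection_rate(n):.2f}")
```

The distribution generating the data is the same for every sample size, yet the rejection rate should climb towards 1 as $n$ grows. Note that this only illustrates the "almost always Yes" half of the argument; it says nothing about whether method $Y$ is still usable on such data.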

My question is: do all methods have "robustness" to some extent? If not, why can we verify that data have property $X$ with hypothesis testing? To paraphrase, does $p > \alpha$ when testing for $X$ always imply the applicability of method $Y$?

nalzok
  • Re: the premises. It might be more fruitful to think of the *assumptions* as having properties like those you list, because the *data* definitely have none of them. The issue then becomes whether the data offer substantial evidence against your assumptions. Although many people do try to assess such evidence through formal hypothesis tests, IMHO those who are most successful at data analysis eschew that machinery and favor robust exploratory techniques. One reason the formal testing doesn't work is that it needs to change the interpretation of the results of $Y,$ such as its p-values. – whuber May 08 '19 at 18:30
  • @whuber You are right, the assumption is that data is drawn from a *population* with homogeneity of variance and normality/independence/stability. Can you give some examples of "robust exploratory techniques" to illustrate? – nalzok May 09 '19 at 00:31
  • https://www.amazon.com/Exploratory-Data-Analysis-John-Tukey/dp/0201076160 – whuber May 09 '19 at 01:43

0 Answers