
I ran a normality test on my data and, as usual, SPSS gives both Shapiro-Wilk and Kolmogorov-Smirnov numbers. Is it okay if I choose to use only the number from Shapiro-Wilk? The number from SW is the one that showed a significant level (0.183) while KS did not (0.046). Are there any references I can use to support my decision to use only SW when both SW and KS were given? Thanks.

Nick Cox
akie
  • I would say the correct approach is to decide beforehand which test you will use, and then stick with that. However, in most circumstances, I advise against using a statistical test to assess normality. Those tests are sensitive to sample size, so they often don't answer the question you want to ask (for example, whether your data are suitable for a certain parametric analysis). – Sal Mangiafico Jul 17 '17 at 01:42
  • I don't quite get what you mean by "the number from SW is the ones that showed significant level, while KS is otherwise" – Glen_b Jul 17 '17 at 01:44
  • @Glen_b KS showed the p value is 0.046 while SW p value is 0.138 – akie Jul 17 '17 at 02:10
  • Thanks for clarifying akie. I made some small changes to my answer. Would you mind editing that clarification into your question? – Glen_b Jul 17 '17 at 02:20
  • @Glen_b When I use SPSS, it automatically gives me both numbers, from SW and KS. I am very new to statistics and I learned how to use SPSS from the internet. I am clueless, so what should I do? – akie Jul 17 '17 at 02:24
  • If you must test it at all, choose one without reference to the data. Beyond my advice that the Shapiro-Wilk is generally more powerful, what basis should I be advising you on? You've given no information on which one could reasonably form a basis to choose one particular test. What is this test supposed to tell you? What does it achieve? – Glen_b Jul 17 '17 at 04:46

1 Answer

  1. If you want a formal hypothesis test of some hypothesis, you should use one test to test that hypothesis.

    You should choose that test before you see data, not after you have results in front of you.

    You should normally choose that test so that it gives the best power against the alternatives that matter to you. If you're looking at data (let alone p-values), it's already too late to do this cleanly. (This is an argument against packages that just present a laundry list of tests as a matter of course -- they directly encourage p-hacking: consciously or unconsciously, there will be a tendency to focus on the result you were looking for. Better, I think, is the design philosophy of packages that give you only the tests you ask for, so that you at least make a conscious decision about what you're going to test and when.)

  2. It's easy (before the fact) to justify using the Shapiro-Wilk -- it's generally more powerful than most of the competitors, including what SPSS is calling the Kolmogorov-Smirnov, but which I assume is actually Lilliefors' test (because the actual Kolmogorov-Smirnov test is not a test of general normality -- it's not clear why they'd choose to erase Lilliefors' contribution).
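
    For what it's worth, the power difference can be seen directly with a small simulation. The sketch below is in Python rather than SPSS (an assumption of convenience on my part), uses scipy's `shapiro` and statsmodels' `lilliefors`, and picks an arbitrary skewed alternative (exponential data, n = 30) purely for illustration; the exact numbers will vary.

    ```python
    # Rough power comparison: Shapiro-Wilk vs Lilliefors (the "Kolmogorov-Smirnov"
    # that SPSS reports) against an exponential alternative. Illustrative only.
    import numpy as np
    from scipy.stats import shapiro
    from statsmodels.stats.diagnostic import lilliefors

    rng = np.random.default_rng(1)
    n, reps, alpha = 30, 2000, 0.05
    rej_sw = rej_lf = 0
    for _ in range(reps):
        x = rng.exponential(size=n)                 # clearly non-normal data
        rej_sw += shapiro(x)[1] < alpha             # [1] is the p-value
        rej_lf += lilliefors(x, dist='norm')[1] < alpha

    print(f"Estimated power at n={n}: Shapiro-Wilk {rej_sw/reps:.2f}, "
          f"Lilliefors {rej_lf/reps:.2f}")
    ```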

  3. If you're actually trying to check suitability of assumptions of some other procedure, formal hypothesis testing is generally unsuitable.

    Firstly, see the question "Is normality testing essentially useless?" -- especially the answer by Harvey.

    Secondly, if you're choosing between different procedures on the basis of a test of normality (such as using one that assumes normality if you fail to reject and one that doesn't assume normality if you do), you alter the properties (significance level and power) of both of the alternatives you're choosing between, and the result is not necessarily what you might hope for; the sketch below illustrates the idea. Typically, if you're not comfortable justifying a choice of a normal-theory procedure before you see the data, you should probably just use a procedure that either doesn't depend on that assumption at all, or at least one that's fairly robust to it (and it's not just level-robustness that matters, though you'd hardly guess that from many discussions of robustness of tests).
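
    Here is a minimal sketch of how one might examine that by simulation (Python rather than SPSS; the distributions, sample sizes and decision rule are arbitrary assumptions for illustration, not a recommendation). Treat the whole two-stage rule -- pretest normality, then run either a t-test or a Mann-Whitney -- as a single procedure and estimate its rejection rate under a true null alongside the unconditional alternatives. Whatever the particular numbers come out as, the point is that the two-stage rule has its own level and power, which need not match either component.

    ```python
    # Treat "pretest normality, then choose the test" as one combined procedure
    # and estimate its rejection rate under a true null by simulation.
    import numpy as np
    from scipy.stats import shapiro, ttest_ind, mannwhitneyu

    rng = np.random.default_rng(2)
    n, reps, alpha = 15, 10000, 0.05
    rej = {"t-test": 0, "Mann-Whitney": 0, "two-stage": 0}
    for _ in range(reps):
        x = rng.lognormal(size=n)
        y = rng.lognormal(size=n)                 # same population, so null true
        p_t = ttest_ind(x, y)[1]
        p_mw = mannwhitneyu(x, y, alternative='two-sided')[1]
        rej["t-test"] += p_t < alpha
        rej["Mann-Whitney"] += p_mw < alpha
        # two-stage rule: Shapiro-Wilk on within-group centred data decides which
        pretest_p = shapiro(np.concatenate([x - x.mean(), y - y.mean()]))[1]
        rej["two-stage"] += (p_t if pretest_p > alpha else p_mw) < alpha

    for name, count in rej.items():
        print(f"{name:12s} rejection rate at nominal {alpha}: {count/reps:.3f}")
    ```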

  4. The phrasing in "the number from SW is the ones that showed significant level, while KS is otherwise" is unclear. If you actually mean that the Shapiro-Wilk would reject the null while the other test would not (or vice versa), using that significance or non-significance as a reason to choose the test is unambiguously p-hacking. If you're choosing between tests post hoc on the basis of whether they rejected or didn't reject, you have to toss out the p-values you're looking at because they no longer mean much of anything; if you present the results as if you had just run one test, you're misleading the people who read your work.

  5. I note from your previous question that $n=5$. That's not much to go on, and power may be pretty low against some kinds of alternatives that could matter; with a sample that small, neither a rejection nor a non-rejection is particularly informative. If we entertain seriously the possibility that the null could be true (the population could actually be normal), the power of the Kolmogorov-Smirnov may be so low that a rejection is fairly likely to just represent a type I error. A rough power sketch follows this item.

    If there's no good reason to anticipate normality, unless you have a procedure that's quite robust to the assumption, I'd be inclined to avoid assuming it.
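
    As a rough illustration of the small-sample point (again a Python sketch, with the exponential alternative, replication count and sample sizes chosen arbitrarily), you can simulate the Shapiro-Wilk's rejection rate at a few values of n:

    ```python
    # Power of the Shapiro-Wilk test against an exponential alternative at
    # several sample sizes; at n = 5 the estimated power is typically low,
    # so a non-rejection there is not very informative.
    import numpy as np
    from scipy.stats import shapiro

    rng = np.random.default_rng(3)
    reps, alpha = 5000, 0.05
    for n in (5, 30, 100):
        rejections = sum(shapiro(rng.exponential(size=n))[1] < alpha
                         for _ in range(reps))
        print(f"n = {n:3d}: estimated power {rejections / reps:.2f}")
    ```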

Glen_b
  • If SW is more powerful, it should be more likely to reject the null hypothesis than KS. But the p-value in the question is smaller for KS? – SmallChess Jul 17 '17 at 02:35
  • 3
    @SmallChess **1**. Being more powerful against some particular alternative means that the *probability* of rejection is higher (probability of $p\leq \alpha$ is higher), not that every individual p-value is lower. It's like having two coins where coin 1 has P(H)=0.6 and coin 2 has P(H)=0.45 -- the first coin has a better chance of heads, but that doesn't mean one toss it can't be Coin1:T, Coin2: H. **2**. Being more powerful against a wide range of alternatives of interest (i.e. generally speaking, it's more powerful) is not the same thing as greater power against every possible alternative. – Glen_b Jul 17 '17 at 02:46
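
    (A quick illustrative sketch of that point in Python, under an arbitrarily chosen heavy-tailed alternative; it simply counts how often each test produces the smaller p-value on the same sample.)

    ```python
    # How often does the (generally more powerful) Shapiro-Wilk give a smaller
    # p-value than Lilliefors on the same sample? Higher power does not mean
    # "smaller p-value every single time".
    import numpy as np
    from scipy.stats import shapiro
    from statsmodels.stats.diagnostic import lilliefors

    rng = np.random.default_rng(4)
    n, reps = 20, 2000
    sw_smaller = 0
    for _ in range(reps):
        x = rng.standard_t(df=4, size=n)      # a heavy-tailed, non-normal sample
        sw_smaller += shapiro(x)[1] < lilliefors(x, dist='norm')[1]

    print(f"Shapiro-Wilk p-value smaller in {sw_smaller/reps:.0%} of {reps} samples")
    ```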
  • What do you say to following every single path in the garden of forking paths, seeing where they lead, and weighing all of this in your final evaluation of a question? Additionally, what other facets are relevant to the robustness discussion? – Heteroskedastic Jim Jul 17 '17 at 03:02
  • 2
    This sounds like an entirely new (and hardly trivial) question. Perhaps you should formulate a question (with a little more context, perhaps, since it seems like you're trying to lead to something) and post it. – Glen_b Jul 17 '17 at 04:50
  • 1
    @Glen_b: Regarding the K-S labeling, the SPSS Statistics Tests of Normality table does contain a section labelled Kolmogorov-Smirnov, but it is footnoted in the output with "Lilliefors Significance Correction" and is so described in the Command Syntax Reference and in the What's This popup help. The plot output includes Q-Q plots that help to identify where the deviations from normality occur. All this aside from the question of when such tests are useful. – JKP Jul 17 '17 at 13:54