16

I am going to use the Kolmogorov-Smirnov test to test the normality of MYDATA in R. This is an example of what I do:

 ks.test(MYDATA,"pnorm",mean(MYDATA),sd(MYDATA))

Here is the result R gives me:

 data:  MYDATA
 D = 0.13527, p-value = 0.1721
 alternative hypothesis: two-sided

 Warning message:
 In ks.test(MYDATA, "pnorm", mean(MYDATA), sd(MYDATA)) :
    ties should not be present for the Kolmogorov-Smirnov test

I think there is a problem. What does "ties" mean in this warning?

ttnphns
unes
  • 3
    Why do you wish to perform this normality test? In most cases, testing normality of a variable [is pretty useless](http://stats.stackexchange.com/q/2492/28500), although testing normality of residuals following a regression can be important. – EdM Aug 27 '16 at 21:48
  • 3
    Even without ties, the KS test is not a test for general normality but a test against a fully specified distribution (you're estimating the mean and sd from the data). Your p-values will be nonsense. Search our site for references to the Lilliefors test. – Glen_b Nov 27 '16 at 02:54
  • See the [Lilliefors test](https://en.wikipedia.org/wiki/Lilliefors_test). – DifferentialPleiometry Jul 23 '21 at 17:36

2 Answers

14

You have two problems here:

The K-S test is for a continuous distribution and so MYDATA should not contain any ties (repeated values).

The theory underlying the K-S test does not let you estimate the parameters of the distribution from the data as you have done. The help for ks.test explains this.
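
To make the first point concrete, here is a minimal base-R sketch of how one might check a vector for ties before running the test. The vector `example_data` is a made-up stand-in for MYDATA (the original data are not shown), so treat the values as an assumption for illustration only.

```r
# Hypothetical stand-in for MYDATA; the repeated values are the "ties"
example_data <- c(1.2, 3.4, 3.4, 5.6, 7.8, 7.8, 7.8)

# TRUE if at least one value occurs more than once
any_ties <- anyDuplicated(example_data) > 0

# Number of observations that repeat an earlier value (here 3)
n_tied <- sum(duplicated(example_data))

# Which values are repeated, and how many times each occurs
value_counts <- table(example_data)
tied_values <- value_counts[value_counts > 1]

print(any_ties)
print(n_tied)
print(tied_values)
```

Ties often come from rounding or from a measurement that is discrete to begin with, which is why they conflict with the continuity assumption of the K-S test.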

mdewey
  • I thought this was not enough to be a different question, but here it is: https://stats.stackexchange.com/questions/389151/ties-in-a-two-sample-kolmgorov-smirnov-test – Nemesi Jan 25 '19 at 16:50
6

As explained by @mdewey, the K-S test is not suitable when the parameters are estimated from the data. You can use the following code, which relies on the Anderson-Darling test for normality and does not require you to supply the mean and the standard deviation. This test is generally considered stronger in accuracy (more powerful) than the Lilliefors test.

 install.packages("nortest")  # one-time installation of the nortest package
 library(nortest)
 ad.test(MYDATA)              # Anderson-Darling test for normality
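
As a rough illustration of why plugging the sample mean and standard deviation into `ks.test` is a problem, the sketch below simulates data that really are normal and records how often the test rejects at the 5% level. The simulation settings (`n_sims`, `n_obs`, the seed) are arbitrary choices made for this example, not values from the question.

```r
# Simulate truly normal samples and apply the K-S test with parameters
# estimated from each sample, as done in the question.
set.seed(42)    # arbitrary seed, for reproducibility only
n_sims <- 2000  # number of simulated data sets (assumed value)
n_obs  <- 50    # observations per data set (assumed value)

p_values <- replicate(n_sims, {
  x <- rnorm(n_obs)
  ks.test(x, "pnorm", mean(x), sd(x))$p.value
})

# If the p-values were valid, about 5% of them would fall below 0.05.
rejection_rate <- mean(p_values < 0.05)
print(rejection_rate)
```

With estimated parameters the rejection rate should come out well below the nominal 0.05, so the reported p-values are too large and the test rarely detects anything. The nortest package also provides `lillie.test()`, which applies the Lilliefors correction for exactly this situation.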
Roee Anuar
  • "Accuracy" may be a narrow but misguided thing to search for. In both cases, most applications of either of these tests are at best useless and in most cases misleading. People are often taught to use them by persons with a faulty understanding of the assumptions behind regression methods. I suppose the relative weakness of the KS test would actually make it "better" to use than the more powerful alternatives, since its results would be less likely to mislead the naive user. – DWin Jun 16 '17 at 17:26