0

I am running kstest in MATLAB. When I pass the data directly, i.e. kstest(data), the result says that my data is non-normal. However, if I use kstest((data-mean(data))/std(data)), it comes out as normally distributed.
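For reference, this is essentially what I am running (data is a numeric vector; the standardized call could also be written with zscore):

    % Raw data: by default kstest compares against a standard normal N(0,1)
    h_raw = kstest(data);

    % Standardized data: equivalent to kstest(zscore(data))
    h_std = kstest((data - mean(data)) / std(data));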

What is the correct way of testing normality?

Sharah
  • 145
  • 2
  • 8

2 Answers

3

If you subtract the mean and divide by the standard deviation, then you can no longer use that particular test. We have discussed this several times here. The reason is that the kstest function in MATLAB is for a fully specified distribution, meaning that you know its mean and variance. You're estimating them from the sample, which makes the test invalid as implemented in MATLAB. Look up the Lilliefors test for your purpose. Other software packages such as SPSS may have implemented it in their KS test functions; check the docs.
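In MATLAB the Lilliefors variant is available as lillietest (Statistics and Machine Learning Toolbox). A minimal sketch, assuming data is a numeric vector:

    % kstest(data) tests against a fully specified standard normal N(0,1),
    % which is why the unstandardized data gets rejected.
    % lillietest accounts for the mean and variance being estimated from the sample.
    [h, p] = lillietest(data);   % h = 1 rejects normality at the default 5% level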

Aksakal
  • 55,939
  • 5
  • 90
  • 176
  • ... or Shapiro-Wilk's. – Stephan Kolassa Jul 20 '18 at 20:47
  • This is correct, though when someone asks about a stats package that package may call Lilliefors test "Kolmogorov-Smirnov" (like SPSS does). – Glen_b Jul 21 '18 at 02:22
• @Glen_b, thanks, I clarified my answer. In fact, when I ran into this issue with MATLAB, I assumed that they'd take care of the unknown-parameters case. It's great that SPSS does this – Aksakal Jul 21 '18 at 17:19
  • If SPSS is a problem because it is costly, you could try the gnu version: PSPP; https://www.gnu.org/software/pspp/manual/html_node/KOLMOGOROV_002dSMIRNOV.html#KOLMOGOROV_002dSMIRNOV – Fuca26 Jul 21 '18 at 17:55
• Many thanks for the explanation; great for someone like me who knows only the fundamentals of statistics. – Sharah Jul 31 '18 at 13:16
1

By subtracting the mean and dividing by the standard deviation, you're making your data normal. If you want to test for normality, I would say the most intuitive way would be to plot a histogram and see how close it is to a normal distribution. For a more analytical approach, you can always run a Shapiro-Wilk test.
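A minimal sketch of the graphical check in MATLAB (Statistics and Machine Learning Toolbox; data assumed to be a numeric vector). Note that, as far as I know, MATLAB has no built-in Shapiro-Wilk test; implementations such as swtest are available on the File Exchange.

    % Visual checks for normality
    histfit(data)    % histogram with a fitted normal density overlaid
    figure
    qqplot(data)     % normal Q-Q plot: points near the reference line suggest normality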

aalberti333
  • 113
  • 3
  • You might reasonably say you make normal data *standard* normal by subtracting the mean & dividing by the standard deviation. – Scortchi - Reinstate Monica Jul 20 '18 at 20:25
  • Ah, I guess I should have been more clear. However, I think it's still fair to say that doing so would still be making your data normal. The standard normal distribution is still a normal distribution after all :) – aalberti333 Jul 20 '18 at 20:31
  • Why would you want to say you make normal data normal? Anyway, what your first sentence appears to be saying is that you make non-normal data normal by subtracting the mean & dividing by the standard deviation! – Scortchi - Reinstate Monica Jul 20 '18 at 20:43
  • 1
    If the original data are not normal, then subtracting the mean and dividing by the SD will *not* make it normal. Conversely, if the data were normal to begin with, then scaling them will not make it "more" normal. I fail to see what you are trying to say. Shapiro-Wilk is a very good point. – Stephan Kolassa Jul 20 '18 at 20:49