
I would like to use a t-test to test a hypothesis about the number of calories consumed per day by women. The distribution of this variable is right-skewed; a log transformation makes it more symmetric, but it also reveals an outlier in the lower tail.

I know that, since the sample has more than 200 observations, I could in principle use the t-test without worrying about the normality condition. But I wanted to know whether I get the same result (a significant p-value) by using t.test with the log of the data and the log of the mean.

Before doing that, I wanted to check my assumption that the data are log-normal. To do so, I applied the Shapiro-Wilk test to the log of the variable. The resulting p-value is $0.43$. Can I therefore say that we cannot reject the null hypothesis that the data are log-normal? Or does this only test normality, even though I used the log transformation?

wrong_path
  • It tests normality of the log-transformed data, and that is by definition log-normality of the original, untransformed data – kjetil b halvorsen Nov 26 '17 at 11:56
  • @kjetilbhalvorsen Thank you for the answer. Therefore, in my case (since the p-value is greater than the commonly used significance levels), I could say: "With this sample, we cannot reject the null hypothesis that the original, untransformed data are log-normally distributed", right? – wrong_path Nov 26 '17 at 12:00
  • Yes, you are right. – kjetil b halvorsen Nov 26 '17 at 12:03
  • Lognormal distributions can be highly skewed. Therefore, 200 observations might not be sufficient to assure the t-test gives reliable results: see https://stats.stackexchange.com/questions/69898/t-test-on-highly-skewed-data/69967#69967 for an example. You might therefore want to reconsider this issue and examine the data distribution more closely. Moreover, a t-test based on the logarithms tells you whether the *geometric means* differ, not whether the *arithmetic means* differ (which is, implicitly, your original purpose). – whuber Nov 26 '17 at 15:05
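
To make the last comment concrete, here is a minimal sketch in plain Python (standard library only) of what a one-sample t-test on the logs actually compares. Everything in it is an assumption for illustration: the simulated log-normal sample, the parameters `mu` and `sigma`, the null value `mu0`, and the helper `t_statistic` are all hypothetical and are not the data from the question. In R, the log-scale version would correspond roughly to `t.test(log(x), mu = log(mu0))`.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical log-normal "calorie" sample: log(x) ~ Normal(mu, sigma).
mu, sigma, n = math.log(1800), 0.5, 250
x = [random.lognormvariate(mu, sigma) for _ in range(n)]
log_x = [math.log(v) for v in x]

arith_mean = statistics.fmean(x)
# Back-transforming the mean of the logs gives the geometric mean.
geo_mean = math.exp(statistics.fmean(log_x))


def t_statistic(sample, null_value):
    """One-sample t statistic for H0: population mean == null_value."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - null_value) / se


mu0 = 2000  # hypothetical null value on the original (calorie) scale
t_raw = t_statistic(x, mu0)                # about the arithmetic mean
t_log = t_statistic(log_x, math.log(mu0))  # about the geometric mean

print(f"arithmetic mean: {arith_mean:.1f}")
print(f"geometric mean:  {geo_mean:.1f}")
print(f"t on raw data:   {t_raw:.2f}")
print(f"t on log data:   {t_log:.2f}")
```

For positive, non-constant data the geometric mean is always below the arithmetic mean, so the two statistics answer different questions. Either one would be compared against a t distribution with n - 1 degrees of freedom, and they need not lead to the same conclusion.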
