Impact of large, slightly unequal samples on kurtosis and skew

Question

I have sample A (17'123 observations) and sample B (23'136 observations) of interval data. Visual inspection (histogram, Q-Q plot and boxplot) showed that the data are not perfectly normally distributed and contain outliers. I did not use a normality test because, with samples this large, such tests flag even trivial deviations from normality as significant.

The (excess) kurtosis values are as follows:

sample A = -.4786 / sample B = -.3769

The data is skewed to the left:

sample A = -.7669 / sample B = -.7499

The calculations are based on G1 (skewness) and G2 (kurtosis), as adopted in SPSS and SAS and explained in D. N. Joanes and C. A. Gill (1998), "Comparing measures of sample skewness and kurtosis", The Statistician, 47, 183–189.
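For concreteness, here is a minimal sketch of G1 and G2 as I understand them from Joanes and Gill (1998). The function name `sample_skew_kurtosis` and the toy data are illustrative assumptions, not the SPSS or SAS implementation.

```python
import math

def sample_skew_kurtosis(x):
    """Return (G1, G2): bias-adjusted sample skewness and excess kurtosis.

    Hypothetical helper following the formulas in Joanes & Gill (1998).
    """
    n = len(x)
    if n < 4:
        raise ValueError("G2 needs at least 4 observations")
    mean = sum(x) / n
    # Central moments with divisor n (the "biased" moments m2, m3, m4).
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    g1 = m3 / m2 ** 1.5        # moment-based skewness
    g2 = m4 / m2 ** 2 - 3.0    # moment-based excess kurtosis
    # Small-sample adjustments; both factors tend to 1 as n grows.
    G1 = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    return G1, G2

# Toy data only (not my real samples): a left-skewed set of scores.
G1, G2 = sample_skew_kurtosis([1, 8, 9, 10, 10])
print(G1 < 0)  # negative G1 indicates a longer left tail
```

With n in the tens of thousands the adjustment factors are essentially 1, so G1 and G2 are practically identical to the unadjusted g1 and g2.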

The values are close to 0. Given the large samples, can I still conclude that the data are only mildly skewed and mildly platykurtic? What are the implications for the t-test results?

In addition, my samples are unequal in size. According to these posts (1 and 2), unequal sample sizes don't bias the t-test results but do reduce its power. In other samples I have gathered, sample C is more than 3 times larger than sample D, so the statistical power is reduced markedly (see here). Is there any way to improve statistical power, for example with a different test? I want to see whether the scores of group A (or C) are significantly lower than those of group B (or D).
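To make the comparison concrete, here is a rough sketch of a one-sided Welch (unequal-variance) t-test computed from summary statistics, plus the "effective" per-group sample size of an unbalanced design. The function names and the numbers in the usage lines are hypothetical. The p-value uses a normal approximation to the t distribution, which I assume is acceptable only because the degrees of freedom are very large here. The effective-size formula assumes equal variances in both groups.

```python
import math

def welch_one_sided(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Welch t statistic, Welch-Satterthwaite df, and approximate p-value
    for H1: mean_a < mean_b (hypothetical helper)."""
    va, vb = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    # Normal approximation: P(Z <= t). Reasonable only for large df.
    p = 0.5 * math.erfc(-t / math.sqrt(2))
    return t, df, p

def effective_n(n_a, n_b):
    """Per-group size of a balanced design with the same 1/n_a + 1/n_b,
    i.e. the harmonic mean of the two sample sizes."""
    return 2 * n_a * n_b / (n_a + n_b)

# Made-up means and variances, real sample sizes from the question.
t, df, p = welch_one_sided(3.10, 1.2, 17123, 3.15, 1.1, 23136)
print(t < 0, p < 0.05)
print(effective_n(17123, 23136))
```

The harmonic mean shows how much precision an unbalanced design gives up relative to a balanced one with the same total number of observations.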

I would rather not simply delete data, since that is hard to justify to reviewers and editors. Many thanks!
