
I have two samples, and a Kolmogorov–Smirnov test shows that their distributions differ significantly.

Knowing that the two sample distributions differ, can I perform a t-test and expect valid results?

If not, is there another test that would better compare the central tendency of the two samples, given the result of the K-S test?

edit

The samples come from what is in essence an A/B test, and the goal is to find the expected value for each test arm.

Unfortunately, the sample sizes are unavoidably small (40 to 60 observations), and the samples are not normal.

When I run both a t-test and a K-S test, both are significant. Given the nature of the data, can anything be inferred about each test arm's central tendency?
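For concreteness, the comparison described above might look like the following sketch. The data are synthetic (skewed exponential draws), not the actual samples, and `compare_arms` is a hypothetical helper name. Using Welch's variant of the t-test (`equal_var=False`) is also an assumption, since the post does not say which variant was run.

```python
import numpy as np
from scipy import stats


def compare_arms(arm_a, arm_b):
    """Run a two-sample K-S test and a Welch t-test on two samples.

    Hypothetical helper for illustration; returns the statistics and p-values.
    """
    ks = stats.ks_2samp(arm_a, arm_b)                        # compares whole distributions
    welch = stats.ttest_ind(arm_a, arm_b, equal_var=False)   # compares means
    return {
        "ks_stat": ks.statistic,
        "ks_p": ks.pvalue,
        "t_stat": welch.statistic,
        "t_p": welch.pvalue,
    }


# Synthetic, skewed stand-ins for the two test arms (assumed, not real data)
rng = np.random.default_rng(0)
arm_a = rng.exponential(scale=1.0, size=50)
arm_b = rng.exponential(scale=1.6, size=50)

result = compare_arms(arm_a, arm_b)
print(f"K-S:     D = {result['ks_stat']:.3f}, p = {result['ks_p']:.4f}")
print(f"Welch t: t = {result['t_stat']:.3f}, p = {result['t_p']:.4f}")
```

The two tests answer different questions: the K-S test is sensitive to any difference between the distributions, while the t-test targets a difference in means only.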

JLuu
  • “Valid” meaning what? – Dave Jul 22 '21 at 02:35
  • reliable and robust. – JLuu Jul 22 '21 at 02:47
  • _Possibly invalid:_ Ideally 2-sample t tests are for two normal samples. It seems that at least one of your two samples isn't normal. // _Possibly pointless:_ You say you already know by K-S that the two populations differ. What do you hope to learn from the t test? – BruceET Jul 22 '21 at 05:34
  • Yeah, I did some more searching and asking and realized that. I tested for normality, and most of my samples lack normality. Both the K-S tests and the t-tests are significant, but, like you said, I was worried that the t-tests are pointless. I want to know whether the sample means are significantly different from each other. – JLuu Jul 22 '21 at 05:58
  • The t-test is pretty robust to deviations from normality, particularly when sample sizes are large. Are your samples obviously non-normal upon visual inspection? – Dave Jul 22 '21 at 09:45
  • Yes, and running `scipy.stats.normaltest` strongly indicates that some of them are not. – JLuu Jul 22 '21 at 15:17
  • Some people advocate for a Wilcoxon Mann-Whitney U test in this situation; you are not quite testing for mean equality (unless the populations have the same shape), but it is something like that. My preference is to be thoughtful about what you really want to learn from the data. In a non-normal distribution, the mean might not be particularly descriptive. In such a situation, I am not so sure that I would care if I can show the means to be unequal, even if I can do so. // [Formal testing of normality is less helpful than one might hope.](https://stats.stackexchange.com/q/2492/247274) – Dave Jul 22 '21 at 15:23
  • Please include new information as an edit to the post, not only in comments. Not everybody reads comments! Also include information on sample size and context, and explain why you want to test means when you already know the distributions are different. – kjetil b halvorsen Jul 22 '21 at 17:26
  • I added more information in the comments because I was responding to a comment, which is a common norm. I have now edited the post. Can you please re-open the question? – JLuu Jul 22 '21 at 17:46
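
The two checks mentioned in the comments, `scipy.stats.normaltest` and the Wilcoxon Mann-Whitney U test, could be run roughly as in the sketch below. The samples are synthetic stand-ins (assumed log-normal draws), and `check_and_rank_test` is a hypothetical helper name, not something from the thread.

```python
import numpy as np
from scipy import stats


def check_and_rank_test(arm_a, arm_b):
    """Normality check per arm, then a two-sided Mann-Whitney U test.

    Hypothetical helper for illustration only.
    """
    normal_a = stats.normaltest(arm_a)   # D'Agostino-Pearson omnibus test
    normal_b = stats.normaltest(arm_b)
    mwu = stats.mannwhitneyu(arm_a, arm_b, alternative="two-sided")
    return {
        "normal_p_a": normal_a.pvalue,
        "normal_p_b": normal_b.pvalue,
        "u_stat": mwu.statistic,
        "u_p": mwu.pvalue,
    }


# Synthetic, right-skewed stand-ins for the two arms (assumed, not real data)
rng = np.random.default_rng(1)
sample_a = rng.lognormal(mean=0.0, sigma=0.8, size=45)
sample_b = rng.lognormal(mean=0.4, sigma=0.8, size=55)

out = check_and_rank_test(sample_a, sample_b)
print(f"normaltest p-values: A = {out['normal_p_a']:.4f}, B = {out['normal_p_b']:.4f}")
print(f"Mann-Whitney U = {out['u_stat']:.1f}, p = {out['u_p']:.4f}")
```

As noted in the comments, a significant Mann-Whitney result is not quite a statement about means unless the two populations have the same shape.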
