Choosing the right test to compare the difference in means between 3+ groups according to the data we have (parametric vs non-parametric, sample size, ...)

Question

I hope everyone reading this is well. I am a data analysis student looking for help: I am not yet familiar with everything, especially statistical tests.

I have 3 groups of independent objects and their durability in hours. (They are not totally independent because of how they are used, but in practice they are treated as such; if alternatives exist, I'm interested. They are not paired: object 1 can be used for 50 hours together with object 2, then for 50 hours with object 3.)

The 3 groups (or more) are: the currently sold object, a new one to test, and one from a competitor. They do not follow a normal distribution, and there is a high chance that they follow Weibull distributions. Their variances are probably heterogeneous. Here is an R script to generate a similar sample:

# rweibull() and fligner.test() come from the stats package, which R attaches by default

# Number of observations in each sample
# Which test should be used below 30? With unequal sample sizes?
n = rep(30, 3)

# Generating simple samples; rweibull(n, shape, scale)
# Current model
current = data.frame(rep("cur", n[1]), rweibull(n[1], 12, 7500))
# New model
new = data.frame(rep("new", n[2]), rweibull(n[2], 16, 8000))
# Competitor
comp = data.frame(rep("com", n[3]), rweibull(n[3], 13, 7350))

# Merging
colnames(current) = c("Type", "Value")
colnames(new) = c("Type", "Value")
colnames(comp) = c("Type", "Value")
df = rbind(current, new, comp)

# Nonparametric test of homogeneity of variance (Fligner-Killeen)
fligner.test(Value~Type, data=df)

I would like to study whether there is a mean/median difference in durability between those groups, but I don’t know which test to use. Searching for the right test has taught me a lot, but it has also made me question myself a lot.

1.) I considered the Kruskal-Wallis test followed by a post hoc Dunn test (pairwise). Am I right? Could I use pairwise Mann-Whitney-Wilcoxon tests instead?

2.) I would like to compare the objects in terms of % of performance. How can I create a confidence interval for this type of test (I would like to get something like “new is 15% better than cur, with a 95% confidence interval of [7%, 25%]”)?

2.bis) If I plan on comparing them only in pairs (new to cur, new to comp, cur to comp) and not globally (new to cur to comp), is it okay not to adjust the method and simply run multiple tests?

3.) If the variances were homogeneous, could I use another test?

4.) Can I do a post hoc test (I am interested in the differences between the groups) without doing an ANOVA or a Kruskal test? Like if the normality assumption was fulfilled, would it be okay to do a pairwise t.test (corrected) instead of an ANOVA + Tukey HSD?

5.) Can I compare 2 Weibull distributions with a t.test (possibly Welch's t-test)?

6.) Is there any parametric test that doesn’t require normality but instead uses the estimated parameters of the distribution?

7.) What if the groups have different distributions? Can I still say that new might be better than cur?

8.) What if the samples are too small (5-15)?

9.) I love the plot of TukeyHSD, but I can’t replicate it with other tests :’(

I know I asked a lot of things, but I look forward to any answer. Thanks in advance.

Mark Ebden · Accepted Answer · 2020-07-29T15:36:33.053

Answer to Q1: Since you're interested in comparing means or medians, not distributions, simply use the (nonparametric) median test of K populations; e.g. Test 51 in 100 Statistical Tests. This is different from the Kruskal-Wallis (KW) test you mention, which assesses whether the distributions may differ.
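
As a rough illustration of the median test of K populations, here is a minimal sketch in base R. It classifies each observation as above or not above the grand median and runs a chi-squared test on the resulting 2 x K table. The helper name mood_median_test is hypothetical (it is not a base R function), and the data are simulated with the same Weibull parameters as in the question, with a fixed seed added for reproducibility.

```r
# Sketch of a median test for K groups using base R only.
# Assumptions: simulated data as in the question; the seed is arbitrary.
set.seed(1)
df <- data.frame(
  Type  = rep(c("cur", "new", "com"), each = 30),
  Value = c(rweibull(30, 12, 7500),
            rweibull(30, 16, 8000),
            rweibull(30, 13, 7350))
)

# Hypothetical helper: chi-squared test on the 2 x K table of
# "above the grand median" vs "at or below it", by group.
mood_median_test <- function(values, groups) {
  grand_median <- median(values)
  tab <- table(Above = values > grand_median, Group = groups)
  chisq.test(tab, correct = FALSE)
}

res <- mood_median_test(df$Value, df$Type)
print(res$observed)  # counts above / not above the grand median
print(res$p.value)
```

With small samples the expected cell counts become small, so the chi-squared approximation should be treated with caution there.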

You also asked in Q1 about the pairwise Mann-Whitney-Wilcoxon test. Those kinds of tests answer a different question, namely whether there are pairs with significant differences. If this is the question of more interest to you, consider the Dwass-Steel-Critchlow-Fligner test, which is superior to the Dunn test.

Answer to 2bis: You can run multiple pairwise tests, but will need a correction for the multiple-comparison problem, such as the Holm method. (Which correction to use depends on how you want to guard against false positives; e.g. do you want to specify the probability of making at least one Type-I error, and so on.)
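
For illustration only, the Holm correction can be applied in base R either through pairwise.wilcox.test() or directly with p.adjust(). This is a sketch on data simulated as in the question (the seed and the raw p-values in the second part are made up), not a recommendation of a specific pairwise test.

```r
# Sketch: pairwise comparisons with Holm-adjusted p-values (base R).
# Assumptions: simulated data as in the question; the seed is arbitrary.
set.seed(1)
df <- data.frame(
  Type  = factor(rep(c("cur", "new", "com"), each = 30)),
  Value = c(rweibull(30, 12, 7500),
            rweibull(30, 16, 8000),
            rweibull(30, 13, 7350))
)

# All pairwise Wilcoxon rank-sum tests, p-values adjusted with Holm's method
pw <- pairwise.wilcox.test(df$Value, df$Type, p.adjust.method = "holm")
print(pw$p.value)  # lower-triangular matrix of adjusted p-values

# The same adjustment applied by hand to raw p-values (made-up numbers)
raw_p <- c(0.01, 0.04, 0.03)
adj_p <- p.adjust(raw_p, method = "holm")
print(adj_p)
```

Holm's method controls the probability of at least one Type-I error across the family of tests; other choices of p.adjust.method target different error rates.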

Answer to Q4: Yes, if you genuinely have no interest in the ANOVA/Kruskal-like results. You'll achieve a more powerful test by doing pairwise testing immediately. Your expected false-positive rate rises slightly, but this is a valid approach; see for example here.
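
A minimal sketch of going straight to corrected pairwise tests, without a preceding ANOVA, for the case raised in Q4 where normality holds. The data here are invented normal samples (means, standard deviations and seed are assumptions for illustration only); pool.sd = FALSE makes each comparison a Welch-type t-test.

```r
# Sketch: corrected pairwise t-tests with no omnibus ANOVA first (base R).
# Assumptions: invented, roughly normal data; the seed is arbitrary.
set.seed(1)
df <- data.frame(
  Type  = factor(rep(c("cur", "new", "com"), each = 30)),
  Value = c(rnorm(30, mean = 7200, sd = 700),
            rnorm(30, mean = 7750, sd = 600),
            rnorm(30, mean = 7050, sd = 650))
)

# pool.sd = FALSE: each pair is compared with its own (unequal) variances
pt <- pairwise.t.test(df$Value, df$Type,
                      p.adjust.method = "holm", pool.sd = FALSE)
print(pt$p.value)
```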

Answer to Q7: Yes. The tests above are unaffected by most differences in distribution.

edited Jul 29 '20 at 15:36

answered Jul 27 '20 at 14:46


Thanks a lot for your answer! If I understand correctly, Test 51 is Mood's median test, right? Looking into the Mann-Whitney-Wilcoxon test, I also found that it is primarily a test for comparing probability distributions, whereas many sources call it a median test without stating their assumptions. Also, do you have an answer to Q2bis? – Jul 29 '20 at 08:13
Could you please rephrase 2.bis? In other words, since you're not testing in pairs, or globally, what are you seeking in 2.bis? – Mark Ebden Jul 29 '20 at 10:37
Sorry, there was an extra word ("don't"); I have edited my post. I meant: is it OK to compare them in pairs (in other words, two-sample comparisons) but not globally (one test to show that A differs from B and A differs from C, without looking at B vs C)? – Jul 29 '20 at 11:08
Ok, I have added an answer to 2bis above. Please click to accept this answer if you are happy with the result. BTW yes Test 51 is Mood's median test. – Mark Ebden Jul 29 '20 at 15:36
