How to test whether a subgroup is influential in the overall group mean? (i.e. whether removing the subgroup affects the results: all-but-subgroup mean != overall mean)

Question

This question is similar to this one, but I want to test whether the all-but-subgroup mean differs from the mean of the overall group, which includes the subgroup. It differs from that question in that I am asking whether omitting the subgroup yields the same results (i.e. the same mean) as including it, that is, whether the subgroup is influential.

You can see an illustration of how they differ at the end of this post in my example with the hair salon.

In my case, I don't believe that comparing the subgroup with the others, as suggested in the other question, is reasonable.

This is because the subgroup is much smaller than the rest (about 5% of its size), and what I want to know is whether omitting the subgroup yields the same results (i.e. the same mean).

E.g.

ALL: All hair salons - 630 customers, mean = 2.5 (SD ~ 0.3)

SUBGROUP: Aveda - 30 customers, mean = 2 (SD ~ 0.3)

RESPONSE: Customer rating of salon from 1-10.

The effect of including Aveda in the analysis is going to be very small (not practically - or statistically - significant, since the number of Aveda customers is very small compared to the rest).
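
To put a number on "very small", here is a minimal sketch of the arithmetic in Python. It assumes the 630 customers in ALL include the 30 Aveda customers, so the rest is 600 customers; the function name `mean_without_subgroup` is just an illustrative helper, not something from a library.

```python
def mean_without_subgroup(n_all, mean_all, n_sub, mean_sub):
    """Recover the all-but-subgroup mean from summary statistics.

    Assumes the overall group of size n_all already contains the subgroup.
    """
    n_rest = n_all - n_sub
    # Sum of ratings in the rest = overall sum minus subgroup sum.
    total_rest = n_all * mean_all - n_sub * mean_sub
    return total_rest / n_rest


# Summary statistics from the example above.
n_all, mean_all = 630, 2.5   # all hair salons (assumed to include Aveda)
n_sub, mean_sub = 30, 2.0    # Aveda only

mean_rest = mean_without_subgroup(n_all, mean_all, n_sub, mean_sub)
shift = mean_all - mean_rest  # change in the mean caused by including Aveda

print(f"all-but-Aveda mean: {mean_rest:.3f}")
print(f"overall minus all-but-Aveda: {shift:.3f}")
```

Under that assumption, the all-but-Aveda mean comes out at about 2.525, so including Aveda moves the mean by roughly 0.025 rating points, even though Aveda's own mean is about half a point below the rest.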

However, a test of the difference in mean ratings between Aveda and the rest is very likely to be significant.

In this case, I want to test the effect of removing Aveda from the analysis (i.e. to show that removing it doesn't really make a difference).

Can I simply conduct a t-test comparing all vs. all-but-subgroup? Would it be a paired or an unpaired t-test? (The observations are not independent, but they are also not measured twice at different time points on the same subject.)

This appears to be similar to this: https://stats.stackexchange.com/questions/30562/how-to-test-whether-subgroup-mean-differs-from-overall-group-that-includes-the?rq=1 Read this instead — Beavis, Aug 15 '17 at 13:26
@Beavis This is different from that question, as you can see from my hair salon example (I even linked that question in mine). The two would yield different results: including Aveda wouldn't significantly change the ratings, even though its rating is significantly different from the rest — Grint, Aug 15 '17 at 14:31
Please do not reask questions when you don't get the response you want. See if the post can be clarified, or if attention can be brought to it some other way. This just clogs up the site. Be aware that you are asking strangers over the internet to volunteer their time to help you. It is always possible that they are too busy, or lack the interest or requisite expertise in your question to do so. — gung - Reinstate Monica, Aug 16 '17 at 16:21
This is a duplicate of the linked thread. The answer there is the answer to your question. — gung - Reinstate Monica, Aug 16 '17 at 16:21
Possible duplicate of [How to test whether subgroup mean differs from overall group that includes the subgroup?](https://stats.stackexchange.com/questions/30562/how-to-test-whether-subgroup-mean-differs-from-overall-group-that-includes-the) — gung - Reinstate Monica, Aug 16 '17 at 16:22
@Beavis I apologize. As I replied to Beavis, this is not the same as my question (as explained above); I even linked that question in this question. — Grint, Aug 16 '17 at 19:24
The nature of the question isn't quite clear. "Doesn't make a difference" seems to ask *how big* the difference is between including or not including the subgroup, whereas t-tests only tell you whether you can *detect* a difference between the subgroup mean and the mean of everyone else--and as others point out, that question has already been answered. Could you please elaborate on what you're trying to accomplish in this regard? — whuber, Aug 16 '17 at 19:31
@whuber Thanks for your comment. It's exactly the first one: I want to see whether there is a significant difference between including or not including the subgroup in the analysis. — Grint, Aug 16 '17 at 19:39
This may be what is confusing us: "significant" in statistics ordinarily means "detectable". If you really do mean to ask the "how big" question, then the obvious answer is to do the analysis with and without the subgroup and compare the results. Presumably you have done this. What more are you expecting? — whuber, Aug 16 '17 at 19:41
@whuber Thanks! Yes, I've done that. Is there a way to quantify that comparison? Or can I simply infer from the context whether it is practically significant or not? I'm confused. Thanks again — Grint, Aug 17 '17 at 13:32
If by quantify you mean "estimate" you can subtract the means. If you mean "test", gung already linked you to the correct procedure — Glen_b, Jul 27 '18 at 23:34
