This question is similar to this one, but I want to test whether the all-but-subgroup mean differs from the mean of the overall group, which includes the subgroup. This is DIFFERENT FROM THAT QUESTION in that I am asking whether omitting the subgroup yields the same result (i.e. the same mean) as including it, in other words whether the subgroup is influential.
You can see an illustration of how they differ at the end of this post in my example with the hair salon.
In my case I don't believe that comparing subgroup vs. others, as suggested in the other question, is reasonable.
This is because the subgroup is much smaller than the rest (~5% of its size), and what I want to know is whether omitting the subgroup yields the same result (i.e. the same mean).
E.g.
ALL: All hair salons - 630 customers, mean = 2.5 (SD ~ 0.3)
SUBGROUP: Aveda - 30 customers, mean = 2 (SD ~ 0.3)
RESPONSE: Customer rating of salon from 1-10.
The effect of including Aveda in the analysis is going to be very small (neither practically nor statistically significant), since the number of Aveda customers is very small compared to the rest.
However, a test of the difference in mean ratings between Aveda and the rest is very likely going to be significant.
In this case, I want to test the effect of removing Aveda from the analysis (i.e. show that removing it doesn't really make a difference).
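To put rough numbers on the example above, here is a minimal sketch in Python that uses only the summary figures I quoted. The SD of the non-Aveda customers is not something I have stated, so the value 0.3 for that group is an assumption made purely for illustration, and the variable names are my own.

```python
import math

# Summary figures from the example above
n_all, mean_all = 630, 2.5   # all hair salons
n_sub, mean_sub = 30, 2.0    # Aveda only
sd_sub = 0.3                 # Aveda SD from the example
sd_rest = 0.3                # ASSUMPTION: SD of the non-Aveda customers

# Mean of everyone except Aveda, recovered from the two means
n_rest = n_all - n_sub
mean_rest = (n_all * mean_all - n_sub * mean_sub) / n_rest

# How far the overall mean moves when Aveda is included
shift = mean_all - mean_rest

# The same shift written as a fraction of the subgroup-vs-rest gap
shift_identity = (n_sub / n_all) * (mean_sub - mean_rest)

# Welch t statistic for Aveda vs. rest, from summary statistics only
welch_t = (mean_rest - mean_sub) / math.sqrt(
    sd_rest**2 / n_rest + sd_sub**2 / n_sub
)

print(f"mean without Aveda: {mean_rest:.3f}")
print(f"shift in overall mean: {shift:.3f}")
print(f"Welch t (Aveda vs. rest): {welch_t:.2f}")
```

Under these assumptions the mean without Aveda comes out at about 2.525, so including Aveda moves the overall mean by only about 0.025, while the Aveda vs. rest comparison gives a t statistic of roughly 9. The shift is exactly n_sub / n_all times the Aveda vs. rest gap, which is why the two comparisons can look so different.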
Can I simply conduct a t-test comparing all vs. all-but-subgroup? Would it be a paired or an unpaired t-test? (The observations are not independent, but they are also not measured twice at different time points in the same subject.)