
I am presenting my thesis in two weeks and am having some trouble with the statistical part; I would be very grateful if someone could help me out, since I don't really know what I am doing.

I started by analyzing my data with an F-test in order to choose the right t-test. I got the following results:

F-Test Two-Sample for Variances

                     Non believer   Believer
Mean                 28.375         38.59550562
Variance             305.6964286    161.6072523
Observations         8              89
df                   7              88
F                    1.891600929
P(F<=f) one-tail     0.080345387
F Critical one-tail  2.115471719

I understood from an instructional video on YouTube that if F critical > F, then we have equal variances. Since that is the situation here, I ran a t-test assuming equal variances. Did I misunderstand, or did I pick the right test? Thank you.
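For what it's worth, the F statistic and one-tailed p-value in the Excel output above can be reproduced from the reported summary statistics. The following is a minimal sketch in Python with SciPy (not something the question itself uses); the numbers are copied directly from the table.

```python
# Sketch: reproduce the question's F-test output from its summary statistics.
from scipy import stats

var_nonbeliever, n_nonbeliever = 305.6964286, 8
var_believer, n_believer = 161.6072523, 89

# Excel puts the larger sample variance in the numerator.
F = var_nonbeliever / var_believer
dfn, dfd = n_nonbeliever - 1, n_believer - 1  # 7 and 88

# One-tailed p-value: P(F >= observed) under equal population variances.
p_one_tail = stats.f.sf(F, dfn, dfd)

print(F)           # ~1.8916, matching the question
print(p_one_tail)  # ~0.0803, matching the question
```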

Nick Stauner
user46462

4 Answers


When your test statistic is below your critical value, this only means your evidence is insufficient to reject the null hypothesis; it does not mean that the null is true. You have very different sample variances from very unbalanced classes, one of which is pretty short on data. This affects the outcome of an F-test of the equality of variances. Furthermore, as Wikipedia says:

This F-test is known to be extremely sensitive to non-normality,[2][3] so Levene's test, Bartlett's test, or the Brown–Forsythe test are better tests for testing the equality of two variances. (However, all of these tests create experiment-wise type I error inflations when conducted as a test of the assumption of homoscedasticity prior to a test of effects.[4])
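The alternatives the quote names are all available in SciPy. Here is a hedged sketch on simulated stand-in data (the raw data aren't given in the question, so the groups below are made-up samples shaped like the reported ones: 8 vs. 89 observations, standard deviations near the reported sample values).

```python
# Sketch: Levene's, Brown-Forsythe, and Bartlett's tests via SciPy,
# on simulated stand-ins for the question's two groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nonbeliever = rng.normal(28.4, 17.5, size=8)   # sd ~ sqrt(305.7)
believer = rng.normal(38.6, 12.7, size=89)     # sd ~ sqrt(161.6)

print(stats.levene(nonbeliever, believer, center='mean'))    # Levene's test
print(stats.levene(nonbeliever, believer, center='median'))  # Brown-Forsythe
print(stats.bartlett(nonbeliever, believer))                 # Bartlett's test
```

Note that Bartlett's test shares the F-test's sensitivity to non-normality, which is why the median-centered (Brown-Forsythe) variant is often preferred.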


References
2. Box, G. E. P. (1953). Non-normality and tests on variances. Biometrika, 40(3/4), 318–335.
3. Markowski, C. A., & Markowski, E. P. (1990). Conditions for the effectiveness of a preliminary test of variance. The American Statistician, 44(4), 322–326.
4. Sawilowsky, S. (2002). Fermat, Schubert, Einstein, and Behrens–Fisher: The probable difference between two means when $σ_1^2 ≠ σ_2^2$. Journal of Modern Applied Statistical Methods, 1(2), 461–472. Retrieved from http://digitalcommons.wayne.edu/cgi/viewcontent.cgi?article=1022&context=coe_tbf.

Nick Stauner

First, go here and read the whole exchange.

Then consider that in orthodox hypothesis testing, we never accept a hypothesis, such as the hypothesis of equal variances; we only fail to reject it. Your situation is actually a good example of why this is so. Remember that in the single case, the p-value is best seen as a gradual measure of the probability of the data conditional on H0 being true.

In this specific situation, the p-value tells you that if you drew samples from two groups with the same parameter, only in about 8% of cases would the difference in the parameter be more extreme than in your situation. In other words, the value in question is quite different between these samples; however, your sample is too small to justify confidently rejecting the hypothesis that they were still drawn from populations where the parameter is identical. Had your sample been only a little larger, a difference this big would have justified rejecting the null hypothesis. As they say, surely God loves the p = 0.049 as much as the p = 0.051; and p = 0.08 is hardly evidence in favour of H0.

So while your test does not justify rejecting H0 at 95% confidence (or a 5% alpha rate), it is far from good positive evidence for it. In fact, the F-test is known to be not especially good at detecting exactly the deviations the t-test is vulnerable to in the first place! On the other hand, the t-test is known to be highly robust to deviations from equal variances; however, this holds only while sample sizes are equal. In your case they are highly unequal, so at a p-value of .08, indicating little confidence in H0 (regardless of any arbitrary alpha level), I would be somewhat concerned.

So if you want to make sure, you have two things ahead of you. First, visualise the sample distributions by inspecting Q-Q plots, histograms, and/or similar methods. Then perform a test more robust to differences in variances, such as Welch's t-test, and see whether its result disagrees with the equal-variance t-test. If the samples appear reasonably similar, and the two tests deliver similar results, you're good to go. If not, well, you've already got the robust test calculated, haven't you?
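The comparison suggested above is one line each in SciPy's `ttest_ind` (the `equal_var` flag switches between the pooled and the Welch test). A minimal sketch, on simulated stand-ins for the real data since the raw observations aren't posted:

```python
# Sketch: run the pooled (equal-variance) t-test and Welch's t-test
# side by side; the data are simulated stand-ins for the real sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
nonbeliever = rng.normal(28.4, 17.5, size=8)
believer = rng.normal(38.6, 12.7, size=89)

pooled = stats.ttest_ind(nonbeliever, believer, equal_var=True)
welch = stats.ttest_ind(nonbeliever, believer, equal_var=False)

print("pooled:", pooled)
print("Welch: ", welch)
# If the two p-values lead to the same conclusion, the equal-variance
# assumption is not driving the result; if they diverge, trust Welch.
```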

jona

First a little statistical theory, then the bottom-line answer to your questions.

The theory: the F-test compares variances, but not the variances that you might have thought. Rather, the F-test compares an estimate of the population variance that is calculated based on the group means to an estimate of the population variance that is calculated based on the within-group variation in scores. If the means differ by more than chance, the estimate of the population variance based on the means will be large.

The bottom line: the F-test you did compared the 2 means, and its results indicate that the difference between means is not statistically significant at the p = .05 level, one-tailed (since your p-value is .08, which is larger than .05). In short, you do not have reason to reject your null hypothesis of no difference in group means.

Concerning what you read about using t-tests to follow up on a significant F-test, there is no reason to do so here, for 2 reasons. The first is that you do not have a statistically significant F. The second is that you have only two groups, and the F- and t-tests are basically identical for 2 groups.
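The last point can be checked numerically: with exactly two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic, and the p-values coincide. A small sketch, on made-up data:

```python
# Sketch: for two groups, one-way ANOVA F equals the pooled t squared.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(28.4, 17.5, size=8)
b = rng.normal(38.6, 12.7, size=89)

F, p_f = stats.f_oneway(a, b)
t, p_t = stats.ttest_ind(a, b, equal_var=True)

assert np.isclose(F, t**2)   # F = t^2 for two groups
assert np.isclose(p_f, p_t)  # and the (two-sided) p-values agree
```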

Joel W.

If normality appears satisfied in a normal probability plot, I would still consider a nonparametric Levene's test, given the differing variances.
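One common reading of a "nonparametric Levene's test" is Levene's test applied to the rank-transformed data; that interpretation is an assumption here, not something the answer spells out. A minimal sketch, again on simulated stand-ins for the real data:

```python
# Sketch (assumed interpretation): Levene's test on the pooled ranks
# of the two groups; the data are simulated stand-ins.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
nonbeliever = rng.normal(28.4, 17.5, size=8)
believer = rng.normal(38.6, 12.7, size=89)

# Rank all observations together, then split the ranks back by group.
ranks = stats.rankdata(np.concatenate([nonbeliever, believer]))
r1, r2 = ranks[:len(nonbeliever)], ranks[len(nonbeliever):]

print(stats.levene(r1, r2, center='median'))
```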

Clarinetist