
I am trying to understand where the p-value of an F-test comparing two variances comes from. More specifically, the p-value reported by R's var.test function does not match the p-value returned by the pf function for the same F value and degrees of freedom.

For example, the p-value given here:

> d1 <- rnorm(300, sd=1)
> d2 <- rnorm(300, sd=1.2)
> var.test(d1, d2)

    F test to compare two variances

data:  d1 and d2
F = 0.78, num df = 299, denom df = 299, p-value = 0.03212
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.62 0.98
sample estimates:
ratio of variances 
              0.78 

Does not match this one:

> pf(0.78, 299, 299, lower.tail=F)
[1] 0.98

Could someone explain where the difference comes from?

Nick Stauner
twowo
  • This (why that test uses two-tailed p-values, rather than one-tailed, as in the F-test in ANOVA) is discussed in a number of posts on this site, including [this one](http://stats.stackexchange.com/questions/67543/why-do-we-use-a-one-tailed-test-f-test-in-analysis-of-variance-anova/73993#73993), and also in the extensive comments under [this one](http://stats.stackexchange.com/questions/55550/how-do-i-interpret-the-results-from-the-f-test-in-excel/55553#55553) – Glen_b Mar 23 '14 at 21:25

1 Answer


`pf(,lower.tail=F)` gives the one-tailed upper probability $P[X > x]$, whereas `var.test` defaults to `alternative='two.sided'`. Hence:

set.seed(2);var.test(rnorm(300),rnorm(300,0,1.2)): $F_{(299,299)}=.8148,p=.07706$.

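`2*(1-pf(.8148,299,299,lower.tail=F))`: $p=.07710$. Close enough, right? Because $F<1$ here, the lower tail is the smaller one, so I subtracted the upper-tail probability from 1 to get the lower tail, then multiplied by 2 to get the two-tailed value.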

If you want an even closer result, you can feed in the unrounded statistic to reduce rounding error:

set.seed(2);2*(1-pf( var.test(rnorm(300),rnorm(300,0,1.2))$statistic ,299,299,lower.tail=F))

$p=.07705506$. Even more digits than the output from var.test! Otherwise identical.
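The general rule behind the calculation above can be written as $p = 2\min\{P[X \le f],\, P[X > f]\}$, which covers both $F<1$ and $F>1$. Below is a minimal sketch of that rule; `two_sided_f_p` is a hypothetical helper name introduced here for illustration, not a base R function, and the sketch assumes the default `alternative='two.sided'` and `ratio=1`. It also checks that swapping the two samples (so the larger variance is in the numerator) leaves the p-value unchanged.

```r
# Hypothetical helper (not part of base R): two-sided p-value for an
# F statistic, taken as twice the smaller of the two tail probabilities.
two_sided_f_p <- function(f, df1, df2) {
  lower <- pf(f, df1, df2)                      # P[X <= f]
  upper <- pf(f, df1, df2, lower.tail = FALSE)  # P[X >  f]
  2 * min(lower, upper)
}

set.seed(2)
x <- rnorm(300)
y <- rnorm(300, 0, 1.2)
vt <- var.test(x, y)

# Recompute the p-value from the unrounded statistic and the sample sizes.
p_manual <- two_sided_f_p(unname(vt$statistic), length(x) - 1, length(y) - 1)
stopifnot(isTRUE(all.equal(p_manual, vt$p.value)))

# Swapping the samples inverts F but should not change the two-sided p-value.
stopifnot(isTRUE(all.equal(var.test(y, x)$p.value, vt$p.value)))
```

Taking the minimum of the two tails avoids having to decide by hand which tail to double, which is where the mismatch in the question came from.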

Nick Stauner
  • Thanks for your reply, but the p-value in var.test is 0.03212. In what way is it close to 0.077? – twowo Mar 23 '14 at 19:43
  • You randomly generated your data and didn't provide the seed, so I randomly generated some other data. `2*(1-pf(.78,299,299,lower.tail=F))` $=.03206452$. – Nick Stauner Mar 23 '14 at 19:54
  • Isn't the F to compare variances traditionally computed with the larger variance in the numerator, so F is always greater than 1.0? It won't change the answer or conclusion, but makes it easier to follow. – Harvey Motulsky Mar 23 '14 at 21:05
  • It's not more precise. It's only printing more digits. I expect both results to be identical (at least to floating point precision). – Roland Mar 24 '14 at 08:25
  • Precise in the sense of "exactness and accuracy of expression or detail". I'm not saying the numbers are meaningfully different...Eh. Might as well change the word anyway. – Nick Stauner Mar 24 '14 at 08:29
  • Setting `lower.tail=F` and then subtracting the result from `1` is the same as not setting `lower.tail=F`, i.e. `2*(1-pf(.8148,299,299,lower.tail=F))` is the same as `2*pf(.8148,299,299)`. – Cm7F7Bb Jan 06 '17 at 13:49