I'm confused about the way chisq.test() is handling a very simple table from an exercise where I asked students to "sample" a bag of M&Ms and compare their scoop to the pooled samples from their lab mates, to wit:
> mm1<-data.frame(
+ color = c("red", "orange", "yellow", "green", "blue", "brown"),
+ mySample = c(9,10,9,15,13,7),
+ myTable = c(26,28,34,44,44,18))
> chisq.test(mm1$mySample,mm1$myTable)
Pearson's Chi-squared test
data: mm1$mySample and mm1$myTable
X-squared = 18, df = 16, p-value = 0.3239
Warning message:
In chisq.test(mm1$mySample, mm1$myTable) :
Chi-squared approximation may be incorrect
> chisq.test(mm1[,2:3])
Pearson's Chi-squared test
data: mm1[, 2:3]
X-squared = 0.67269, df = 5, p-value = 0.9844
The second version is what I'd expect (5 d.f.), but why is the first one failing, and where is it coming up with 16 d.f.? I don't think any of my expected values should be less than 5 for this set, so I don't think that warning reflects the small-cell issue that's discussed here. is.atomic() and is.numeric() on mm1$mySample and mm1$myTable all evaluate to TRUE, so what's the difference between passing them as x and y and passing them as a matrix?
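One way to probe where the 16 d.f. might come from is to look at what the two-vector call is actually testing. This is a minimal sketch, assuming (as the chisq.test() help page describes) that when x and y are both vectors they are coerced to factors and cross-tabulated, rather than being read as two columns of counts. The names xtab, counts, df_vectors, df_matrix and res are only illustrative.

```r
mm1 <- data.frame(
  color    = c("red", "orange", "yellow", "green", "blue", "brown"),
  mySample = c(9, 10, 9, 15, 13, 7),
  myTable  = c(26, 28, 34, 44, 44, 18)
)

# Two-vector form: the values are treated as category labels and
# cross-tabulated. 9 appears twice in mySample and 44 appears twice in
# myTable, so each vector has only 5 distinct levels.
xtab <- table(mm1$mySample, mm1$myTable)
dim(xtab)

# Degrees of freedom for an r x c contingency table: (r - 1) * (c - 1)
df_vectors <- (nrow(xtab) - 1) * (ncol(xtab) - 1)

# Matrix form: the 6 x 2 block of numbers is used directly as counts.
counts <- as.matrix(mm1[, 2:3])
df_matrix <- (nrow(counts) - 1) * (ncol(counts) - 1)

# Running the test on the cross-tabulation should match the first call.
# The warning is suppressed here only to keep the output short.
res <- suppressWarnings(chisq.test(xtab))
c(df_vectors = df_vectors, df_matrix = df_matrix)
res$statistic
res$parameter
```

If that reading is right, the first call is testing a 5 x 5 table holding just 6 observations (one per colour), which would also account for the approximation warning: every cell of that table has an expected count far below 5, even though the counts themselves are not small.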