I'm doing some analysis on a device which sizes potatoes. To assess the accuracy of this device, I'm comparing (1) the data for a box of potatoes, as sized by this device, and (2) the data for the same box, as sized by hand. In both cases, the raw data consists of an array of numbers. Both data sets appear to be (approximately) normally distributed.

So far, my analysis has consisted of:

  • Dividing the data into bins, and carrying out a chi-squared test.
  • Carrying out a t-test to compare the means of the two data sets.
  • Carrying out a Wilcoxon signed rank test, which I understand to be a non-parametric equivalent to the t-test.

For completeness, I'd like to carry out a parametric equivalent to the chi-squared test. Is there such a thing? I've half an idea that a one-way ANOVA might be what I'm looking for, but I'm really not sure.

Tom Hosker
  • Why do you want a parametric test? Also, it sounds like what you're after is an [equivalence test](https://en.wikipedia.org/wiki/Equivalence_test) where a rejection of the null hypothesis gives a conclusion that the two cases are similar. That's not what the chi-squared test would give you. – abstrusiosity Nov 11 '20 at 17:18
  • @abstrusiosity I want a parametric test for two not very compelling reasons: (1) a compulsive desire for symmetry - the t-test and the Wilcoxon test form a neat pair, so why should the chi-squared test be uncoupled? - and (2) I'm curious to find out whether a parametric equivalent to the chi-squared test exists. – Tom Hosker Nov 11 '20 at 17:29
  • @abstrusiosity Wouldn't the chi-squared, with its "observed" and "expected" columns, tell us how well a given measuring device was performing versus a control? If $X^2$ is huge, you have to reject the null hypothesis, which in this case means concluding that the device isn't taking sufficiently accurate measurements. – Tom Hosker Nov 11 '20 at 17:31
  • Rejecting the null means you are confident that they're different, but failing to reject the null does not mean you're confident that they're the same. It sounds like you want to be able to assert that the device *is* taking accurate measurements. To do that, you need a procedure that accounts for what you mean by "sufficiently accurate" and looks for evidence that the difference between methods is within that bound. – abstrusiosity Nov 11 '20 at 17:44
  • The usual chi-squared test *is* parametric: see my account at https://stats.stackexchange.com/a/17148/919 for the details. – whuber Nov 11 '20 at 17:49
  • @abstrusiosity It sounds like you've hit the nail on the head! What would such a procedure look like? Are there parametric and non-parametric versions of it? – Tom Hosker Nov 11 '20 at 18:07
  • @abstrusiosity I'd still be really, really grateful if you (or anyone!) could point me towards a procedure that would test whether the device is taking sufficiently accurate measurements - within stated bounds. That was a genuinely insightful comment. – Tom Hosker Nov 12 '20 at 08:55
  • It's not really my area so I don't have any specific suggestions. The wikipedia article I linked above cites [this paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3019319) that seems to do a good job covering the issues involved in equivalence testing. – abstrusiosity Nov 12 '20 at 13:18
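A later note, not from the thread: one concrete version of the procedure abstrusiosity describes is the TOST (two one-sided tests) equivalence procedure. Below is a minimal sketch in base R; the vectors `hand` and `machine` and the margin `delta` are illustrative assumptions, not data from the question.

```r
# Hypothetical paired TOST sketch: conclude "equivalent" only if the mean
# paired difference is shown to lie within (-delta, delta).
set.seed(42)
hand    <- rnorm(50, mean = 60, sd = 5)          # assumed hand-sized data
machine <- hand + rnorm(50, mean = 0.2, sd = 1)  # assumed device data
delta   <- 2                                     # assumed "accurate enough" margin

diffs   <- hand - machine
p_lower <- t.test(diffs, mu = -delta, alternative = "greater")$p.value
p_upper <- t.test(diffs, mu =  delta, alternative = "less")$p.value
p_tost  <- max(p_lower, p_upper)  # equivalence claimed if p_tost < alpha
p_tost
```

Both one-sided nulls must be rejected, which is why the reported p-value is the larger of the two.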

1 Answer

Poisson regression.

Here is an example of the kind of table you may be describing.

         category
method      1   2   3   4
  hand    101 210 590  99
  machine  97 401 403  99

A Poisson regression with additive effects should yield the same expected cell counts as the chi-squared procedure.

Here is how we would fit the model and compute the expected cell counts.

# d: the raw data, one row per potato, with columns `method` and `category`
tabl = xtabs(~ method + category, data = d)
model_data = as.data.frame(tabl)

model = glm(Freq ~ method + factor(category), data = model_data, family = poisson)

model_data$expec = predict(model, type = 'response')
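As a sanity check (my addition, not part of the original answer), we can rebuild the table above from its counts and confirm that the additive Poisson model's fitted values equal the expected counts that `chisq.test` computes:

```r
# Rebuild the 2x4 table from the answer and compare the Poisson model's
# fitted values with chisq.test's expected counts
# (row total * column total / grand total).
tabl <- as.table(matrix(c(101, 210, 590, 99,
                           97, 401, 403, 99),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(method = c("hand", "machine"),
                                        category = as.character(1:4))))
model_data <- as.data.frame(tabl)
model <- glm(Freq ~ method + category, data = model_data, family = poisson)

isTRUE(all.equal(unname(predict(model, type = "response")),
                 c(chisq.test(tabl)$expected),
                 tolerance = 1e-6))
# [1] TRUE
```

Both orderings unroll the table with the method index varying fastest, so the two vectors line up element by element.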

And here is the chi-squared test statistic, computed by hand from those expected counts

library(tidyverse)

model_data %>% 
  mutate(X = (Freq-expec)^2/expec) %>% 
  summarise(test_stat = sum(X))

#> 95.00335

This test has 3 degrees of freedom ($(2-1)\times(4-1)$ for a 2×4 table), and I don't need to look up the p-value to tell you this is significant, since the test statistic is very far from the mean of the chi-squared distribution (which equals its degrees of freedom, 3).

Here is the chi-squared test itself. Note that the test statistic matches

chisq.test(tabl)


    Pearson's Chi-squared test

data:  tabl
X-squared = 95.003, df = 3, p-value < 2.2e-16

So here, I used the predictions from the model to do the test. Another way to do this -- which I would also count as a parametric test -- is a deviance goodness-of-fit test for the Poisson model. The proof of why the deviance goodness-of-fit test is similar to the chi-squared test escapes me, but direct computation shows that the results are not too different.

The deviance goodness-of-fit test statistic is obtained via

model$deviance
#> 96.227

which is close enough. You can simulate some more examples to check that the deviance and the chi-squared test result in similar test statistics.
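For instance, a quick simulation (my sketch; the setup mirrors the 2×4 table above) could look like this:

```r
# Draw random 2x4 tables of counts and compare Pearson's X^2 (the sum of
# squared Pearson residuals) with the deviance from the additive Poisson
# model. With large expected counts, the two statistics track closely.
set.seed(1)
for (i in 1:5) {
  counts   <- rpois(8, lambda = 200)
  method   <- gl(2, 1, 8)  # row index, varying fastest
  category <- gl(4, 2)     # column index
  fit <- glm(counts ~ method + category, family = poisson)
  pearson <- sum(residuals(fit, type = "pearson")^2)
  print(round(c(pearson = pearson, deviance = fit$deviance), 3))
}
```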

EDIT:

It turns out the chi-squared test is an approximation to the likelihood ratio test for these models, which is closely related to the deviance goodness-of-fit test. The approximation comes from a Taylor series expansion of some terms, which explains why the deviance GOF test statistic is slightly larger than the chi-squared statistic here.
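Sketching that expansion (my addition): with observed counts $O_i$ and expected counts $E_i$, the deviance for the Poisson model is $G^2 = 2\sum_i O_i \log(O_i/E_i)$. Writing $O_i = E_i(1+\delta_i)$ with $\delta_i = (O_i - E_i)/E_i$ and using $\log(1+\delta) \approx \delta - \delta^2/2$ gives

$$G^2 \approx 2\sum_i E_i(1+\delta_i)\left(\delta_i - \frac{\delta_i^2}{2}\right) = 2\sum_i E_i\delta_i + \sum_i E_i\delta_i^2 + O(\delta^3) \approx \sum_i \frac{(O_i - E_i)^2}{E_i} = X^2,$$

since $\sum_i E_i\delta_i = \sum_i (O_i - E_i) = 0$ when the fitted model reproduces the total count.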

Demetri Pananos