In Oehlert (p. 218) the following algorithm for computing Tukey's one degree of freedom test for non-additivity is suggested:
- Fit a preliminary model; this will usually be an additive model.
- Get the predicted values from the preliminary model; square them and divide by twice the mean of the data.
- Fit the data with a model that includes the preliminary model and the rescaled squared predicted values as explanatory variables.
- The improvement sum of squares going from the preliminary model to the model including the rescaled squared predicted values is the single degree of freedom sum of squares for the Tukey model.
- Test for the significance of a Tukey-type interaction by dividing the Tukey sum of squares by the error mean square from the model that includes the rescaled squared predicted values.
- The coefficient for the rescaled squared predicted values is $\hat\eta$, an estimate of $\eta$. If a Tukey interaction is present, transform the data to the power $1 - \hat\eta$ to remove it. (A code sketch of these steps follows below.)
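To make the recipe concrete, here is a minimal sketch of those steps in Python with statsmodels. Everything in it is an assumption of mine rather than part of Oehlert's text: the 4×5 layout with one observation per cell, the fake additive data, and the column names.

```python
# Illustrative sketch of the algorithm above; the layout, effect sizes, and
# noise level are arbitrary, and the data are purely additive by construction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
a, b = 4, 5
rows = np.repeat(np.arange(a), b)
cols = np.tile(np.arange(b), a)
y = (10.0 + rng.normal(size=a)[rows] + rng.normal(size=b)[cols]
     + rng.normal(scale=0.5, size=a * b))
df = pd.DataFrame({"row": rows, "col": cols, "y": y})

# Steps 1-2: fit the additive preliminary model, then rescale its squared fits.
prelim = smf.ols("y ~ C(row) + C(col)", data=df).fit()
df["z"] = prelim.fittedvalues ** 2 / (2 * df["y"].mean())

# Steps 3-6: refit with z added; the 1-df improvement is the Tukey sum of
# squares, and the t-test on z is equivalent to the corresponding F-test.
tukey = smf.ols("y ~ C(row) + C(col) + z", data=df).fit()
print(tukey.tvalues["z"], tukey.pvalues["z"])

# Last step: if the interaction is judged real, 1 - eta_hat suggests a power
# transformation of the response.
eta_hat = tukey.params["z"]
print("suggested power:", 1 - eta_hat)
```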
Suppose the original model is $y = X\beta + e$. If I understand things correctly, the test can be constructed as follows: get $\hat y = X\hat\beta$ from this model and compute $Z = \hat y^2 / (2\bar y)$, squaring elementwise. Then fit the model $y = X\beta + Z\eta + u$ and test the null $\eta = 0$ against $\eta \neq 0$ with an F-test, which is equivalent to the usual t-test on this parameter.
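In that notation the whole construction is just two least-squares fits. The helper below is only an illustration of how I read it (the function name and the full-column-rank assumption on $X$ are mine); it computes the 1-df improvement sum of squares over the error mean square, so the returned $F$ equals $t^2$ for the coefficient on $Z$.

```python
# Sketch of the test in the notation y = X beta + Z eta + u, assuming X has
# full column rank (intercept plus dummy columns with a reference level dropped).
import numpy as np

def tukey_one_df_test(X, y):
    """Return (eta_hat, F, df_error) for Tukey's 1-df non-additivity test."""
    n = len(y)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat                      # fitted values from y = X beta + e
    z = y_hat ** 2 / (2 * y.mean())           # Z = y_hat^2 / (2 * ybar)
    Xz = np.column_stack([X, z])              # augmented design [X, Z]
    coef, *_ = np.linalg.lstsq(Xz, y, rcond=None)
    rss0 = np.sum((y - y_hat) ** 2)           # restricted (additive) model
    rss1 = np.sum((y - Xz @ coef) ** 2)       # augmented model
    df_error = n - Xz.shape[1]
    F = (rss0 - rss1) / (rss1 / df_error)     # 1-df improvement SS / error MS
    return coef[-1], F, df_error
```

The p-value would then come from comparing $F$ with an $F(1, \mathrm{df}_{\text{error}})$ distribution, matching the two-sided t-test on the $Z$ coefficient.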
My question is this: wouldn't the fact that $Z$ is estimated from the data distort the size of this test? Specifically, I would think the variance is underestimated. And if so, are there any studies on the size of this test? I would also be interested in a convincing argument that we don't care about the size of this test, something an applied researcher I spoke to recently claimed.
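For what it's worth, the size question could be probed directly with a small simulation under the null (a purely additive mean, so $\eta = 0$ in truth). The sketch below just counts rejections at the nominal 5% level; the layout, error distribution, and number of replications are arbitrary assumptions on my part, and I'm only showing the setup one could run, not claiming a result.

```python
# Rough Monte Carlo check of the test's size under an additive truth.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a, b, n_sim, alpha = 4, 5, 5000, 0.05

rows = np.repeat(np.arange(a), b)
cols = np.tile(np.arange(b), a)
# Fixed design: intercept plus row/column dummies with one level dropped.
X = np.column_stack(
    [np.ones(a * b)]
    + [(rows == i).astype(float) for i in range(1, a)]
    + [(cols == j).astype(float) for j in range(1, b)]
)
mu = 10.0 + rng.normal(size=a)[rows] + rng.normal(size=b)[cols]  # additive mean

rejections = 0
for _ in range(n_sim):
    y = mu + rng.normal(scale=0.5, size=a * b)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat
    z = y_hat ** 2 / (2 * y.mean())
    Xz = np.column_stack([X, z])
    coef, *_ = np.linalg.lstsq(Xz, y, rcond=None)
    rss0 = np.sum((y - y_hat) ** 2)
    rss1 = np.sum((y - Xz @ coef) ** 2)
    df_err = a * b - Xz.shape[1]
    F = (rss0 - rss1) / (rss1 / df_err)
    rejections += F > stats.f.ppf(1 - alpha, 1, df_err)

print("empirical rejection rate:", rejections / n_sim)  # compare with 0.05
```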