ANCOVA for 2X2 crossover with baseline measurements - validity of assumptions

Question

I'm trying to learn how to run an ANCOVA for a 2x2 crossover study with baseline measurements. I have followed the analysis in Mehrotra (2014), "A recommended analysis for 2 x 2 crossover trials with baseline measurements" (the article is paywalled; the link is to the full text on ResearchGate).

I've recreated the analysis in Table V (Method IV) with the following Python code:

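import io
import pandas
from statsmodels.formula.api import ols
# Passing a raw JSON string to read_json is deprecated in recent pandas, so wrap it in StringIO.
# dtype keeps Sequence as strings so the comparisons with '1' and '2' further down work.
x = pandas.read_json(io.StringIO('{"X1":{"0":3.61,"1":4.09,"2":4.48,"3":4.26,"4":3.94,"5":6.93,"6":6.48,"7":4.26,"8":5.26,"9":7.0,"10":6.08,"11":5.62},"X2":{"0":3.61,"1":3.64,"2":4.62,"3":4.67,"4":3.61,"5":6.67,"6":6.13,"7":3.61,"8":4.38,"9":6.48,"10":5.78,"11":6.0},"Y1":{"0":4.16,"1":4.68,"2":4.63,"3":5.15,"4":3.61,"5":7.22,"6":7.53,"7":6.42,"8":6.41,"9":8.25,"10":6.69,"11":6.47},"Y2":{"0":3.61,"1":5.42,"2":5.27,"3":4.36,"4":4.26,"5":6.97,"6":7.8,"7":4.97,"8":4.58,"9":7.46,"10":5.25,"11":6.11},"Sequence":{"0":"1","1":"1","2":"1","3":"1","4":"1","5":"1","6":"1","7":"2","8":"2","9":"2","10":"2","11":"2"}}'), dtype={'Sequence': str})
# Period differences (period 1 minus period 2) of the responses Y and the baselines X.
x['inter'] = x.Y1 - x.Y2
x['base'] = x.X1 - x.X2
# ANCOVA: response difference on baseline difference, with sequence as the factor.
print(ols('inter ~ base + C(Sequence)', x).fit().summary())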

This gives the same p-value as the paper, 0.013. However, I find the result odd. AFAIK, ANCOVA assumes that the covariate and the factor do not interact, i.e. that the regression slope is the same in every group. Here they do interact: the slope of inter ~ base is positive in one sequence and negative in the other. You can visualize this with:

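import matplotlib.pyplot as plt
ax = plt.subplot(111)
# One scatter per sequence, drawn on the same axes so the two slopes can be compared.
x[x.Sequence == '1'].plot(x='base', y='inter', kind='scatter', c='blue', label='Sequence 1', ax=ax)
x[x.Sequence == '2'].plot(x='base', y='inter', kind='scatter', c='red', label='Sequence 2', ax=ax)
plt.show()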
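To put numbers on what the plot shows, the slope can also be fitted separately within each sequence. This is a minimal sketch in plain Python so that it runs without the data frame above; the values are copied from the JSON, and `ols_slope` is a hypothetical helper I wrote for illustration, not something from the paper or from statsmodels.

```python
# Data copied from the JSON above: baselines (X) and responses (Y) per period.
X1 = [3.61, 4.09, 4.48, 4.26, 3.94, 6.93, 6.48, 4.26, 5.26, 7.0, 6.08, 5.62]
X2 = [3.61, 3.64, 4.62, 4.67, 3.61, 6.67, 6.13, 3.61, 4.38, 6.48, 5.78, 6.0]
Y1 = [4.16, 4.68, 4.63, 5.15, 3.61, 7.22, 7.53, 6.42, 6.41, 8.25, 6.69, 6.47]
Y2 = [3.61, 5.42, 5.27, 4.36, 4.26, 6.97, 7.8, 4.97, 4.58, 7.46, 5.25, 6.11]
sequence = ["1"] * 7 + ["2"] * 5


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (hypothetical helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return sxy / sxx


slopes = {}
for seq in ("1", "2"):
    rows = [i for i, s in enumerate(sequence) if s == seq]
    base = [X1[i] - X2[i] for i in rows]    # baseline difference, period 1 - period 2
    inter = [Y1[i] - Y2[i] for i in rows]   # response difference, period 1 - period 2
    slopes[seq] = ols_slope(base, inter)
    print(f"Sequence {seq}: slope of inter ~ base = {slopes[seq]:.3f}")
```

The two printed slopes should have opposite signs, which is the pattern I am asking about.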

This interaction is even statistically significant, as can be seen by running:

# base * C(Sequence) expands to both main effects plus the base:C(Sequence) interaction.
print(ols('inter ~ base * C(Sequence)', x).fit().summary())
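Written out, the two fits correspond to the models below. The notation is mine rather than the paper's: d_i and b_i are the response and baseline differences computed in the code, and s_i is an indicator that equals 1 for sequence 2, assuming the default treatment coding of C(Sequence).

```latex
\begin{aligned}
d_i &= Y_{1i} - Y_{2i}, \qquad b_i = X_{1i} - X_{2i},\\
\text{ANCOVA:}\quad d_i &= \mu + \tau\, s_i + \beta\, b_i + \varepsilon_i,\\
\text{with interaction:}\quad d_i &= \mu + \tau\, s_i + \beta\, b_i + \gamma\, s_i b_i + \varepsilon_i .
\end{aligned}
```

The equal-slopes assumption I am worried about is \gamma = 0, and the second fit is the one that estimates \gamma.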

This is because the model is of the form (Y2-Y1)|(X2-X1), i.e. it uses differences between periods. There would be no violation if the model were instead (Y_t1-Y_t2)|(X_t1-X_t2), where t1 and t2 are the two treatments, but then I don't see the point of including C(Sequence) in the model.

To summarize my question: can you please explain why the analysis advocated in this paper, an ANCOVA of (Y2-Y1)|(X2-X1), makes sense and does not violate the ANCOVA assumptions?
