Is it OK to repeat MANCOVA until all interactions are significant?

Question

To shed light on the relation between three related scales describing psychopathology (one anxiety score and two shame scores) and demographic data (gender, place of residence, age, etc.) collected by convenience sampling, I conducted a MANCOVA. Only main effects and second-order interactions were included in the model (limited to second order because of the large number of independent variables).

I know that when an interaction is significant, the corresponding main effects (if significant) are marginal and should either not be reported or be reported with caution.

My question is: if some interactions are not significant, is it good practice to remove them from the model and repeat the MANCOVA until all remaining interactions are significant (so that the clearest picture of effects and interactions appears)?

If so, is there a reference for that practice?

Thank you in advance.

score 3 · Accepted Answer · edited Apr 13 '17 at 12:44

Peter Flom's answer to this question is also relevant in your case, especially its last point: if you hypothesize an interaction, it should remain in the model. Gelman and Hill argue that "people sometimes think that if a coefficient estimate is not significant, then it should be excluded from the model. We disagree. It is fine to have nonsignificant coefficients in a model, as long as they make sense" (Gelman and Hill, 2006: 42). That said, in a subsection titled "building regression models for prediction" (and prediction is the keyword here), they suggest that "if a predictor is not statistically significant and does not have the expected sign..., consider removing it from the model (that is, setting its coefficient to zero)" (Gelman and Hill, 2006: 69). But if you look at the answers to similar questions on Cross Validated (for example, here and here), you will see that the recommendation is to keep interactions (and predictors in general) that are not statistically significant in the model, especially if they are of substantive interest.
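To illustrate why significance alone is a weak criterion for keeping a term, here is a minimal sketch of how often at least one interaction would look significant when none of them has a real effect. It assumes the tests are independent and each is run at the 0.05 level, which is a simplification: interaction tests in a real MANCOVA are correlated. The function names are hypothetical, and the count of 15 corresponds to, for example, all two-way interactions among six predictors.

```python
import random


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_tests independent true-null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_any_survivor(n_tests: int, alpha: float = 0.05,
                          n_sims: int = 20000, seed: int = 0) -> float:
    """Monte Carlo check of the formula above.

    Assumption: under a true null hypothesis with a continuous test
    statistic, each p-value is uniform on [0, 1].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        p_values = [rng.random() for _ in range(n_tests)]
        # A "survivor" is an interaction that would be kept for being significant.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for k in (3, 10, 15):
        print(k, round(familywise_error_rate(k), 3),
              round(simulate_any_survivor(k), 3))
```

Under these assumptions, with 15 candidate interactions the chance that at least one passes the 0.05 threshold by luck alone is a little over one half, so a model in which "all remaining interactions are significant" is not evidence that those interactions are real.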

Although you did not ask about it, I think there is another issue. You say that the data were collected by convenience sampling, which is a type of nonprobability sampling. According to Groves et al., there are two benefits of using probability sampling (such as simple random sampling): "(1) Important types of sample statistics are unbiased. (2) We can estimate the sampling variance (standard error) from one realization of the sample design." (Groves et al., 2004: 381). Probability sampling selects units according to a known statistical rule; nonprobability sampling, and hence convenience sampling, does not. This does not necessarily mean that nonprobability samples are unrepresentative of the population, but it does mean that "nonprobability samples cannot depend upon statistical probability theory" (de Leeuw, Hox and Dillman, 2008: 9). So we would need more information about the joint distributions of the variables in the population, which is not usually available, and a model-based approach is recommended for making inferences about the population (de Leeuw, Hox and Dillman, 2008). I think you should take this into consideration in your analysis.
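The first of those two benefits can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population of scores is hypothetical, and the self-selection rule (people with higher scores are more likely to end up in the convenience sample) is made up for illustration, not taken from your study.

```python
import random
import statistics


def compare_sampling(n_pop: int = 20000, n_sample: int = 1000, seed: int = 1) -> dict:
    """Compare a simple random sample with a self-selected sample."""
    rng = random.Random(seed)

    # Hypothetical population of anxiety-like scores (mean 50, SD 10).
    population = [rng.gauss(50.0, 10.0) for _ in range(n_pop)]

    # Probability sampling: every unit has the same known inclusion probability.
    srs = rng.sample(population, n_sample)

    # Convenience sampling (assumed mechanism): the chance of taking part
    # grows with the score. Drawn with replacement for simplicity.
    weights = [max(score - 20.0, 1.0) for score in population]
    convenience = rng.choices(population, weights=weights, k=n_sample)

    return {
        "population_mean": statistics.mean(population),
        "srs_mean": statistics.mean(srs),
        "convenience_mean": statistics.mean(convenience),
    }


if __name__ == "__main__":
    for name, value in compare_sampling().items():
        print(f"{name}: {value:.2f}")
```

In this setup the simple random sample tracks the population mean, while the self-selected sample overshoots it, and nothing in the convenience sample itself reveals the size of that bias. That is why extra information about the population, or an explicit model of the selection, is needed.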

de Leeuw, E., Hox, J. J., & Dillman, D. A. (2008). The Cornerstones of Survey Research. In E. de Leeuw, J. J. Hox, & D. A. Dillman (Eds.), International Handbook of Survey Methodology (pp. 1–18). New York and London: Lawrence Erlbaum Associates.

Gelman, A., & Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.

Groves, R. M., Fowler, F. J., Jr., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2004). Survey Methodology. New Jersey: Wiley.

Thank you for your excellent answer. I think the last sentence of your answer, about a model-based approach, is the most essential conclusion, although I am not sure whether I could manage that with the collected data, since the factors and interactions that appear to be significant vary a lot with the assumed model, and all factors and interactions could (theoretically) find an explanation in the context of a theory... - Epaminondas, Nov 09 '16 at 21:00
