
I need to do some simple mean comparisons between groups (basic ANOVA F-tests) on data with missing values. I use the mice package in R for multiple imputation, but I can only pool results for the linear model coefficients, or the $R^2$.

Does anyone know how to pool the F-statistics from the linear model fits across the imputed data sets? Or, how can I compute the standard errors for the F-test?

Glen_b
Brian
  • Welcome to the site @Brian. Is this question *only* about how to get something done in R, or also about the statistical issues? If the former, it really belongs on [Stack Overflow](http://stackoverflow.com/), rather than here (that doesn't make it a bad question, though). *Please don't cross-post* (SE strongly discourages this); if it's better there, just say so, & after a bit, the moderators will migrate it for you. – gung - Reinstate Monica Aug 10 '12 at 16:14
  • Thanks. Well, it's kind of a bit of both. The statistical issue is "How to do significance tests for comparing sample means with multiple imputation? In particular, how do you compute the variance of the estimate (F-statistic)?". Now, I'm using R/MICE for my multiple imputation needs, so I thought someone would know of a function for it. Alternatively I'd be more than happy with a statistical explanation on how to do it so I can just write the function myself from scratch. – Brian Aug 10 '12 at 17:16
  • It sounds like we can keep it here for now then. But if it doesn't get a satisfactory answer here after a while, you can also ask the moderators to migrate it to SO for you. GL – gung - Reinstate Monica Aug 10 '12 at 22:23
  • This paper might answer it for you: http://www-personal.umich.edu/~teraghu/Raghunathan-Dong.pdf – Jeremy Miles Feb 28 '13 at 18:18
  • I think this is very appropriate here. I've had this same question myself. @JeremyMiles is referring to the only paper I found on the subject, and unfortunately I didn't find it as convenient as I was hoping it would be. Perhaps this is still a domain that requires research? – Patrick Coulombe Sep 04 '13 at 02:13
  • I don't see why you need to pool F-statistics. But if your concern is the standard error of your "pooled" statistics, maybe you want to look into a jackknife variance estimation procedure. – math_stat_enthusiast Dec 31 '13 at 20:01

1 Answer


A recent paper by van Ginkel & Kroonenberg works out the details of pooling F-tests and other ANOVA results. The paper is:

van Ginkel, J. R., & Kroonenberg, P. M. (2014). Analysis of Variance of Multiply Imputed Data. Multivariate Behavioral Research, 49(1), 78-91.

and van Ginkel's website (http://www.socialsciences.leiden.edu/educationandchildstudies/childandfamilystudies/organisation/staffcfs/van-ginkel.html) has SPSS macros with instruction files. As far as I know, their formulae have not yet been implemented in R.

@Brian, if you do write a function, please share!
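In the meantime, here is a minimal base-R sketch of one way to pool the F-statistics themselves. Note that this is not the van Ginkel & Kroonenberg procedure: it is the older D2 rule (Li, Meng, Raghunathan & Rubin, 1991), which combines the m test statistics directly. It treats each F value times its numerator degrees of freedom as an approximate chi-square statistic, so it ignores the residual degrees of freedom and should be read as a large-sample approximation. The function name `pool_f_d2` and the example F values are made up for illustration; they do not come from mice.

```r
# Pool F statistics from m imputed-data fits using the D2 rule.
# `pool_f_d2` is a hypothetical helper, not part of mice.
#   f_values: one F statistic per imputed data set
#   df1:      numerator degrees of freedom of the F test
pool_f_d2 <- function(f_values, df1) {
  m <- length(f_values)
  stopifnot(m >= 2, all(f_values >= 0), df1 >= 1)

  # Convert each F value to an approximate chi-square statistic.
  d <- f_values * df1

  # Relative increase in variance, estimated from the spread of sqrt(d).
  # var() divides by (m - 1), as the D2 formula requires.
  r <- (1 + 1 / m) * var(sqrt(d))

  # Pooled statistic. It can turn negative when the F values vary a lot;
  # pf() then simply returns a p-value of 1.
  d2 <- (mean(d) / df1 - (m + 1) / (m - 1) * r) / (1 + r)

  # Denominator degrees of freedom of the reference F distribution.
  # If all F values are identical, r is 0 and df2 is Inf.
  df2 <- df1^(-3 / m) * (m - 1) * (1 + 1 / r)^2

  p <- pf(d2, df1, df2, lower.tail = FALSE)
  c(D2 = d2, df1 = df1, df2 = df2, p.value = p)
}

# Usage with made-up F statistics from m = 5 imputations, 3 groups (df1 = 2).
f_values <- c(4.2, 3.8, 5.1, 4.6, 3.9)
print(pool_f_d2(f_values, df1 = 2))
```

D2 only needs the F values and df1, which makes it easy to apply to any per-imputation ANOVA output, but it is generally regarded as less precise than pooling based on the parameter estimates and their covariance matrices. I believe more recent versions of mice also provide `D1()`, `D2()` and `D3()` for comparing nested models, so check the package documentation before relying on a hand-rolled version.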

  • Can you provide some details about this article, and explain why you recommend it? – chl Jun 03 '14 at 10:38