
I have a question about SAS and R. For a research project I analyzed longitudinal data, first with SAS (PROC GLIMMIX) and then with R (glmer). The p-values differ between SAS and R. I expected the regression coefficients and standard errors to differ somewhat, but for some variables the p-values differ enough that a variable is significant in one package and not in the other.

My R model and SAS model are, respectively:

#R
m3.glmm <- glmer(y ~ timebefore + timeafter + x1 + x2 +...+ x11 +      
                     (1+timebefore+timeafter|id), 
                 data=data, family=binomial(link="logit"), nAGQ=3)

#SAS
proc glimmix data=data METHOD=QUAD(QPOINTS=3) NOCLPRINT ;
  class id x2 x3 x4 x5;
  model y(event='1')=timebefore timeafter x1 x2 x3 x4 x5 
        x6 x7  x8 x9 x10 x11 /solution CL link = logit dist = binary;
  random intercept timebefore timeafter/subject = id GCORR SOLUTION;
run;

E.g., the variable "x1" (defined as age) was significant in SAS (p = 0.04) but not in R (p = 0.1). The other variables were consistent: variables significant in SAS were also significant in R, and variables not significant in SAS were not significant in R.

Does anybody know about the differences?

gung - Reinstate Monica

2 Answers


This is from an answer I posted on Cross Validated:

Why does SAS PROC GLIMMIX give me VERY different random slopes than glmer (lme4) for a binomial glmm

Zhang et al. (2011), "On Fitting Generalized Linear Mixed-effects Models for Binary Responses using Different Statistical Packages", describe the problem:

Abstract:

The generalized linear mixed-effects model (GLMM) is a popular paradigm to extend models for cross-sectional data to a longitudinal setting. When applied to modeling binary responses, different software packages and even different procedures within a package may give quite different results. In this report, we describe the statistical approaches that underlie these different procedures and discuss their strengths and weaknesses when applied to fit correlated binary responses. We then illustrate these considerations by applying these procedures implemented in some popular software packages to simulated and real study data. Our simulation results indicate a lack of reliability for most of the procedures considered, which carries significant implications for applying such popular software packages in practice.

So, even if you specify the models in what you think is the exact same way, the results will likely not be the same. Like me, you might drive yourself crazy trying to make them match!

Nova

If you expect the standard errors to change, why would you not expect the p-values to change as well? And what do you mean by 'big differences'?

Others may be more helpful, but in my (limited) experience with random effects/mixed models I've noticed fairly unstable standard error estimates depending on the routines I use, for example quadrature versus simulation methods.

Are you sure SAS and R are trying to solve the model in the exact same way? How stable are your regression coefficients? Any number of details might differ, but I would suggest looking under the hood of GLIMMIX and glmer to see how they run their optimization routines. Is nAGQ=3 doing the same thing as METHOD=QUAD(QPOINTS=3)?
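One way to probe the quadrature question empirically is to refit the same model at several nAGQ settings and watch how the fixed-effect tests move. A minimal sketch, assuming lme4 is installed and using simulated data with hypothetical names (note: current lme4 versions restrict nAGQ > 1 to models with a single scalar random-effects term, so a check like this only works for a random-intercept model, not the full random-slope model in the question):

```r
library(lme4)

## Simulate a small binary longitudinal data set (all names hypothetical).
set.seed(1)
d <- data.frame(id = factor(rep(1:50, each = 6)),
                x1 = rnorm(300))
## Subject-level random intercepts plus one covariate effect.
d$y <- rbinom(300, 1, plogis(0.5 * d$x1 + rnorm(50)[as.integer(d$id)]))

## Refit the same random-intercept model at several quadrature settings
## and compare the fixed-effect tables (estimates, SEs, z-tests).
fits <- lapply(c(1, 3, 9, 25), function(q)
  glmer(y ~ x1 + (1 | id), data = d, family = binomial, nAGQ = q))
lapply(fits, function(f) summary(f)$coefficients)
```

If the estimates and standard errors drift noticeably as the number of quadrature points grows, the likelihood approximation (rather than the model specification) is a plausible source of the SAS/R discrepancy.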

Hope this is somewhat helpful!

  • Thank you for responding. I mean, there will be differences in p-values between these methods. But, for example, in SAS p = 0.04 for X2 while in R p = 0.10 for X2; the p-values of the other variables are similar, meaning variables significant in SAS are also significant in R, and variables not significant in SAS are also not significant in R. – Indigofera suffruticosa Jun 14 '14 at 02:04
  • To my knowledge, METHOD=QUAD(QPOINTS=3) in SAS is the same as nAGQ=3 in R. – Indigofera suffruticosa Jun 14 '14 at 02:10