
I have the following toy data:

x <- structure(c(2L, 2L, 3L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 
2L, 3L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 
3L, 2L, 2L, 2L, 3L, 2L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 3L, 
3L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("1", "2", "3"), class = "factor")

y <- structure(c(2L, 2L, 3L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
2L, 3L, 2L, 1L, 2L, 2L, 3L, 2L, 2L, 3L, 2L, 3L, 2L, 3L, 1L, 2L, 
2L, 3L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 3L, 3L, 2L, 2L, 2L, 
2L, 3L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 3L, 3L, 1L, 3L, 2L, 3L, 
3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 2L, 1L, 
3L, 2L, 2L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 1L, 3L, 
3L, 3L, 2L, 2L, 3L, 3L, 2L), .Label = c("1", "2", "3"), class = "factor")

z <- structure(c(1L, 1L, 3L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 3L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 
1L, 3L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 3L, 1L, 3L, 
3L, 1L, 1L, 1L, 3L, 1L, 3L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 3L, 
3L, 1L, 1L, 1L, 3L, 3L, 3L), .Label = c("1", "2", "3"), class = "factor")

To create z, I swapped levels 1 and 2 of x: every 1 in x is a 2 in z, and vice versa. When I fit an ordered logistic regression in R with polr (from the MASS package), I get the following coefficients:

library(MASS)  # provides polr()
f1 <- polr(y ~ x, Hess = TRUE)
f2 <- polr(y ~ z, Hess = TRUE)

coef(summary(f1))
       Value Std. Error  t value
x2  25.95727  0.3028808 85.70127
x3  30.21524  0.5463144 55.30742
1|2 24.02167  0.3480269 69.02246
2|3 27.77068  0.3432316 80.90944

coef(summary(f2))
         Value   Std. Error       t value
z2  -21.495979 6.530398e-10 -3.291680e+10
z3    4.257964 8.119540e-01  5.244095e+00
1|2  -1.935599 3.567345e-01 -5.425880e+00
2|3   1.813399 3.411874e-01  5.314964e+00

Something seems wrong. Why does relabeling the levels change the estimated standard errors so dramatically?
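
For readers who want to reproduce the relabeling, here is a minimal sketch of one way the swap could be done. It is an assumption about the procedure, not the code actually used: `x_small` and `z_small` are hypothetical names, and only the first six values of `x` are used as a stand-in for the full vector.

```r
# First six values of x from the question, used as a small stand-in.
x_small <- factor(c(2, 2, 3, 1, 2, 3), levels = 1:3)

# Assumed form of the swap: old level "2" is relabeled "1" and old "1" is relabeled "2".
z_small <- factor(x_small, levels = c("2", "1", "3"), labels = c("1", "2", "3"))

# Integer codes of the relabeled factor; compare with the first six values of z.
print(as.integer(z_small))

# Cross-tabulation showing which original level maps to which new label.
print(table(original = x_small, relabeled = z_small))

# Under R's default treatment contrasts the first level is the baseline,
# so the baseline group of z is the group that was labeled "2" in x.
print(contrasts(z_small))
```

The point of the sketch is that the swap changes which group serves as the reference level, so the coefficients reported for `x` and `z` are contrasts against different baselines.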

mdewey
teucer
  • What exactly do you mean by *"I have replaced the 1's in x with 2's in z and vice versa"*? You're right that relabeling the classes of a categorical predictor, as long as it is done consistently, shouldn't change any results, but it's not clear why you expect the model with `x` as the predictor and the model with `z` as the predictor to give the same results. – Macro Jul 13 '13 at 18:36
  • @Macro What I have done is basically z – teucer Jul 13 '13 at 20:22
  • In that case the fits are not different; the coefficients are just estimating different things. For example, in `f2`, the coefficient for `z3` estimates the difference between level `3` and level `1` of `z`. In model `f1`, this quantity is estimated by the difference of the other two coefficients, because you've swapped the labels, and you can see that `30.21-25.95=4.26`. Similarly, the values of the thresholds get moved around because the linear predictor has been moved around, but the model fit should be equivalent. – Macro Jul 15 '13 at 15:36
  • @Macro Ok that's clear. My question is more about the standard errors. – teucer Jul 15 '13 at 20:20
  • Well, you're estimating a different parameter so it's no surprise that it has a different standard error. – Macro Jul 25 '13 at 19:02
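
The reparametrization described in the comments can be written out as a short worked mapping. This is a sketch under stated assumptions: the symbols $\beta$, $\gamma$, $\zeta$ and $\zeta'$ are notation introduced here, not names from the thread. It uses the `polr` parametrization $\operatorname{logit} P(Y \le k) = \zeta_k - \eta$, with $\beta$ and $\zeta_k$ the coefficients and thresholds of `f1` (reference level `x = 1`), and $\gamma$ and $\zeta'_k$ those of `f2` (reference level `z = 1`, which is `x = 2`).

```latex
% Matching zeta_k - eta group by group between the two models:
\begin{align*}
\gamma_{z2} &= -\beta_{x2}, \\
\gamma_{z3} &= \beta_{x3} - \beta_{x2}, \\
\zeta'_{k}  &= \zeta_{k} - \beta_{x2}, \qquad k \in \{1|2,\ 2|3\}, \\
\operatorname{Var}(\hat\gamma_{z3}) &= \operatorname{Var}(\hat\beta_{x3})
  + \operatorname{Var}(\hat\beta_{x2})
  - 2\,\operatorname{Cov}(\hat\beta_{x2}, \hat\beta_{x3}).
\end{align*}
```

The last line shows why a standard error need not carry over: the standard error of `z3` depends on the covariance between the two `f1` coefficients, not only on their individual standard errors. With the reported values, $30.21524 - 25.95727 = 4.25797$ agrees with `z3`, and $24.02167 - 25.95727 = -1.93560$ and $27.77068 - 25.95727 = 1.81341$ agree with the `f2` thresholds. The reported `z2` ($-21.50$), however, does not equal $-25.96$, so that coefficient does not follow the mapping in the output shown.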
