I have tried to fit an ordinal logistic regression with one ordered categorical dependent variable and three categorical independent variables (N = 43097). While all coefficients are significant, I have doubts about whether the model meets the parallel regression assumption. The p-values for every variable and for the whole model in the Brant test are reported as exactly zero (as I understand it, they are supposed to be greater than 0.05), yet the output still displays "H0: Parallel Regression Assumption holds". I am confused. Does this model meet the parallel regression assumption?

library(MASS)
table(hh18_u_r$cat_ci_score) # Dependent variable

Extremely Vulnerable  Moderate Vulnerable    Pandemic Prepared 
              6143                16341                20613 

# Ordinal logistic regression
olr_2 <- polr(cat_ci_score ~ r1_gender + r2_merginalised + r9_religion, data = hh18_u_r, Hess=TRUE)
summary(olr_2)

Call:
polr(formula = cat_ci_score ~ r1_gender + r2_merginalised + r9_religion, 
  data = hh18_u_r, Hess = TRUE)

Coefficients:
                      Value Std. Error t value
r1_genderMale          0.3983    0.02607  15.278
r2_merginalisedOthers  0.6641    0.01953  34.014
r9_religionHinduism   -0.2432    0.03069  -7.926
r9_religionIslam      -0.5425    0.03727 -14.556

Intercepts:
                                       Value    Std. Error t value 
Extremely Vulnerable|Moderate Vulnerable  -1.5142   0.0368   -41.1598
Moderate Vulnerable|Pandemic Prepared      0.4170   0.0359    11.6260

Residual Deviance: 84438.43 
AIC: 84450.43 

## significance of coefficients and intercepts
summary_table_2 <- coef(summary(olr_2))
pval_2 <- pnorm(abs(summary_table_2[, "t value"]), lower.tail = FALSE)* 2
summary_table_2 <- cbind(summary_table_2, pval_2)
summary_table_2

                                            Value Std. Error    t value        pval_2
r1_genderMale                             0.3982719 0.02606904  15.277583  1.481954e-52
r2_merginalisedOthers                     0.6641311 0.01952501  34.014386 2.848250e-250
r9_religionHinduism                      -0.2432085 0.03068613  -7.925682  2.323144e-15
r9_religionIslam                         -0.5424992 0.03726868 -14.556436  6.908533e-48
Extremely Vulnerable|Moderate Vulnerable -1.5141502 0.03678710 -41.159819  0.000000e+00
Moderate Vulnerable|Pandemic Prepared     0.4169645 0.03586470  11.626042  3.382922e-31

#Test of parallel regression assumption
library(brant)
brant(olr_2) # Probability supposed to be more than 0.05 as I understand

---------------------------------------------------- 
Test for                X2      df  probability 
---------------------------------------------------- 
Omnibus                 168.91  4   0
r1_genderMale           12.99   1   0
r2_merginalisedOthers   41.18   1   0
r9_religionHinduism     86.16   1   0
r9_religionIslam        25.13   1   0
---------------------------------------------------- 

H0: Parallel Regression Assumption holds

# Similar test of parallel regression assumption using car package
library(car)
car::poTest(olr_2)
Tests for Proportional Odds
polr(formula = cat_ci_score ~ r1_gender + r2_merginalised + r9_religion, 
  data = hh18_u_r, Hess = TRUE)

                    b[polr] b[>Extremely Vulnerable] b[>Moderate Vulnerable] Chisquare df Pr(>Chisq)    
Overall                                                                            168.9  4    < 2e-16 ***
r1_genderMale           0.398                    0.305                   0.442      13.0  1    0.00031 ***
r2_merginalisedOthers   0.664                    0.513                   0.700      41.2  1    1.4e-10 ***
r9_religionHinduism    -0.243                   -0.662                  -0.147      86.2  1    < 2e-16 ***
r9_religionIslam       -0.542                   -0.822                  -0.504      25.1  1    5.4e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Could you please advise whether this model satisfies the parallel regression assumption? Thank you.

  • I believe this is just the function's standard way of stating the null hypothesis under investigation. It does not mean that $H_0$ holds given your data and the test statistic; it is just a reminder of what is actually being tested. – chl Oct 22 '20 at 06:45
  • @chl, thank you. So, should I reject the ordinal regression model? I am really confused. My dependent variable is a score with a fixed range between 0 and 10, later categorized into three ordered levels (0-3: extremely vulnerable, 4-7: moderately vulnerable, 8-10: pandemic prepared). Given that my ordinal regression does not meet the parallel regression assumption, what should I do? Any suggestions for improvement will be highly appreciated. – Biswajit Kar Oct 22 '20 at 07:04
  • I don't know the `brant` package, but if your p-value is < 0.05 (or whatever threshold you consider), then it means "reject the null". If $H_0$ is the [proportional odds assumption](https://stats.stackexchange.com/a/268825/930), then you're in trouble, and you may want to look for alternative ordered logit models. – chl Oct 22 '20 at 07:16

1 Answer

@chl is right. I added that line to the function output to remind people which hypothesis they are testing, because it is often not clear what the alternative ($H_A$) and the null hypothesis ($H_0$) are. So it just tells you what the null hypothesis is and says nothing about the actual result. p < 0.05 means that $H_0$ can be rejected.

So in your case the parallel regression assumption does not hold. In general, at a significance level of $\alpha = 0.05$: if the omnibus p-value is >= 0.05, $H_0$ is not rejected and the assumption is treated as holding; if it is < 0.05, the assumption does not hold.
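
To make the decision rule concrete, here is a minimal sketch in base R that recomputes the p-values from the X2 and df columns of the `brant()` output in the question. The numbers are copied from that output; the variable names `x2`, `df`, `p` and `alpha` are just illustrative. It suggests that the printed "0" is a rounded, very small p-value rather than an exact zero.

```r
# X2 statistics and degrees of freedom copied from the brant() output
x2 <- c(Omnibus               = 168.91,
        r1_genderMale         = 12.99,
        r2_merginalisedOthers = 41.18,
        r9_religionHinduism   = 86.16,
        r9_religionIslam      = 25.13)
df <- c(4, 1, 1, 1, 1)

# Upper-tail chi-square probability for each test
p <- pchisq(x2, df = df, lower.tail = FALSE)

# Reject H0 (parallel regression assumption) when p < alpha
alpha <- 0.05
data.frame(X2 = x2, df = df, p = signif(p, 2), reject_H0 = p < alpha)
```

The p-value for `r1_genderMale` should come out close to the 0.00031 reported by `car::poTest()` in the question, which is a useful cross-check that both functions report the same test.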

  • Thank you, sir, for your response. Could you please tell me about the alternative ordinal regression model in detail? How could I fit a proper model to understand the impact of independent factors over ordinal dependent? – Biswajit Kar Oct 22 '20 at 07:30
  • I do not know the alternative model very well. I just know that you can estimate a [generalized ordered logit model](https://stackoverflow.com/questions/47346321/generalized-ordered-logit-in-r-or-python), which I however never used myself. – Benjamin Schlegel Oct 22 '20 at 07:41
  • A few possibilities for alternative models are: the partial PO model (see Stata [gologit2](https://www3.nd.edu/~rwilliam/gologit2/); I don't know of a specific R package), as suggested above, or any IRT model that deals with polytomous items and allows for varying slopes (partial credit or rating scale with varying discrimination parameters). I provide some references in this [post](https://stats.stackexchange.com/a/31842/930). Frank Harrell also provides an extensive review of such models in his [RMS](https://hbiostat.org/doc/rms/4day.html) book or 4-day handout (see chap. 13). – chl Oct 22 '20 at 08:28
  • Thank you very much for answering. – Biswajit Kar Oct 22 '20 at 15:38
  • @BenjaminSchlegel I wanted a small clarification: I'm using the `brant` package and I see that, except for two terms (out of 20), the p-value is > 0.5. Is there a procedure or resource I can refer to in order to understand what I can do here? For instance, do I proceed with interpreting the ordinal regression because the omnibus p-value is > 0.05? Alternatively, what can I do if the test of parallel lines fails? – Pss Mar 05 '22 at 23:17
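
One informal way to see which terms drive a failed test is to fit a separate binary logit at each cut point of the outcome and compare the slopes, which is what the `b[>...]` columns of the `poTest()` output in the question display. The sketch below uses base R on simulated data, since the original `hh18_u_r` data are not available here; `toy`, `score`, `gender` and `fit_cut` are hypothetical names. The toy data are generated under proportional odds, so the two slopes should be similar.

```r
set.seed(42)
n <- 5000

# Simulated stand-in data (assumption): one binary predictor, and a
# three-level ordered outcome cut from a latent logistic variable
gender <- factor(sample(c("Female", "Male"), n, replace = TRUE))
latent <- 0.4 * (gender == "Male") + rlogis(n)
score  <- cut(latent, breaks = c(-Inf, -1.5, 0.4, Inf),
              labels = c("Low", "Mid", "High"), ordered_result = TRUE)
toy <- data.frame(score, gender)

# Binary logit for P(score > k-th level) at cut point k
fit_cut <- function(k) {
  glm(I(as.integer(score) > k) ~ gender, family = binomial, data = toy)
}
fits <- lapply(1:2, fit_cut)

# One column of coefficients per cut point; under proportional odds the
# slope row should be roughly constant across columns
slopes <- sapply(fits, coef)
colnames(slopes) <- c(">Low", ">Mid")
round(slopes, 3)
```

In the question's output the Hinduism slope moves from -0.662 to -0.147 across the two cut points, which is the kind of gap this comparison is meant to expose. If only a few terms differ, a partial proportional odds model that frees just those slopes is one option to look into.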