
I'm trying to fit a multinomial logistic regression model in R on a data set with 13 variables (a mix of continuous and categorical) and 33,000 observations, where the dependent variable has 4 categories. To do this properly, though, I need to test the following assumption:

"There needs to be a linear relationship between any continuous independent variable and the logit transformation of the dependent variable"

I'm not sure how best to carry this out. After a bit of research I found a possible solution: fit a simple logistic regression model for each individual category of the outcome variable, then look at the plot of binned residuals against the estimated probabilities. I'd appreciate further advice on whether this is a correct approach or whether there is a better way to do this. Thanks.
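
To make the idea concrete, here is a minimal sketch of the binned-residual check for one outcome category, using base R only. The data are simulated stand-ins, and the names `x`, `y`, `is_a`, and `n_bins` are hypothetical, not from my real data set. Note that a one-vs-rest binary fit is not the same model as the multinomial one, which is linear in the log-odds of each category against a reference category, so this is at best an approximate diagnostic.

```r
# Simulated stand-in data (assumption): one continuous predictor, 4-category outcome.
set.seed(1)
n <- 2000
x <- rnorm(n)
y <- factor(sample(c("a", "b", "c", "d"), n, replace = TRUE))

# One-vs-rest indicator for a single outcome category, then a binary logistic fit.
is_a <- as.integer(y == "a")
fit <- glm(is_a ~ x, family = binomial)

# Raw residuals on the probability scale.
p_hat <- fitted(fit)
res <- is_a - p_hat

# Split observations into roughly equal-sized bins ordered by fitted probability.
n_bins <- 20
bin <- cut(rank(p_hat, ties.method = "first"), breaks = n_bins, labels = FALSE)

binned <- data.frame(
  mean_p   = as.vector(tapply(p_hat, bin, mean)),
  mean_res = as.vector(tapply(res, bin, mean)),
  # Approximate +/- 2 standard error band for each bin's mean residual.
  band     = as.vector(2 * sqrt(tapply(p_hat * (1 - p_hat), bin, mean) /
                                tapply(res, bin, length)))
)

# Points that fall systematically outside the band suggest misfit.
plot(binned$mean_p, binned$mean_res,
     ylim = range(c(binned$mean_res, binned$band, -binned$band)),
     xlab = "Mean estimated probability", ylab = "Mean residual")
lines(binned$mean_p, binned$band, lty = 2)
lines(binned$mean_p, -binned$band, lty = 2)
abline(h = 0)
```

With several predictors, binning by the continuous predictor itself rather than by the fitted probability may show more directly which variable is responsible for any curvature.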

ColorStatistics
ChrisAirth
  • This is a very good question, and one that will serve this site well. I think there are several separate questions here: 1) Is there a linearity assumption in multinomial logistic regression? This thread says yes: https://stats.stackexchange.com/questions/136771/assumptions-behind-multinomial-logistic-regression, yet this other thread says no: https://stats.stackexchange.com/questions/30959/multinomial-logistic-regression-assumptions?rq=1, and both threads have high-reputation users weighing in. 2) If linearity is an assumption of multinomial logit, what exactly is it? – ColorStatistics Apr 08 '20 at 21:14
  • The assumption as quoted by the OP comes from binomial logistic regression. What is its counterpart, if any, in multinomial logistic regression? And finally there is the question of how it would be done in R. – ColorStatistics Apr 08 '20 at 21:14
  • @ColorStatistics: It's the same as with linear regression: if there is only one categorical predictor, the linearity assumption is void; otherwise there is certainly a linearity assumption. To test it in practical terms, try splines. – kjetil b halvorsen Apr 08 '20 at 22:28
  • @kjetilbhalvorsen: Thank you for clarifying that. It explains the difference between the two referenced threads: one assumes some of the predictors are continuous; the other does not. As I understand it, the linearity assumption in multinomial logistic regression is tested using a set of variables formed from the multinomial outcome variable. Neither response explains this, and I'm hoping someone who understands it better than I do can. – ColorStatistics Apr 08 '20 at 22:38
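
The spline suggestion from the comments could be sketched roughly as follows: fit the multinomial model once with the continuous predictor entered linearly and once with a natural cubic spline, then compare the two nested fits with a likelihood-ratio test. This is only an illustration under stated assumptions. The data are simulated with a deliberately curved effect, the names `dat`, `x`, `g`, and `y` are hypothetical, and the choice of `df = 3` for the spline is arbitrary. It uses `multinom()` from the nnet package and `ns()` from the splines package.

```r
library(nnet)     # multinom()
library(splines)  # ns()

# Simulated stand-in data (assumption): x has a curved effect on one of the logits.
set.seed(1)
n <- 2000
dat <- data.frame(
  x = rnorm(n),
  g = factor(sample(c("u", "v"), n, replace = TRUE))
)
eta <- cbind(0, 0.5 * dat$x, dat$x^2 - 1, -0.5 * dat$x)  # logits vs. category "a"
prob <- exp(eta) / rowSums(exp(eta))
dat$y <- factor(apply(prob, 1, function(p) sample(c("a", "b", "c", "d"), 1, prob = p)))

# Model 1: x enters linearly in every logit.
fit_linear <- multinom(y ~ x + g, data = dat, trace = FALSE, maxit = 500)

# Model 2: x enters through a natural cubic spline; the linear model is nested in it.
fit_spline <- multinom(y ~ ns(x, df = 3) + g, data = dat, trace = FALSE, maxit = 500)

# Likelihood-ratio test of the linear fit against the spline fit.
ll_lin <- logLik(fit_linear)
ll_spl <- logLik(fit_spline)
lr_stat <- as.numeric(2 * (ll_spl - ll_lin))
lr_df <- attr(ll_spl, "df") - attr(ll_lin, "df")
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

cat("LR statistic:", lr_stat, " df:", lr_df, " p-value:", p_value, "\n")
```

A small p-value suggests the spline terms improve the fit, i.e. evidence against linearity in `x` for at least one of the logits. With 33,000 observations even trivial departures may come out significant, so it is probably worth also plotting the fitted spline effect to judge whether the curvature matters in practice.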

0 Answers