
The CSV for the data frame is here. The column Mode lists the transportation modes available to the user for getting to the airport, Choice records the user's choice, and all the other columns are variables affecting that choice. I am posting some code below. The mFormula function just separates the variables into three parts divided by '|', i.e.

alternative-specific variables with generic coefficients | individual-specific variables with alternative-specific coefficients | alternative-specific variables with alternative-specific coefficients

I tried two formulas with the mFormula function. One of them (f1) outputs coefficients but doesn't model the variables in the utility function as it should. The other (f), which is the specification I actually intend, does not output coefficients.
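Whether a column can go in the first part depends on whether it differs between alternatives within a single choice situation: a variable that is constant across the alternatives cancels out of the utility differences, so it cannot be given a generic coefficient. The following is a minimal base-R sketch of that check on toy long-format data. The data frame `toy`, the choice-situation id `chid`, and the helper `varies_within` are all hypothetical names for illustration and are not taken from the posted CSV.

```r
# Toy long-format data: 3 choice situations x 2 alternatives.
# All names here (toy, chid, cost, income) are hypothetical.
toy <- data.frame(
  chid   = rep(1:3, each = 2),             # choice-situation id
  Mode   = rep(c("Personal", "Taxi"), 3),  # alternative
  cost   = c(10, 25, 12, 30, 8, 20),       # differs between alternatives
  income = c(50, 50, 70, 70, 40, 40)       # same for both alternatives of a person
)

# TRUE if x takes more than one value inside at least one choice situation.
varies_within <- function(x, id) {
  any(tapply(x, id, function(v) length(unique(v)) > 1))
}

vars <- c("cost", "income")
result <- sapply(toy[vars], varies_within, id = toy$chid)
print(result)
```

In this toy data `cost` varies within choice situations and `income` does not, so only `cost` would be a candidate for the first part of the formula; `income` would belong in the second part.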

library(mlogit)

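path <- ""

# Read the long-format CSV (one row per user and available mode)
a <- read.table(path, sep = ",", header = TRUE)

# Declare the alternative and choice columns for mlogit
Ta <- mlogit.data(a, shape = "long", alt.var = "Mode", choice = "Choice")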



f <- mFormula(Choice ~ CostPersonal+CostTrainBus+CostCarpool+CostTaxi+CostSharedTaxi+CostCS+CostSCS+DistSYR+DistNYC+DistALB+DistEWR+SafteyPersonal+SafteyTrainBus+SafteyCarpool+SafteyTaxi+SafteySharedTaxi+SafteyCS+SafteySCS+PrivacyPersonal+PrivacyTrainBus+PrivacyCarpool+PrivacyTaxi+PrivacySharedTaxi+PrivacyCS+PrivacySCS+ConviniencePersonal+ConvinienceTrainBus+ConvinienceCarpool+ConvinienceTaxi+ ConvinienceSharedTaxi+ConvinienceCS+ConvinienceSCS+payUticaSYR+payUticaALB+flightsYear+daysBreturn+incomeB+familySize+ageB|Airport+TimePersonal+TimeTrainBus+TimeCarpool+TimeTaxi+TimeSharedTaxi+TimeCS+TimeSCS+luggage+ownVehicle+shuttleMaxPersonsB)

f1<- mFormula(Choice ~ 1| +0+CostPersonal+CostTrainBus+CostCarpool+CostTaxi+CostSharedTaxi+CostCS+CostSCS+Airport+DistSYR+DistNYC+DistALB+SafteyPersonal+SafteyTrainBus+SafteyCarpool+SafteyTaxi+SafteySharedTaxi+PrivacyPersonal+PrivacyTrainBus+PrivacyCarpool+PrivacyTaxi+PrivacySharedTaxi+ConviniencePersonal+ConvinienceTrainBus+ConvinienceCarpool+ConvinienceTaxi+ConvinienceSharedTaxi+TimePersonal+TimeTrainBus+TimeCarpool+TimeTaxi+TimeSharedTaxi+flightsYear+daysBreturn+luggage+ownVehicle+incomeB+familySize+ageB+shuttleMaxPersonsB+payUticaSYR+payUticaALB)


ml.a <- mlogit(f, Ta)   # does not output coefficients
ml.a <- mlogit(f1, Ta)  # does output coefficients
summary(ml.a)

The error from the model with f is:

Error in solve.default(H, g[!fixed]) : 
  Lapack routine dgesv: system is exactly singular: U[3,3] = 0

The model with f1, by contrast, does output coefficients, with a

Likelihood: -2.2039e-07

Clearly I don't understand which variables should and should not be included, or which correlations between them are causing the issue.
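The "exactly singular" message usually points to predictors that are linearly dependent. One possible way to look for such dependence, sketched here in base R on made-up data, is to compare the rank of the design matrix with its number of columns. The variables `x1`, `x2`, `x3` and the data frame `toy` are hypothetical, and this checks an ordinary design matrix, not the one mlogit builds internally.

```r
set.seed(1)
n <- 20

# Made-up predictors; x3 is deliberately an exact linear combination.
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$x3 <- toy$x1 + toy$x2

# Build the design matrix and compute its rank via a pivoted QR decomposition.
X <- model.matrix(~ x1 + x2 + x3, data = toy)
qrX <- qr(X)
rank_X <- qrX$rank
n_col <- ncol(X)

# Columns that qr() pivots past the rank are the ones it treats as redundant.
redundant <- colnames(X)[qrX$pivot[seq_len(n_col) > rank_X]]

cat("columns:", n_col, " rank:", rank_X, "\n")
print(redundant)
```

If the rank is smaller than the number of columns, the listed columns can be dropped or recoded before refitting. Which column is flagged depends on column order, so the list identifies a dependence, not a unique culprit.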

  • Your strategy seems to be to throw all the predictors you can find at the response. That often produces problems if there are inbuilt relations between predictors in principle or in the data and/or if the number of parameters you are estimating is not much less than the number of data points. (I've left part of your code unedited because I was wary of introducing errors.) – Nick Cox Apr 07 '19 at 11:09
  • Further, I am not at all fluent in R but interactions between predictors need very careful handling, especially between predictors that aren't indicators. How many parameters are you asking for? – Nick Cox Apr 07 '19 at 11:11
  • Trivial, but I will mention for your later reporting the correct spellings _convenience_ and _safety_. – Nick Cox Apr 07 '19 at 11:13
  • @NickCox Thanks for replying. As far as the number of predictors is concerned, there are about 50 in the formulae above. However, trying it with just 4 doesn't work either. – Harsh Asnani Apr 07 '19 at 12:06
  • @NickCox The mFormula function is divided into three parts (alternative-specific variables with generic coefficients | individual-specific variables | alternative-specific variables with alternative-specific coefficients). – Harsh Asnani Apr 07 '19 at 12:09
  • OK, but 1. Unfortunately "doesn't work" is not a very clear problem report. 2. Unsurprisingly not all readers are fluent in R, least of all in the syntax of a particular package or function that even frequent R users may not know. 3. Your question seems poised uncomfortably between "What is wrong with my R code?" and "What is wrong with my statistical thinking?" At best it may require people to play around with your code and your data to get a proper answer. At worst it may be off-topic here as in essence a request to debug your code. I can't be clear which it is. – Nick Cox Apr 07 '19 at 12:49
  • @NickCox Thanks for the comment. I've edited the post. I hope that makes my problem clearer. My question still boils down to "What's wrong with my statistical thinking?" – Harsh Asnani Apr 07 '19 at 20:04
