I'm having trouble with the rpar argument of the mlogit function (package mlogit).

My dataset looks like this:

> head(scan.s)
   year id.scan day weather sealvl wave lact repro     nb.gr       HS nb.pv pup act
1 2011       1   4       2   0.30    3    1     0 0.6666667 7.600000     3   0   R
2 2011       1   4       2   0.30    3    1     0 0.6666667 7.600000     3   0   R
3 2011       1   4       2   0.30    3    1     0 0.6666667 7.600000     3   1   R
4 2011       2   4       2   0.35    3    1     0 0.6666667 8.100000     2   0   R
5 2011       2   4       2   0.35    3    1     0 0.6666667 8.100000     2   1   R
6 2011       3   4       2   0.40    3    1     0 0.6666667 8.633333     2   0   R

> str(scan.s)
'data.frame':   10140 obs. of  13 variables:
 $ year   : int  2011 2011 2011 2011 2011 2011 2011 2011 2011 2011 ...
 $ id.scan: Factor w/ 280 levels "1","2","3","4",..: 1 1 1 2 2 3 3 4 4 5 ...
 $ day    : int  4 4 4 4 4 4 4 4 4 4 ...
 $ weather: Factor w/ 3 levels "1","2","3": 2 2 2 2 2 2 2 2 2 2 ...
 $ sealvl : num  0.3 0.3 0.3 0.35 0.35 0.4 0.4 0.5 0.5 0.6 ...
 $ wave   : Factor w/ 4 levels "1","2","3","4": 3 3 3 3 3 3 3 3 3 3 ...
 $ lact   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ repro  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ nb.gr  : num  0.667 0.667 0.667 0.667 0.667 ...
 $ HS     : num  7.6 7.6 7.6 8.1 8.1 ...
 $ nb.pv  : int  3 3 3 2 2 2 2 2 2 2 ...
 $ pup    : int  0 0 1 0 1 0 1 0 1 0 ...
 $ act    : Factor w/ 5 levels "A","C","D","G",..: 5 5 5 5 5 5 5 5 5 5 ...

Then I used mlogit.data to transform my dataset into long shape:

> scan.l<- mlogit.data(scan, varying = NULL, choice = "act", shape = "wide")

There is no variable varying across choices.

    year id.scan day weather sealvl wave lact repro     nb.gr  HS nb.pv pup   act chid alt
1.A 2011       1   4       2    0.3    3    1     0 0.6666667 7.6     3   0 FALSE    1   A
1.C 2011       1   4       2    0.3    3    1     0 0.6666667 7.6     3   0 FALSE    1   C
1.D 2011       1   4       2    0.3    3    1     0 0.6666667 7.6     3   0 FALSE    1   D
1.G 2011       1   4       2    0.3    3    1     0 0.6666667 7.6     3   0 FALSE    1   G
1.R 2011       1   4       2    0.3    3    1     0 0.6666667 7.6     3   0  TRUE    1   R
2.A 2011       1   4       2    0.3    3    1     0 0.6666667 7.6     3   0 FALSE    2   A

> str(scan.l)
Classes ‘mlogit.data’ and 'data.frame': 50700 obs. of  15 variables:
 $ year   : int  2011 2011 2011 2011 2011 2011 2011 2011 2011 2011 ...
 $ id.scan: Factor w/ 280 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ day    : int  4 4 4 4 4 4 4 4 4 4 ...
 $ weather: Factor w/ 3 levels "1","2","3": 2 2 2 2 2 2 2 2 2 2 ...
 $ sealvl : num  0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 ...
 $ wave   : Factor w/ 4 levels "1","2","3","4": 3 3 3 3 3 3 3 3 3 3 ...
 $ lact   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ repro  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ nb.gr  : num  0.667 0.667 0.667 0.667 0.667 ...
 $ HS     : num  7.6 7.6 7.6 7.6 7.6 7.6 7.6 7.6 7.6 7.6 ...
 $ nb.pv  : int  3 3 3 3 3 3 3 3 3 3 ...
 $ pup    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ act    : logi  FALSE FALSE FALSE FALSE TRUE FALSE ...
 $ chid   : num  1 1 1 1 1 2 2 2 2 2 ...
 $ alt    : chr  "A" "C" "D" "G" ...
 - attr(*, "index")='data.frame': 50700 obs. of  2 variables:
  ..$ chid: Factor w/ 10140 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ...
  ..$ alt : Factor w/ 5 levels "A","C","D","G",..: 1 2 3 4 5 1 2 3 4 5 ...
 - attr(*, "choice")= chr "act"

Then I ran the model:

mod1 <- mlogit(act ~ 1| nb.gr+nb.pv+sealvl+lact+repro+HS+day+id.scan,data = na.omit(scan.l), rpar=id.scan, format="long", reflevel="R", R=100, halton=NA, print.level=0)

The random parameter here is a factor and I am supposed to specify a distribution for rpar but is it relevant for a factor? (I tried to provide a distribution without any change).

And then I get this:

Error in coef(eval(callst, parent.frame())) : 
  error in evaluating the argument 'object' in selecting a method for function'coef' : Error in solve.default(H, g[!fixed]) : Lapack routine dgesv: system is exactly singular

I could use "HS" and "day" as random parameters instead, both numeric. But then I get another error:

Error in names (sup.coef) <- names.sup.coef: Attribute 'names' [1] must be the same length as the vector [0] 

traceback() did not provide any insight about what happened.

I searched for explanations of those errors and found that the first one can arise from a problem between one outcome and the random effect, so I tried subsetting my dataset to every combination of 3 outcomes, with the same result. I found nothing relevant about the second error. Maybe it has something to do with the transformation by mlogit.data. I checked the datasets provided with the mlogit package and could not figure out what I did differently.

I would be grateful if someone could explain what is happening here.

kjetil b halvorsen
Lyly
3 Answers

Your error occurs because you are including the id.scan variable in the list of covariates. The error reported for coef() is a consequence of the model failing to fit; the underlying problem is the second error on that same line, which says the system is exactly singular. You are confusing rpar in this specification with a "grouping" variable in mixed modelling generally. rpar specifies which covariates get a random coefficient, not the grouping variable. You specify the grouping variable using the argument id.var = "id.scan" in the call to mlogit.data().
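One plausible reason the system becomes singular, judging from the head(scan.s) output in the question, is that covariates such as sealvl and HS appear to be constant within each id.scan, so a separate dummy for every scan already explains them completely. The base-R sketch below uses a small made-up data frame (toy, copied loosely from those first rows, not the real data) to show the resulting rank deficiency in the design matrix.

```r
# Toy data loosely copied from head(scan.s): sealvl and HS do not vary within id.scan.
toy <- data.frame(
  id.scan = factor(c(1, 1, 1, 2, 2, 3, 3)),
  sealvl  = c(0.30, 0.30, 0.30, 0.35, 0.35, 0.40, 0.40),
  HS      = c(7.6, 7.6, 7.6, 8.1, 8.1, 8.633333, 8.633333),
  pup     = c(0, 0, 1, 0, 1, 0, 1)
)

# Design matrix with the scan-level covariates AND one dummy per scan.
X <- model.matrix(~ sealvl + HS + pup + id.scan, data = toy)

n_cols <- ncol(X)      # number of coefficients the model would try to estimate
x_rank <- qr(X)$rank   # number of linearly independent columns

# A rank below the column count means some coefficients are not identified.
rank_deficient <- x_rank < n_cols
```

If rank_deficient is TRUE for the real design matrix as well, dropping id.scan from the covariates (or dropping the covariates it absorbs) should remove the singularity.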

You cannot fit a random parameter model using mlogit() for the formula you are specifying. The covariates mentioned in rpar must be alternative specific covariates -- you have to read the vignette very carefully to see that! The example provided by @John Jackson has only the alternative specific covariates (before the |) in the rpar statement. I bet if you try:

mod1 <- mlogit(act ~ 1 | nb.gr+nb.pv+sealvl+lact+repro+HS+day,data = na.omit(scan.l),  reflevel="R")

It will work fine.

atiretoo

You can also do it by specifying the rpar names exactly as they appear in the output when you run the regular mlogit() command. So for your example:

rpar = list("A:id.scan"="n", "C:id.scan"="n", ...)

where the names in the list must be quoted because of the ":" (I think; it may only be needed for "(", but it works with quotes either way). This is explained in footnote 20, p. 24 of Viton, PA. "Discrete-Choice Logit Models with R" (pdf):

The specification of random parameters for the alternative-specific constants has changed from mlogit version 0.1-8. The old version had rpar=c(altair='n',altbus='n',alttrain='n'). If you get an error here, try estimating model without random parameters (like model res4 above), and note how the mode-specific dummys are reported; then use that syntax in the rpar argument.

Also see the first paragraph of p. 26 of the documentation (pdf) about having to list the entire name of individual-specific coefficients. I had a similar problem when trying to use just id.scan, but it started working when I used "A:id.scan" = as the name instead. I believe you can do this with your data, as long as you set panel = FALSE; I think panel = TRUE is the only case in which id.var = "id.scan" is invoked, for multiple observations on the same person. If you do in fact have a panel, then don't use that variable; use the other variables you want to simulate as normal or with another distribution. I'd also recommend using the halton=NA argument if you have a large dataset, as it speeds up the simulation significantly.
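As a minimal sketch of that workflow, the snippet below builds a named rpar vector from coefficient names. The vector coef_names is hypothetical: it stands in for whatever names(coef(fit)) returns for a plain mlogit() fit, and the exact naming pattern depends on the installed mlogit version, so check the real output before copying names.

```r
# Hypothetical coefficient names, standing in for names(coef(fit)) of a
# plain (non-random) mlogit() fit. Real names depend on the mlogit version.
coef_names <- c("A:(intercept)", "C:(intercept)",
                "A:HS", "C:HS",
                "A:day", "C:day")

# Pick the coefficients that should be random, here everything ending in ":HS".
random_terms <- grep(":HS$", coef_names, value = TRUE)

# Named character vector: names are coefficient names, values are the
# distribution codes ("n" for normal), in the shape rpar expects.
rpar_spec <- setNames(rep("n", length(random_terms)), random_terms)
```

Passing rpar_spec as the rpar argument avoids retyping the quoted names by hand and keeps them in sync with what the plain model reports.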

* Philip A. Viton (2014) Discrete Choice Logit Models with R. Materials for Ohio State City and Regional Planning 5700.

gung - Reinstate Monica
EconGeo

I received the same error message when I tried to include a random coefficient on a person specific variable, in this case the intercept, with a wide dataset. The solution appears to be to:

  1. convert the data to the long form;
  2. create alternative specific intercepts by interacting a variable equal to one for all observations with each category in the alt variable;
  3. estimate a model with these new variables listed as alternative-specific variables (before the |), and with any other person-specific variables plus -1 after the |, so that the default intercepts are dropped;
  4. specify each of the alternative-specific constant terms in the rpar argument.

For the Fishing example, the long data are in fish and the mlogit command is:

out <- mlogit(mode ~ oneboat + onecharter + onepier | income - 1, fish,
              rpar = c(oneboat = "n", onecharter = "n", onepier = "n"), R = 100, halton = NA)
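Step 2 above can be sketched in base R as follows. The data frame here is a tiny made-up long-format example (two choice situations, four alternatives) with an alt column, not the real Fishing data; the column names oneboat, onecharter and onepier match those used in the mlogit call, and beach is assumed to be the reference alternative.

```r
# Made-up long-format data: one row per (choice situation, alternative).
fish <- data.frame(
  chid = rep(1:2, each = 4),
  alt  = rep(c("beach", "boat", "charter", "pier"), times = 2),
  mode = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE)
)

# A variable equal to one for every observation ...
one <- rep(1, nrow(fish))

# ... interacted with each non-reference category of alt gives the
# alternative-specific intercepts ("beach" is left out as the reference).
fish$oneboat    <- one * (fish$alt == "boat")
fish$onecharter <- one * (fish$alt == "charter")
fish$onepier    <- one * (fish$alt == "pier")
```

Because these dummies vary across alternatives within a choice situation, they can sit before the | in the formula and be named in rpar.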
gung - Reinstate Monica