
I am trying to correctly interpret and plot the effects of a multinomial model fit in R. The data and script are here for reproducing the problem.

The multinomial variable describes the possible outcomes of a videogame in which players have to hit a moving target: 3 is a miss, 2 is a non-critical hit and 1 is a critical hit. The predictors are traits of the target such as its speed, its body size (svl), its relative tail size, its behavior in the videogame (strategy) and the contrast of its tail color with the background (tailcontrst).

I strongly expect higher speed to make critical hits more difficult, so higher values of the outcome should become more likely as speed increases. However, the odds of higher values decrease with speed, as the table and the graph show.

polr(formula = result ~ 1 + Speed + svl + svl:Speed + tailcontrst:Reltailsize + 
strategy:svl, data = results)

Coefficients:
                            Value Std. Error t value
Speed                   -0.266959  0.0079756 -33.472
svl                      0.075696  0.0016456  45.999
Speed:svl               -0.001898  0.0001965  -9.658
tailcontrst:Reltailsize  0.000405  0.0001375   2.946
svl:strategy            -0.009896  0.0007036 -14.064


Agus Camacho
  • One thing that's wrong is that you have an interaction term without the main effects that make it up. In addition, interpreting the main effect of speed when you have an interaction involving speed is not straightforward. And, on your link, there are tons of files (including some you might not want to be in a public link). Which ones have the data you used? And, on the same link, the first plot shows extreme skew. – Peter Flom Feb 08 '17 at 14:17
  • Thanks, it was supposed to show only two files. I edited the link. The model was obtained through a permutative comparison of the fit of thousands of potential models; it was the one with the lowest AIC. I guess you are right about speed and its interaction. However, I still can't understand why the effect is inverted. I explained the problem better in the question. – Agus Camacho Feb 09 '17 at 02:26

1 Answer


First thing to do is to look at a simpler model and see what happens. The simplest that we should look at is just speed:

library(MASS)  # polr() is in the MASS package
polr(formula = as.factor(result) ~ 1 + Speed, data = data)

but this also shows a negative coefficient for speed, contrary to your expectations. So, let's look at a plot of result and speed:

boxplot(Speed~as.factor(result), data = data)

This is pretty unambiguous: speed is higher for result 1 than for result 2, and higher for result 2 than for result 3.
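To see that the negative coefficient and the boxplot tell the same story, here is a minimal sketch on simulated data (not your data; the variable names, cutpoints and effect size are all made up for illustration). polr() parameterises the model as logit P(Y <= k) = zeta_k - eta, so a negative coefficient means that larger values of the predictor push observations toward the lower-numbered categories.

```r
library(MASS)  # polr() is in the MASS package

set.seed(1)
n <- 2000
speed <- runif(n, 0, 10)               # hypothetical speeds
# Assumption for illustration: the latent score falls as speed rises,
# so fast targets end up in the low-numbered categories.
latent <- -0.5 * speed + rlogis(n)
result <- cut(latent, breaks = c(-Inf, -3.5, -1.5, Inf),
              labels = c("1", "2", "3"), ordered_result = TRUE)
sim <- data.frame(speed = speed, result = result)

fit <- polr(result ~ speed, data = sim)
coef(fit)                              # slope on speed (negative here)

# Numeric counterpart of the boxplot: mean speed per outcome level
tapply(sim$speed, sim$result, mean)
```

In this sketch the slope is negative and the mean speed drops from level 1 to level 3, which is the same pattern the boxplot shows for your data.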

So, there are at least two possibilities:

  1. You've miscoded the data. That's annoying but easy to fix.
  2. Your model is wrong. That's good news, in a weird way, because it means that you are surprised and that's an opportunity to learn something. (My favorite professor in grad school often said "if you're not surprised, you haven't learned anything").

However, my guess is that you have miscoded the data and that speed is not what you think it is. One common way to go wrong is to have data that record the time taken to do something rather than the speed of doing it (e.g. in car data, acceleration is often given as the number of seconds to reach a particular speed, so higher numbers mean lower acceleration).
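A rough sketch of how such a coding problem flips the sign, again on simulated data (the variable names, the screen width of 10 units and the effect size are assumptions, not taken from your script). Here the outcome really does get worse with speed, but the recorded variable is a time, so the fitted slope on the recorded variable comes out negative.

```r
library(MASS)  # polr() is in the MASS package

set.seed(2)
n <- 2000
time_to_cross <- runif(n, 1, 5)        # hypothetical: seconds to cross the screen
true_speed <- 10 / time_to_cross       # hypothetical screen width of 10 units
# Assumption for illustration: faster targets are missed (level 3) more often
latent <- 0.8 * true_speed + rlogis(n)
result <- cut(latent, breaks = c(-Inf, 3, 5, Inf),
              labels = c("1", "2", "3"), ordered_result = TRUE)
sim2 <- data.frame(time_to_cross, true_speed, result)

levels(sim2$result)                    # check the level order polr() will use

b_time  <- coef(polr(result ~ time_to_cross, data = sim2))
b_speed <- coef(polr(result ~ true_speed, data = sim2))
b_time                                 # slope when the "speed" column is really a time
b_speed                                # slope on the actual speed
```

So it is worth checking both what the Speed column actually measures and the order of levels(as.factor(result)), since reversing either one reverses the sign of the coefficient.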

Peter Flom