
I am totally confused by statistics and I would be glad if you could help me.

I have difficulty interpreting marginal effects in a logit model when my independent variable is log-transformed.

I will illustrate my question with an example from my data below. I ran a logistic regression in Stata.

My dependent variable is a dummy indicating whether a game is of genre X. My independent variable is continuous and log-transformed (log heterogeneity).

After I run a logit regression:

logit xGenre logheterogeneity + control variables

I get the following results:

The coefficient of my independent variable is .566

STATA:

     X Genre |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 log heterog |   .5655944   .1741354     3.25   0.001     .2242953    .9068935

To interpret the results more easily, I looked at the marginal effects, using the mfx command.

For my independent variable I get dy/dx = .056:

STATA:

    variable |      dy/dx   Std. Err.      z    P>|z|  [    95% C.I.   ]      X
-------------+------------------------------------------------------------------
  log heter. |   .0563382     .01688     3.34   0.001   .023259  .089417   3.51361

(*) dy/dx is for discrete change of dummy variable from 0 to 1

Now I am confused about how to interpret my results. Can I say that if my independent variable (log heterog.) increases by 10%, then the probability that my game is of genre X increases by 0.56%?

Sycorax
Alina Lobova

1 Answer


You know that in a logit:

$$Pr[y = 1 \vert x,z] = p = \frac{\exp (\alpha + \beta \cdot \ln x + \gamma z)}{1+\exp (\alpha + \beta \cdot \ln x + \gamma z )}. $$

After some tedious calculus and simplification, the partial of that with respect to $x$ becomes:

$$ \frac{\partial Pr[y=1 \vert x,z]}{\partial x} = \frac{\beta}{x} \cdot p \cdot (1-p). $$
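
The skipped calculus is a short application of the chain rule. As a sketch, write $\eta = \alpha + \beta \cdot \ln x + \gamma z$ for the linear index (a shorthand introduced here only for convenience), so that $p = \exp(\eta)/(1+\exp(\eta))$:

$$\frac{\partial p}{\partial \eta} = \frac{\exp(\eta)}{(1+\exp(\eta))^2} = p \cdot (1-p), \qquad \frac{\partial \eta}{\partial x} = \frac{\beta}{x},$$

$$\frac{\partial Pr[y=1 \vert x,z]}{\partial x} = \frac{\partial p}{\partial \eta} \cdot \frac{\partial \eta}{\partial x} = \frac{\beta}{x} \cdot p \cdot (1-p).$$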

For small changes in $x$, this derivative is approximately a ratio of finite differences:

$$\frac{\Delta p}{\Delta x}=\frac{\beta}{x} \cdot p \cdot (1-p),$$

which can be re-written as

$$\frac{\Delta p}{100 \cdot \frac{ \Delta x}{x}}= \frac{\beta \cdot p \cdot (1-p)}{100}.$$

This is the definition of a semi-elasticity, and can be interpreted as the change in probability (in absolute terms, on the [0,1] scale) for a 1% change in $x$.
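
Applied to the numbers in the question, as a rough sketch: mfx differentiates with respect to the regressor as it was entered, here log heterogeneity, so the reported dy/dx of 0.0563 already corresponds to $\beta \cdot p \cdot (1-p)$. Assuming mfx was run with its defaults, that value is evaluated at the sample means of the regressors rather than averaged over observations. A 1% and a 10% increase in heterogeneity then give approximately

$$\Delta p \approx \frac{0.0563}{100} \approx 0.00056, \qquad \Delta p \approx 0.0563 \cdot \ln(1.10) \approx 0.0563 \cdot 0.0953 \approx 0.0054.$$

So a 10% increase in heterogeneity raises the probability by about 0.005 on the [0,1] scale, roughly half a percentage point. The linear shortcut $0.0563 \cdot 0.10 \approx 0.0056$ is close but slightly overstates it, and both figures are only local approximations around the sample means.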

Here's an example in Stata.* Note that I am using margins instead of the out-of-date mfx to get the average marginal effect of $x$, $\frac{1}{N}\sum_{i=1}^N\frac{\beta \cdot p_i \cdot (1-p_i)}{100}$:

. sysuse auto, clear
(1978 Automobile Data)

. gen ln_price = ln(price)

. logit foreign ln_price mpg weight, nolog

Logistic regression                             Number of obs     =         74
                                                LR chi2(3)        =      57.69
                                                Prob > chi2       =     0.0000
Log likelihood = -16.185932                     Pseudo R2         =     0.6406

------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ln_price |   6.851215    2.11763     3.24   0.001     2.700737    11.00169
         mpg |  -.0880842   .1031317    -0.85   0.393    -.2902186    .1140503
      weight |  -.0062268   .0017269    -3.61   0.000    -.0096115   -.0028422
       _cons |  -41.32383   16.24003    -2.54   0.011    -73.15371   -9.493947
------------------------------------------------------------------------------

. margins, expression(_b[ln_price]*predict()*(1-predict())/100)

Predictive margins                              Number of obs     =         74
Model VCE    : OIM

Expression   : _b[ln_price]*predict()*(1-predict())/100

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0046371   .0007965     5.82   0.000      .003076    .0061982
------------------------------------------------------------------------------

This means that for a 1% increase in price, the probability that a car is foreign increases by about 0.005 on a [0,1] scale, and a 10% increase in price gives roughly a 0.05 increase. In this data, about 0.3 of the cars are foreign, so these effects are economically and statistically significant.
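
To make the scale of these numbers concrete, here is the arithmetic behind the rounding and the comparison with the baseline share. It is a rough sketch that treats the average semi-elasticity as constant over a 10% price change, which is only an approximation because the effect is local:

$$10 \times 0.0046371 \approx 0.046, \qquad \frac{0.046}{0.3} \approx 0.15.$$

So a 10% price increase moves the predicted probability by roughly 15% of the baseline share of foreign cars, or about 17% if the rounded figure of 0.05 is used.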


Edit:

A good way to do this in Stata 10 is to install the user-written command margeff:

. margeff, dydx(ln_price) replace

Average partial effects after margeff
      y  = Pr(foreign) 

------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ln_price |   .4637103   .0796514     5.82   0.000     .3075964    .6198241
         mpg |  -.0059616    .006781    -0.88   0.379    -.0192522     .007329
      weight |  -.0004214   .0000417   -10.11   0.000    -.0005031   -.0003398
------------------------------------------------------------------------------

. lincom _b[ln_price]/100

 ( 1)  .01*ln_price = 0

------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   .0046371   .0007965     5.82   0.000      .003076    .0061982
------------------------------------------------------------------------------

*This is actually not a great empirical example since the relationship in the data has an inverted-U shape.

dimitriy
  • Thank you very much for a quick answer and clear example! Very helpful. I will try the margins command instead of mfx. – Alina Lobova Apr 22 '15 at 08:32
  • Unfortunately I have Stata 10.1, and after some quick googling I found that I cannot use this command in Stata 10. Is there any way to do the same in Stata 10? – Alina Lobova Apr 22 '15 at 08:51
  • @AlinaLobova Luckily, it is pretty easy. See the edit above. It would be good practice to include such info in the question going forward. – dimitriy Apr 22 '15 at 15:48
  • The commands you provided for Stata 10 worked fine! I will definitely include the program version next time. I have another question, concerning economic significance. In your answer you wrote _In this data, about 0.3 of the cars are foreign, so these are **economically** and statistically significant_. I understand that economic significance is about whether my coefficient/result matters in practice. Now I am confused: if only **0.3** of the cars in the data are foreign, why is even a small increase such as 0.005 economically significant? – Alina Lobova Apr 22 '15 at 18:38
  • @AlinaLobova I would define ES as practical significance, as measured by the sign and magnitude of the AMEs, instead of p-values. 0.3 means that about a third of the cars are foreign. If a 10% price increase takes that to 0.35, that is a 17% increase in imports in the sample. – dimitriy Apr 22 '15 at 18:52
  • Sorry, how do you come up with overall 17% increase in imports? I just cannot calculate it myself. – Alina Lobova Apr 22 '15 at 19:14
  • From 100*(.35-.3)/.3. – dimitriy Apr 22 '15 at 19:22
  • Would the above code remain valid (also in terms of the standard errors produced) if my model is slightly more complicated? Suppose I have an interaction effect between X1 and X2, and my marginal effect (dp/dX1) gives me something like: margins, expression(_b[ln_price]*X2*predict()*(1-predict())/100) – night_owl89 Mar 23 '20 at 00:39
  • @night_owl89 This is hard to answer in the abstract. If those interactions don't involve ln_price, then yes. If they do, then the answer is no, and it is probably worth posting a new question with the details. – dimitriy Mar 23 '20 at 00:43
  • Indeed; in terms of your example I am thinking about something along the lines of: logit foreign ln_price mpg weight ln_price#weight, nolog ; analytically it is obvious, but I am wondering how that would translate into Stata code *using* the expression() option in particular. – night_owl89 Mar 23 '20 at 00:49