8

I don't understand how to interpret the coefficient from a Poisson regression relative to the coefficient from an OLS regression.

Suppose I have time series data, my left-hand side variable is the number of games won per year, and my main right-hand side variable is the NASDAQ value. If I prefer to interpret the model in percentage terms, I take the log transformation of games won. I can also take the log of the NASDAQ, so I can say how much a 1 percent increase in the NASDAQ would increase the percentage of games won. Now, I acknowledge that a Poisson model might make sense, because the data for games won are counts and not continuous. I run the regression with, say, many control variables.

Would I then not log-transform games won, and instead use the raw count? When I get the coefficients, do I do some sort of marginal effects calculation (as can be done for probit)?
How do I interpret these coefficients?
How do I compare the interpretation of the Poisson to OLS, either the OLS with the log-transformed outcome or the OLS without it?

I know this sort of question has been asked before, but I still don't quite get it.

user1690130
  • My answer here is relevant: http://stats.stackexchange.com/questions/142338/goodness-of-fit-and-which-model-to-choose-linear-regression-or-poisson/142353#142353 – kjetil b halvorsen Feb 29 '16 at 19:26

1 Answer

10

Not to be critical, but this is kind of a strange example. It's not clear that you're really doing time series analysis, nor is it clear what the NASDAQ would have to do with the number of games won by some team. If you're interested in saying something about the number of games a team won, I think it would be best to use binomial logistic regression, given that you presumably know how many games were played. Poisson regression is most appropriate for counts when the total possible is not well constrained, or at least not known.
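To make that concrete, here is a minimal sketch of the binomial setup I have in mind, using Python's statsmodels with made-up data and column names (wins out of a known number of games, one covariate). It's only an illustration under those assumptions, not the OP's actual analysis:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: wins out of a known number of games, plus one covariate.
rng = np.random.default_rng(0)
x = rng.normal(size=30)
games_played = np.full(30, 82)                      # known total per season
p = 1 / (1 + np.exp(-(0.1 + 0.3 * x)))              # assumed true win probability
wins = rng.binomial(games_played, p)

# Binomial GLM: the response is (successes, failures), so the known total is respected.
endog = np.column_stack([wins, games_played - wins])
fit = sm.GLM(endog, sm.add_constant(x), family=sm.families.Binomial()).fit()
print(fit.params)   # coefficients on the log-odds scale
```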

How you would interpret your betas depends, in part, on the link used. It is possible to use the identity link, even though the log link is more common (and typically more appropriate). If you are using the log link, you probably wouldn't take the log of your response variable; the link is, in essence, doing that for you. Let's take an abstract case: you have a Poisson model using the log link, as follows:
$$ \hat{y}=\exp(\hat{\beta}_0)\times\exp(\hat{\beta}_1)^x $$ or, alternatively, $$ \hat{y}=\exp(\hat{\beta}_0+\hat{\beta}_1 x) $$

(EDIT: I'm removing the "hats" from the betas in what follows, because they're ugly, but they should still be understood.)
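As an aside, here is a minimal sketch of what fitting such a model looks like in practice, using Python's statsmodels with simulated data purely for illustration. Note that the counts go in untransformed, because the log link handles the log for you:

```python
import numpy as np
import statsmodels.api as sm

# Simulated counts whose log-mean is linear in x.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = rng.poisson(np.exp(0.5 + 0.3 * x))              # raw counts, NOT log-transformed

# Poisson GLM with the (default) log link: the response stays on the count scale.
fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson()).fit()
print(fit.params)            # beta_0, beta_1 on the log scale
print(np.exp(fit.params))    # multiplicative effects on the conditional mean
```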

With normal OLS regression, you are predicting the mean of a Gaussian distribution of the response variable conditional on the values of the covariates. In this case, you are predicting the mean of a Poisson distribution of the response variable conditional on the values of the covariates. For OLS, if a given case were 1 unit higher on your covariate, you would expect, all things being equal, the mean of that conditional distribution to be $\beta_1$ units higher. Here, if a given case were 1 unit higher, ceteris paribus, you would expect the conditional mean to be $e^{\beta_1}$ times higher. For instance, say $\beta_1=2$; then in normal regression the mean is 2 units higher (i.e., +2), and here it is 7.4 times higher (i.e., ×7.4). In both cases, $\beta_0$ is your intercept: in our equation above, consider the situation when $x=0$; then $\exp(\beta_1)^x=1$, and the right-hand side reduces to $\exp(\beta_0)$, which gives you the mean of $y$ when all covariates equal 0.
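Here is a small numeric illustration of that additive-versus-multiplicative difference, using the same made-up $\beta$'s (pure arithmetic, no model fitting):

```python
import numpy as np

b0, b1 = 1.0, 2.0                                   # illustrative coefficients

# OLS / identity link: each extra unit of x ADDS b1 to the conditional mean.
ols_means = [b0 + b1 * x for x in range(4)]         # 1.0, 3.0, 5.0, 7.0

# Poisson / log link: each extra unit of x MULTIPLIES the conditional mean by exp(b1).
pois_means = [np.exp(b0) * np.exp(b1) ** x for x in range(4)]

print(np.exp(b1))            # ~7.39, the multiplicative factor per unit of x
print(ols_means)
print(pois_means)
```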

There are a couple of things that can be confusing about this. First, predicting the mean of a Poisson distribution isn't the same as predicting the mean of a Gaussian. With a normal distribution, the mean is the single most likely value, but with the Poisson, the mean is often an impossible value (e.g., if your predicted mean is 2.7, that's not a count that could exist). In addition, with a normal distribution the mean is unrelated to the level of dispersion (i.e., the SD), but with the Poisson distribution the variance necessarily equals the mean (although in practice it often doesn't, leading to additional complexities). Finally, those exponentiations make things more complicated: if, instead of a relative change, you wanted to know the exact value, you would have to start at 0 (i.e., $e^{\beta_0}$) and multiply your way up $x$ times. For predicting a specific value, it's easier to evaluate the expression inside the parentheses in the bottom equation and then exponentiate; this makes the meaning of the beta less clear, but it makes the math easier and reduces the possibility of error.
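For example, here are the two equivalent ways of getting a specific predicted value, again with made-up numbers:

```python
import numpy as np

b0, b1, x = 1.0, 2.0, 3                              # illustrative values

# Multiply your way up from x = 0 ...
pred_stepwise = np.exp(b0)
for _ in range(x):
    pred_stepwise *= np.exp(b1)

# ... or evaluate the linear predictor once and exponentiate at the end.
pred_direct = np.exp(b0 + b1 * x)

print(pred_stepwise, pred_direct)                    # identical up to rounding
```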

gung - Reinstate Monica
  • Thank you for your help! Yes, I agree the example is terrible. Thank you for the abstraction. I understand how to interpret OLS: a 1 unit increase in x leads to a beta_1 increase in y. If I do a log-transformation of y, then a 1 unit increase in x leads to a 100*beta_1% increase in y. I don't understand what to do with Poisson. If I know beta_1, a 1 unit increase in x leads to what increase in y? – user1690130 Mar 18 '12 at 04:52
  • It's in the answer, in the 3rd paragraph. A 1 unit increase in x leads to an exp($\beta_1$) *times* increase in y. Say your 'old' y was 10, and $\beta_1=2$, then exp($\beta_1$)=7.4, and y would be 10 *times* 7.4, ie 74. If there were another observation that was 1 unit higher still, that would be 74 * 7.4, etc. – gung - Reinstate Monica Mar 18 '12 at 05:15
  • I'm not understanding, because it seems to depend on the values of x and y? Is there a "marginal effect" that people tend to go by? For example, don't people use mfx in Stata to report probit estimates? – user1690130 Mar 18 '12 at 12:48
  • The relationship in Poisson reg is *multiplicative* instead of *additive* like in OLS reg. Given that, the role of x isn't that dissimilar. Eg, w/ OLS reg, the value of x tells you how many times you *add* $\beta_1$, w/ Poisson, the value of x tells you how many times you *multiply by* exp($\beta_1$) (both times when going up from 0). OTOH, Poisson certainly does depend on the value of y. B/c the relationship in Poisson is multiplicative, the size of the *jump* at a given point depends on the current value of y, but the size of the *multiplicative factor* remains constant @ exp($\beta_1$). – gung - Reinstate Monica Mar 18 '12 at 13:42
  • How do I compare OLS to Poisson? I want to say that the results are not driven by the distributional assumption. Is there a way to estimate the mean effects of the Poisson estimates, using marginal effects as for probit? – user1690130 Mar 18 '12 at 15:17
  • 1
    I don't follow that. You don't compare OLS to Poisson; they are different types of models for dif types of situations / phenomena. They are *not* 2 dif models of the same thing where 1 model might be a better account than the other. You wouldn't compare a kitten & a Christmas tree to see if 1 were better. I don't quite get how you're using the phrase "marginal effect", if you mean the effect of a predictor *ignoring the effects of all other variables* (like the marginal effect of a factor in ANOVA), then exp($\beta_1$) is the marginal multiplicative effect of $x_1$. – gung - Reinstate Monica Mar 18 '12 at 15:31
  • 1
    I, like @gung , am not sure what you are trying to do. But if you want to compare the *results* of the two models, you can plot the predicted values of each against each other in a scatterplot. Comparing the coefficients doesn't make sense. – Peter Flom Mar 18 '12 at 15:52
  • I have often seen probit coefficients compared to OLS. Why would Poisson vs. OLS be a different comparison than probit vs. OLS? – user1690130 Mar 18 '12 at 17:14
  • That's interesting, I've never seen probit actually used outside of a textbook--I've always seen / used logistic regression. At any rate, probit assumes a latent normal distribution which has been converted into a binary variable, so there is a natural connection between probit & OLS that doesn't exist b/t logistic & OLS, Poisson & OLS, etc. – gung - Reinstate Monica Mar 18 '12 at 17:23
  • If the dependent variable is binary, either a linear probability model or a probit can be used. Isn't the difference just the distributional assumptions made on the error? – user1690130 Mar 18 '12 at 22:15
  • I'm not sure what you mean. If the DV is binary, you could use logistic regression, or probit, or some other possibilities like complementary-log-log, but you couldn't use OLS regression. Probit models do have a natural connection to OLS regression, but they are not the same thing, and that connection doesn't exist for logit or Poisson, for example. – gung - Reinstate Monica Mar 19 '12 at 03:29