
So, I have a homework assignment in which I'm being asked to compare the fit of two similar models by comparing their $R^2$ and AIC. Both models were run in R, one using the lm command (for OLS) and the other the glm command; the former yielded an adjusted $R^2$ of 0.82, and the second model, with the same DV and covariates, produced an AIC of 365.96.

The only difference between the models is the "g" in the lm/glm command. The coefficients and standard errors they returned are virtually identical.

How does one compare $R^2$ and AIC? How can I tell which regression model fits the data better?

Gordon
  • So, if the coefficients and SEs are identical, what is the difference between the fits (i.e., the results, not the R commands)? – Roland Jan 23 '14 at 17:12
  • Hint: think about what it means to fit a model using the GLM procedure versus the lm (OLS) procedure. – Sycorax Jan 23 '14 at 17:16
  • Of course, $R^2$ is the proportion of the variance explained by the regression line; AIC, as I understand it, decreases as the log-likelihood increases. So we want a high $R^2$ and a low AIC. But $R^2$ is bounded between 0 and 1, whereas AIC is judged only by how low it can get; it has no upper and lower bounds that would make comparisons easy. Also, I know GLM is "generalized linear models" and LM is "linear model" in R, but I'm not sure how that relates to understanding AIC and $R^2$. – Gordon Jan 23 '14 at 17:29
  • Don't compare AIC and $R^2$; instead, compute AIC and $R^2$ for both models. – Ellis Valentiner Jan 23 '14 at 17:46

1 Answer


(I assume your homework has been turned in by now ;-). I'll answer this so that it doesn't stay officially unanswered.)

@user12202013 is right: you don't compare an AIC to an $R^2_{\rm adj}$. You can compare the AICs from two different models, and you can compare the $R^2_{\rm adj}$s from two different models, as ways to help you think about which model fits better. However, I don't think that was the point of the exercise. What you need to recognize is that linear regression (OLS) is a special case (i.e., a simplified version) of the generalized linear model. (The topic of GLiMs is a bit involved, but to learn more about it, it may help you to read my answer here: Difference between logit and probit models.) Moreover, the R function glm() fits a linear model assuming normally distributed errors by default. In other words, I think the assignment was to notice that you got the same model using two different function calls, as @Roland and @user777 hinted.
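
To make that concrete, here is a minimal sketch with simulated data (the variable names y, x1, x2 and the data frame dat are placeholders, not your actual dataset). It shows that lm() and glm() with the default gaussian family produce the same fit, and how to pull an AIC and an $R^2$ out of each:

    ## simulate a small dataset purely for illustration
    set.seed(1)
    dat   <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
    dat$y <- 1 + 2*dat$x1 - 0.5*dat$x2 + rnorm(100)

    fit.lm  <- lm(y ~ x1 + x2, data = dat)
    fit.glm <- glm(y ~ x1 + x2, family = gaussian, data = dat)  # gaussian is the default family

    cbind(coef(fit.lm), coef(fit.glm))            # identical coefficient estimates
    AIC(fit.lm); AIC(fit.glm)                     # identical AICs, because it's the same model
    summary(fit.lm)$adj.r.squared                 # adjusted R^2 from the lm fit
    1 - fit.glm$deviance/fit.glm$null.deviance    # (unadjusted) R^2 recovered from the glm fit

Because the two calls estimate the same model, the adjusted $R^2$ you got from lm() and the AIC you got from glm() both describe one and the same fit; neither number tells you that one "model" is better, since there is really only one model.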

gung - Reinstate Monica