How to calculate R-squared for a glm?

Question

I fitted the glm below, but I need an R-squared value to cite in a paper. Can anyone help me calculate one, please?

summary(lrfit)

Call:

glm(formula = cbind(CumNumberTakeOff, CumNumberNOTakeOff) ~ Sex + 
    PlantQuality + Minlog + Temperature + Temperaturetm + +Temperature:Sex + 
    Temperature:PlantQuality + Sex:PlantQuality + Minlog:PlantQuality, 
    family = binomial, data = expdataNo20)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.3724  -0.6914  -0.2577  -0.0168   3.1202

Coefficients:

                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -38.04288    2.22259 -17.116  < 2e-16 ***
SexMale                      10.20370    1.78445   5.718 1.08e-08 ***
PlantQualityF/W              19.99712    1.84748  10.824  < 2e-16 ***
Minlog                        1.01936    0.04873  20.918  < 2e-16 ***
Temperature                   1.21796    0.08583  14.191  < 2e-16 ***
Temperaturetm                -0.52639    0.02479 -21.235  < 2e-16 ***
SexMale:Temperature          -0.35374    0.06807  -5.197 2.03e-07 ***
PlantQualityF/W:Temperature  -0.68000    0.07118  -9.553  < 2e-16 ***
SexMale:PlantQualityF/W      -1.13717    0.17008  -6.686 2.29e-11 ***
PlantQualityF/W:Minlog       -0.38413    0.06478  -5.930 3.03e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

See this question on goodness of fit for logistic regression: http://stats.stackexchange.com/questions/169000/goodness-of-fit-test-in-logistic-regression-which-fit-do-we-want-to-test. This may also be of interest: http://stats.stackexchange.com/questions/164120/interesting-logistic-regression-idea-problem-data-not-currently-in-0-1-form/164127#164127 — , Aug 27 '15 at 08:03
See [Which pseudo-R2 measure is the one to report for logistic regression (Cox & Snell or Nagelkerke)?](http://stats.stackexchange.com/q/3559/17230). — Scortchi - Reinstate Monica, Aug 28 '15 at 10:11

score 1 · Answer 1 · answered Aug 27 '15 at 04:13

There is no R-squared for GLMs in the sense used for OLS regression.

The closest things are the so-called "pseudo-R-squared" statistics, which are derived from the deviance and/or the likelihood.

http://www.ats.ucla.edu/stat/mult_pkg/faq/general/Psuedo_RSquareds.htm
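As a minimal sketch of what such statistics look like in R, the block below computes two of them for a grouped binomial glm: the share of the null deviance explained, and McFadden's pseudo-R-squared from the log-likelihoods. The data are simulated and the names `dat`, `x`, `successes` and `failures` are hypothetical stand-ins, not the variables from the question.

```r
# Simulated grouped binomial data (hypothetical; stands in for the real data set)
set.seed(1)
n_trials  <- rep(20, 50)
x         <- runif(50, 15, 30)
p         <- plogis(-6 + 0.25 * x)
successes <- rbinom(50, size = n_trials, prob = p)
dat <- data.frame(x = x, successes = successes, failures = n_trials - successes)

# Binomial glm with a two-column (successes, failures) response
fit <- glm(cbind(successes, failures) ~ x, family = binomial, data = dat)

# Deviance-based pseudo-R-squared: share of the null deviance explained
r2_dev <- 1 - fit$deviance / fit$null.deviance

# McFadden's pseudo-R-squared: compare against the intercept-only model
fit0   <- update(fit, . ~ 1)
r2_mcf <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(fit0))

print(c(deviance_based = r2_dev, mcfadden = r2_mcf))
```

For grouped data like this the two values generally differ, because the saturated model's log-likelihood is not zero, so whichever one is reported should be named explicitly.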

Analyst

score 1 · Answer 2 · answered Aug 27 '15 at 13:10

As @Analyst noted, there is no R-squared for logistic regression. While several 'pseudo-R-squared' options are available, I would advise against using them: there are simply too many, and none of them properly gets at the issue you are trying to solve.

Remember that the purpose of logistic regression is different from that of OLS regression. In the latter, you minimize the squared error, and R^2 is conceptually straightforward: the proportion of variance accounted for by the model. In contrast, logistic regression seeks classification accuracy. This can be assessed numerically in several ways, using metrics such as AUC (area under the ROC curve), confusion matrices, positive predictive value (PPV), etc. For AUC/ROC, I suggest you look into the R package 'pROC'. For confusion matrices, PPV, etc., try the R package 'caret'.
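To make these metrics concrete, here is a rough base-R sketch that computes an AUC (via the rank formula, which is equivalent to the Mann-Whitney statistic) and a confusion matrix with PPV, sensitivity and specificity. It assumes per-observation 0/1 outcomes rather than grouped counts, uses simulated data, and picks a cut-off of 0.5 purely for illustration; packages such as 'pROC' and 'caret' offer packaged versions of the same quantities.

```r
# Simulated 0/1 outcomes (hypothetical data, for illustration only)
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.5 * x))

fit   <- glm(y ~ x, family = binomial)
p_hat <- fitted(fit)  # predicted probabilities of "success"

# AUC via the rank (Mann-Whitney) formula
n1  <- sum(y == 1)
n0  <- sum(y == 0)
auc <- (sum(rank(p_hat)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Confusion matrix at an assumed, arbitrary cut-off of 0.5
cutoff   <- 0.5
pred     <- factor(as.integer(p_hat >= cutoff), levels = c(0, 1))
observed <- factor(y, levels = c(0, 1))
conf_mat <- table(predicted = pred, observed = observed)

# Metrics derived from the confusion matrix
ppv         <- conf_mat["1", "1"] / sum(conf_mat["1", ])
sensitivity <- conf_mat["1", "1"] / sum(conf_mat[, "1"])
specificity <- conf_mat["0", "0"] / sum(conf_mat[, "0"])

print(conf_mat)
print(c(auc = auc, ppv = ppv, sensitivity = sensitivity, specificity = specificity))
```

The confusion-matrix metrics change with the cut-off, whereas the AUC does not depend on any single threshold.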

HEITZ

Despite their awkwardness, the pseudo-R^2s are at least likelihood-based [proper scoring rules](https://en.wikipedia.org/wiki/Scoring_rule#Proper_scoring_rules). When your logistic regression model is not being developed for a specific classification task, there's no sense in assessing its performance as a classifier using an arbitrary cut-off, thus ignoring that a predicted probability of "success" slightly below the cut-off is much less discrepant with an observed "success" than one far below. And AUC measures pure discrimination without any regard to calibration. – Scortchi - Reinstate Monica Aug 28 '15 at 10:54
See e.g. [Compare classifiers based on AUROC or accuracy?](http://stats.stackexchange.com/q/58756/17230) & [Measuring accuracy of a logistic regression-based model](http://stats.stackexchange.com/q/18178/17230). – Scortchi - Reinstate Monica Aug 28 '15 at 10:54

score -3 · Answer 3 · answered Aug 28 '15 at 05:01

You should look at the confusion metric and then calculate the specificity and sensitivity under different variable selections. This would help you calculate the accuracy of the model.

Manish Bhoge

Could you explain what you mean in a little more detail? (And presumably that should be "confusion matrix"). – Scortchi - Reinstate Monica Aug 28 '15 at 10:56
