0

I came up with the model below for my GLM analysis, but I need to calculate R-squared to cite in the paper. Can anyone help me with this, please?

summary(lrfit)

Call:

glm(formula = cbind(CumNumberTakeOff, CumNumberNOTakeOff) ~ Sex + 
    PlantQuality + Minlog + Temperature + Temperaturetm + +Temperature:Sex + 
    Temperature:PlantQuality + Sex:PlantQuality + Minlog:PlantQuality, 
    family = binomial, data = expdataNo20)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.3724  -0.6914  -0.2577  -0.0168   3.1202  

Coefficients:

                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -38.04288    2.22259 -17.116  < 2e-16 ***
SexMale                      10.20370    1.78445   5.718 1.08e-08 ***
PlantQualityF/W              19.99712    1.84748  10.824  < 2e-16 ***
Minlog                        1.01936    0.04873  20.918  < 2e-16 ***
Temperature                   1.21796    0.08583  14.191  < 2e-16 ***
Temperaturetm                -0.52639    0.02479 -21.235  < 2e-16 ***
SexMale:Temperature          -0.35374    0.06807  -5.197 2.03e-07 ***
PlantQualityF/W:Temperature  -0.68000    0.07118  -9.553  < 2e-16 ***
SexMale:PlantQualityF/W      -1.13717    0.17008  -6.686 2.29e-11 ***
PlantQualityF/W:Minlog       -0.38413    0.06478  -5.930 3.03e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Sycorax
moradi614
  • See this question on goodness-of-fit for logistic regression: http://stats.stackexchange.com/questions/169000/goodness-of-fit-test-in-logistic-regression-which-fit-do-we-want-to-test. This may also be interesting: http://stats.stackexchange.com/questions/164120/interesting-logistic-regression-idea-problem-data-not-currently-in-0-1-form/164127#164127 –  Aug 27 '15 at 08:03
  • See [Which pseudo-R2 measure is the one to report for logistic regression (Cox & Snell or Nagelkerke)?](http://stats.stackexchange.com/q/3559/17230). – Scortchi - Reinstate Monica Aug 28 '15 at 10:11

3 Answers

1

There is no R-squared for GLMs.

The closest things are so-called "pseudo-R-squared" statistics derived from the deviance and/or likelihood.

http://www.ats.ucla.edu/stat/mult_pkg/faq/general/Psuedo_RSquareds.htm
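As a rough illustration of a deviance-based pseudo-R-squared, the sketch below computes the proportion of the null deviance explained by a fitted binomial GLM, `1 - deviance / null.deviance`. The data are simulated and the names `d`, `temp`, `success`, `failure` and `fit` are hypothetical stand-ins, not the asker's `expdataNo20` or `lrfit`. For ungrouped 0/1 responses this quantity coincides with McFadden's pseudo-R-squared; for grouped `cbind(successes, failures)` data, as in the question, the two can differ, so it is safer to report it as "proportion of deviance explained".

```r
# Simulated grouped binomial data (hypothetical stand-in for expdataNo20)
set.seed(1)
n_groups <- 40
d <- data.frame(
  temp   = runif(n_groups, 15, 30),  # made-up covariate
  trials = 20                        # 20 trials per group
)
d$success <- rbinom(n_groups, size = d$trials,
                    prob = plogis(-6 + 0.25 * d$temp))
d$failure <- d$trials - d$success

# Same model form as in the question: cbind(successes, failures) ~ predictors
fit <- glm(cbind(success, failure) ~ temp, family = binomial, data = d)

# Proportion of the null deviance explained by the model
pseudo_r2 <- 1 - fit$deviance / fit$null.deviance
pseudo_r2
```

With the model from the question, the same expression would presumably be `1 - lrfit$deviance / lrfit$null.deviance`, since both components are stored on any `glm` object.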

Analyst
1

As @Analyst noted, there is no R-squared for logistic regression. While several 'pseudo-R-squared' options are available, I would advise against using them: there are simply too many, and none of them properly gets at the issue you are trying to solve. Remember that the purpose of logistic regression is different from that of OLS regression. In the latter, you minimize the squared error, and R^2 is conceptually straightforward: the percentage of total variance accounted for by the model. In contrast, logistic regression seeks classification accuracy. This can be assessed numerically in several ways, using metrics such as AUC (area under the ROC curve), confusion matrices, positive predictive value (PPV), etc. For AUC/ROC, I suggest you look into the R package 'pROC'. For confusion matrices, PPV, etc., try the R package 'caret'.
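To make the AUC idea concrete, here is a minimal base-R sketch that computes it from ranks (the Mann-Whitney form) rather than through 'pROC', so it runs without extra packages. The data are simulated, ungrouped 0/1 responses, and the names `x`, `y`, `fit` and `auc` are hypothetical; grouped counts like those in the question would first have to be expanded to one row per trial.

```r
# Simulated ungrouped binary data (hypothetical, not the asker's data)
set.seed(2)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))

fit <- glm(y ~ x, family = binomial)
p   <- fitted(fit)            # predicted probabilities of y == 1

# AUC = probability that a random "1" gets a higher predicted
# probability than a random "0" (ties count as one half)
n1  <- sum(y == 1)
n0  <- sum(y == 0)
auc <- (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
auc
```

A value of 0.5 corresponds to no discrimination and 1 to perfect separation of the two classes.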

HEITZ
  • Despite their awkwardness, the pseudo-R^2s are at least likelihood-based [proper scoring rules](https://en.wikipedia.org/wiki/Scoring_rule#Proper_scoring_rules). When your logistic regression model is not being developed for a specific classification task, there's no sense in assessing its performance as a classifier using an arbitrary cut-off - thus ignoring that a predicted probability of "success" slightly below the cut-off is much less discrepant with an observed "success" than one far below. And AUC measures pure discrimination without any regard to calibration. – Scortchi - Reinstate Monica Aug 28 '15 at 10:54
  • See e.g. [Compare classifiers based on AUROC or accuracy?](http://stats.stackexchange.com/q/58756/17230) & [Measuring accuracy of a logistic regression-based model](http://stats.stackexchange.com/q/18178/17230). – Scortchi - Reinstate Monica Aug 28 '15 at 10:54
-3

You should look at the confusion matrix and then calculate the specificity and sensitivity under different variable selections. This would help to assess the accuracy of the model.
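A minimal sketch of that calculation in base R, on simulated 0/1 data with hypothetical names (`x`, `y`, `fit`, `cm`). The 0.5 cut-off is an arbitrary assumption made only for illustration; the resulting numbers change with the cut-off chosen.

```r
# Simulated ungrouped binary data (hypothetical, not the asker's data)
set.seed(3)
n <- 400
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.3 + 1.5 * x))

fit <- glm(y ~ x, family = binomial)

cutoff <- 0.5                                  # arbitrary, for illustration only
pred   <- as.integer(fitted(fit) > cutoff)     # predicted class at this cut-off

# Confusion matrix: rows are observed classes, columns are predicted classes
cm <- table(observed  = factor(y,    levels = 0:1),
            predicted = factor(pred, levels = 0:1))

sensitivity <- cm["1", "1"] / sum(cm["1", ])   # true positive rate
specificity <- cm["0", "0"] / sum(cm["0", ])   # true negative rate
accuracy    <- sum(diag(cm)) / sum(cm)
cm
```

Refitting with a different set of predictors and recomputing `cm` gives the comparison across variable selections.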