
The summary of a glm() fit in R reports quantiles of the deviance residuals (e.g., see below). I know how to compute them (e.g., without using glm()), but I don't know how to use them. Because they are part of the standard output, I assume they provide some useful (if crude) diagnostic, but I don't know what it is. I am not asking how to use deviance residuals in general. I want to know how to use this quantile information.

Although the example below uses a Poisson model, I am interested in the general interpretation, not the interpretation of this specific example.

> summary(glm(rpois(100,1)~1,family=poisson))

Call:
glm(formula = rpois(100, 1) ~ 1, family = poisson)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-1.46969  -1.46969  -0.07796   0.79041   2.73583 
quibble
  • This is a default output template. It may *or may not* be useful, & useful *for what* could differ from one situation to another. It may help to read [Interpretation of plot (glm.model)](https://stats.stackexchange.com/a/139624/7290) (which is about plots of the residuals, but may still yield some insights). For the specific case of a Poisson regression (I recognize you want an answer that would cover any family, but I don't see a helpful answer that does), you can read about helpful model diagnostics here: [Diagnostic plots for count regression](https://stats.stackexchange.com/q/70558/). – gung - Reinstate Monica May 30 '18 at 16:49

2 Answers


Under certain conditions, the deviance residuals from a Poisson regression model are approximately normally distributed with mean 0 and variance 1. If those conditions are satisfied, the median of the deviance residuals should be close to 0, the first and third quartiles should be close to -0.67 and +0.67, and the minimum and maximum should lie roughly within -3 and +3 for moderate sample sizes.

The conditions are described in http://www.markirwin.net/stat149/Lecture/Lecture15.pdf (slide 18), for example.
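
As a rough illustration of this check, here is a minimal R sketch that compares the quartiles of the deviance residuals with standard normal quartiles for a small and a large Poisson mean. The helper name `dev_quartiles`, the seed, the sample size, and the two means (1 and 20) are illustrative assumptions, not anything taken from the slides.

```r
# Sketch: compare deviance-residual quartiles with standard normal quartiles.
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper: simulate Poisson counts with mean mu, fit an
# intercept-only Poisson GLM, and return the quartiles of its deviance residuals.
dev_quartiles <- function(mu, n = 1000) {
  y   <- rpois(n, mu)
  fit <- glm(y ~ 1, family = poisson)
  r   <- residuals(fit, type = "deviance")  # the residuals summary() summarises
  quantile(r, probs = c(0.25, 0.5, 0.75), names = FALSE)
}

q_small <- dev_quartiles(mu = 1)    # small mean: residuals are discrete and skewed
q_large <- dev_quartiles(mu = 20)   # large mean: roughly standard normal
q_norm  <- qnorm(c(0.25, 0.5, 0.75))

comparison <- rbind(small_mean = q_small, large_mean = q_large, std_normal = q_norm)
colnames(comparison) <- c("1Q", "Median", "3Q")
print(round(comparison, 3))
```

With the large mean, the quartiles should land near the normal values; with the small mean, the first quartile sits at the residual for a zero count, far from -0.67, which is the same pattern as in the output in the question.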

Isabella Ghement
  • It says if $\mu_i$ is large (e.g., >5) in a Poisson model, residuals are distributed as the standard normal. But why do we want to know if means are greater than 5? – quibble May 30 '18 at 02:49
  • I guess we can first check whether $\mu_i>5$. Given that the condition is satisfied, if the quantile differs from a standard normal, something is wrong with the model? The information is not useful if $\mu_i<5$. How about for non-Poisson distributions? – quibble May 30 '18 at 02:53
  • As the mean approaches zero, residuals are likely to be more skewed so a normal approximation can't be sustained. – Nick Cox May 30 '18 at 07:47

The minimum and maximum of the $x$-values are self-explanatory. The first quartile (1Q) is the $x$-value such that one quarter of the $x$-values are less than it and three quarters are greater. Similarly, the third quartile (3Q) is the $x$-value such that three quarters of the $x$-values are less than it and one quarter are greater. The median is the midpoint of the rank-sorted values, with half of the values below it and half above. If the number of values is odd, the median is the middle rank-sorted value; if it is even, the median is the average of the two middle values. Different algorithms interpolate quartile values differently. The interquartile range (IQR) is the difference between 3Q and 1Q.

A quantile is the generalization of these ideas. For example, quintiles divide the rank-sorted values into fifths, and other divisions are defined similarly; in general, the word quantile refers either to the cut point as a variable or to a specific value. For instance, one speaks of the 1% quantile, and of quantile-quantile (Q-Q) plotting, which plots the quantiles of a known distribution against those of a test distribution or a set of rank-sorted $x$-values.
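
To make these definitions concrete, here is a small R sketch using base functions. The data vector is made up for illustration, and the two `type` values are just one example of interpolation rules that disagree.

```r
# Made-up illustrative data (already sorted, n = 8, no ties)
x <- c(1, 3, 4, 6, 8, 9, 12, 15)

# Min, 1Q, median, 3Q, max, as in the glm() summary (R's default is type = 7)
five_num <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
print(five_num)

med <- median(x)  # even n: average of the two middle values, (6 + 8) / 2 = 7
iqr <- IQR(x)     # 3Q - 1Q under the default rule

# Different interpolation rules give different first quartiles
q7 <- quantile(x, 0.25, type = 7, names = FALSE)  # 3.75
q6 <- quantile(x, 0.25, type = 6, names = FALSE)  # 3.25

# Coordinates of a normal Q-Q plot: theoretical quantiles vs. sorted data.
# Calling qqnorm(x) without plot.it = FALSE would draw the plot instead.
qq <- qqnorm(x, plot.it = FALSE)
print(cbind(theoretical = qq$x, sample = qq$y))
```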

Carl