Questions tagged [deviance]

Deviance is a measure of distance between two probability distributions. In the case of GLMs, (total) deviance is twice the difference in log-likelihood between the full (saturated) model and the model under consideration.

Deviance is a measure of distance between two probability distributions $f_{\theta_1}$ and $f_{\theta_2}$, defined as:

$$D(\theta_1,\theta_2) = 2E_{\theta_1}\log\frac{f_{\theta_1}(Y)}{f_{\theta_2}(Y)} = 2 \int f_{\theta_1}(y)\log\frac{f_{\theta_1}(y)}{f_{\theta_2}(y)}dy$$

For members of an exponential family,

$$D(\theta_1,\theta_2) = 2[(\theta_1 - \theta_2)\mu_1 - (K(\theta_1) - K(\theta_2))]$$

where $\theta$ is the natural (canonical) parameter, $\mu_1 = E_{\theta_1}(Y)$ is the mean response under $f_{\theta_1}$, and $K(\cdot)$ is the cumulant function of the family.
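As a quick numerical check of the exponential-family formula, here is a minimal sketch for the Poisson family, where $\theta = \log\lambda$, $K(\theta) = e^\theta$ and $\mu = \lambda$. The function names and the truncation point `y_max` are illustrative choices, not part of the definition above.

```python
import math


def poisson_logpmf(y, lam):
    """Log of the Poisson probability mass function at y with mean lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def deviance_by_definition(lam1, lam2, y_max=200):
    """2 * E_{theta1}[log f1(Y) - log f2(Y)], using a truncated sum over y."""
    total = 0.0
    for y in range(y_max + 1):
        lp1 = poisson_logpmf(y, lam1)
        lp2 = poisson_logpmf(y, lam2)
        total += math.exp(lp1) * (lp1 - lp2)
    return 2.0 * total


def deviance_closed_form(lam1, lam2):
    """2 * [(theta1 - theta2) * mu1 - (K(theta1) - K(theta2))] for Poisson."""
    theta1, theta2 = math.log(lam1), math.log(lam2)
    mu1 = lam1            # mean response under f_{theta1}
    K = math.exp          # cumulant function of the Poisson family
    return 2.0 * ((theta1 - theta2) * mu1 - (K(theta1) - K(theta2)))


# The two computations should agree up to truncation and rounding error.
print(deviance_by_definition(3.0, 5.0), deviance_closed_form(3.0, 5.0))
```

The truncated sum is only a sanity check; for small means the Poisson tail beyond `y_max` is negligible.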

Strictly speaking, deviance is not a proper distance metric because it is not symmetric: in general, $D(\theta_1,\theta_2) \ne D(\theta_2,\theta_1)$. Nevertheless, it measures how close two distributions are.
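The lack of symmetry is easy to see numerically. The sketch below uses two Bernoulli distributions; the helper name `bernoulli_deviance` and the probabilities 0.5 and 0.1 are arbitrary illustrative choices.

```python
import math


def bernoulli_deviance(p, q):
    """D(p, q) = 2 * E_p[log(f_p(Y) / f_q(Y))] for Bernoulli, with 0 < p, q < 1."""
    return 2.0 * (
        p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
    )


d_forward = bernoulli_deviance(0.5, 0.1)   # expectation taken under p = 0.5
d_reverse = bernoulli_deviance(0.1, 0.5)   # expectation taken under p = 0.1

# Swapping the arguments changes the value, so D is not symmetric.
print(d_forward, d_reverse)
```

Swapping the arguments changes which distribution the expectation is taken under, which is why the two values differ.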

Note that $\frac{D(\theta_1,\theta_2)}{2}$ is the Kullback-Leibler divergence, also called relative entropy.

In the case of GLMs, (total) deviance is twice the difference in the log-likelihood between the full model and the model under consideration.

$$D(y,\mu) = 2[\log(f_y(y)) - \log(f_\mu(y))]$$

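where $f_y(y)$ is the density under the full (or saturated) model, in which the fitted mean equals the observed response, and $f_\mu(y)$ is the density under the model with mean $\mu$.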
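For a concrete case, the following sketch computes the total deviance of a Poisson model by summing the per-observation contributions, once from the log-likelihood difference and once from the familiar closed form $2\sum_i [y_i\log(y_i/\mu_i) - (y_i - \mu_i)]$. The data `y` and fitted means `mu` are made-up numbers for illustration, and the function names are hypothetical.

```python
import math


def poisson_loglik(y, mu):
    """Poisson log-likelihood of counts y given means mu (one mean per count)."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # Convention: y * log(mu) is taken as 0 when y == 0,
        # which also covers mu == 0 in the saturated model.
        term = yi * math.log(mi) if yi > 0 else 0.0
        total += term - mi - math.lgamma(yi + 1)
    return total


def poisson_deviance(y, mu):
    """Twice the log-likelihood gap between the saturated and fitted model."""
    saturated = poisson_loglik(y, y)   # saturated model: mu_i = y_i
    fitted = poisson_loglik(y, mu)     # model under consideration
    return 2.0 * (saturated - fitted)


def poisson_deviance_closed_form(y, mu):
    """2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) treated as 0."""
    return 2.0 * sum(
        (yi * math.log(yi / mi) if yi > 0 else 0.0) - (yi - mi)
        for yi, mi in zip(y, mu)
    )


y = [0, 1, 3, 5]              # illustrative counts
mu = [0.5, 1.5, 2.5, 4.0]     # illustrative fitted means, all > 0
print(poisson_deviance(y, mu), poisson_deviance_closed_form(y, mu))
```

The fitted means must be strictly positive; only the saturated model is allowed a mean of zero, and only where the observed count is zero.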

200 questions
63
votes
3 answers

Interpreting Residual and Null Deviance in GLM R

How to interpret the Null and Residual Deviance in GLM in R? Like, we say that smaller AIC is better. Is there any similar and quick interpretation for the deviances also? Null deviance: 1146.1 on 1077 degrees of freedom Residual deviance: 4589.4…
Anjali
  • 891
  • 3
  • 10
  • 10
52
votes
3 answers

What is Deviance? (specifically in CART/rpart)

What is "Deviance," how is it calculated, and what are its uses in different fields in statistics? In particular, I'm personally interested in its uses in CART (and its implementation in rpart in R). I'm asking this since the wiki-article seems…
Tal Galili
  • 19,935
  • 32
  • 133
  • 195
39
votes
3 answers

Logistic Regression: Bernoulli vs. Binomial Response Variables

I want to perform logistic regression with the following binomial response and with $X_1$ and $X_2$ as my predictors. I can present the same data as Bernoulli responses in the following format. The logistic regression outputs for these 2 data sets…
36
votes
1 answer

Error metrics for cross-validating Poisson models

I'm cross validating a model that's trying to predict a count. If this was a binary classification problem, I'd calculate out-of-fold AUC, and if this was a regression problem I'd calculate out-of-fold RMSE or MAE. For a Poisson model, what error…
Zach
  • 22,308
  • 18
  • 114
  • 158
20
votes
1 answer

Logistic Regression: How to obtain a saturated model

I just read about the deviance measure for the logistic regression. However, the part that is called saturated model is not clear to me. I did an extensive Google search but none of the results answered my question. So far I found out that a…
toom
  • 303
  • 1
  • 2
  • 7
19
votes
2 answers

Pearson VS Deviance Residuals in logistic regression

I know that standardized Pearson Residuals are obtained in a traditional probabilistic way: $$ r_i = \frac{y_i-\hat{\pi}_i}{\sqrt{\hat{\pi}_i(1-\hat{\pi}_i)}}$$ and Deviance Residuals are obtained through a more statistical way (the contribution of…
18
votes
3 answers

In a GLM, is the log likelihood of the saturated model always zero?

As part of the output of a generalised linear model, the null and residual deviance are used to evaluate the model. I often see the formulas for these quantities expressed in terms of the log likelihood of the saturated model, for example:…
Alex
  • 3,728
  • 3
  • 25
  • 46
14
votes
1 answer

R-squared in linear model versus deviance in generalized linear model?

Here's my context for this question: From what I can tell, we cannot run an ordinary least squares regression in R when using weighted data and the survey package. Here, we have to use svyglm(), which instead runs a generalized linear model (which…
RickyB
  • 951
  • 1
  • 10
  • 21
14
votes
1 answer

Why does adding a lag effect increase mean deviance in a Bayesian hierarchical model?

Background: I'm currently doing some work comparing various Bayesian hierarchical models. The data $y_{ij}$ are numeric measures of well-being for participant $i$ and time $j$. I have around 1000 participants and 5 to 10 observations per…
Jeromy Anglim
  • 42,044
  • 23
  • 146
  • 250
13
votes
2 answers

Exact definition of Deviance measure in glmnet package, with crossvalidation?

For my current research I'm using the Lasso method via the glmnet package in R on a binomial dependent variable. In glmnet the optimal lambda is found via cross-validation and the resulting models can be compared with various measures, e.g.…
Jo Wmann
  • 153
  • 1
  • 6
12
votes
0 answers

Is the percent of total deviance explained a useful model summary?

My question is regarding the interpretation of the percent of deviance explained (and other $R^2$ analogs or pseudo $R^2$ values for GLMs). Is this a meaningful summary statistic for models other than Gaussian? That is, is it at least as…
Brett
  • 5,708
  • 3
  • 29
  • 41
11
votes
1 answer

Measure of "deviance" for zero-inflated Poisson or zero-inflated negative binomial?

Scaled deviance, defined as D = 2 * (log-likelihood of saturated model minus log-likelihood of fitted model), is often used as a measure of goodness-of-fit in GLM models. Percent deviance explained, defined as [D(null model) - D(fitted model)] /…
aleanjeo
  • 111
  • 4
10
votes
1 answer

Deviance vs Pearson goodness-of-fit

I am trying to come up with a model by using negative binomial regression (negative binomial GLM). I have a relatively small sample size (greater than 300), and the data are not scaled. I noticed that there are two ways to measure goodness of fit -…
10
votes
3 answers

How to assess goodness of fit of a particular nonlinear model?

I have a nonlinear model $y=\Phi(f(x,a)) + \varepsilon$, where $\Phi$ is the cdf of the standard normal distribution and f is nonlinear (see below). I want to test the goodness of fit of this model with parameter $a$ to my data…
spadequack
  • 209
  • 2
  • 5
9
votes
1 answer

How to calculate the hat matrix for logistic regression in R?

I want to calculate the hat matrix directly in R for a logit model. According to Long (1997) the hat matrix for logit models is defined as: $$H = VX(X'VX)^{-1} X'V$$ X is the vector of independent variables, and V is a diagonal matrix with…
Thomas Jensen
  • 1,033
  • 1
  • 12
  • 22