
I was reading a post on this website and it said:

For linear models the deviance equals the RSS (SSE)

$D_i = -2 \sum_k n_{ik} \log(p_{ik})$
so,
$SSE = D_i = \sum_j (y_j - \mu_i)^2$

I cannot see why this is true. Also, is there a good source for this kind of material? Thanks

What is Deviance? (specifically in CART/rpart)

Ferdi
  • You have misread the link. It does not say that the definition of deviance you have written in your first equation is for linear models; it says that's the definition of deviance for classification problems. With respect to your second expression, the link says that is for regression trees and mentions that the probability model within each leaf is Gaussian. – jbowman Nov 26 '17 at 22:42

1 Answer


As @jbowman explains in the comment above, the specific form of deviance $D_i= -2 \sum_k n_{ik} \log(p_{ik})$ is only true for the classification problems discussed in the referenced post.
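To make the classification formula concrete, here is a minimal sketch that evaluates $D_i$ for a single node. It assumes $p_{ik}$ is estimated by the observed class proportion $n_{ik}/n_i$; the function name `node_deviance` is hypothetical and not taken from rpart.

```python
import math

def node_deviance(counts):
    """Deviance of one node: -2 * sum_k n_k * log(p_k).

    Assumption: p_k is estimated by the observed proportion n_k / n.
    """
    n = sum(counts)
    dev = 0.0
    for n_k in counts:
        if n_k > 0:  # an empty class contributes nothing to the sum
            dev -= 2.0 * n_k * math.log(n_k / n)
    return dev

pure = node_deviance([10, 0])   # every case in one class
mixed = node_deviance([5, 5])   # evenly split node
print(pure, mixed)
```

A pure node has zero deviance, and the deviance grows as the class mix within the node becomes more even.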

For linear models with Normal errors, the result $SSE = D = \sum_i (y_i - \mu_i)^2$, where $\mu_i$ is the fitted mean for observation $i$, can be derived from the general definition of deviance ($D$), scaled deviance ($D^*$) and the normal pdf as follows:

First remember that

$$ f(y; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right) $$

then for a single observation we have

$$ \begin{align} D^{*}(y, \mu) &= 2l(y;y) - 2l(\mu;y) \\ &= 2\log(f(y;y)) - 2\log(f(y;\mu))\\ &= 2 \bigg[-\frac{1}{2}\log(2 \pi \sigma^2) - \frac{(y - y)^2}{2\sigma^2}\bigg] - 2 \bigg[-\frac{1}{2}\log(2 \pi \sigma^2) - \frac{(y - \mu)^2}{2\sigma^2}\bigg]\\ &= \frac{(y - \mu)^2}{\sigma^2} \end{align} $$

and $D^{*}(y, \mu) = \frac{D(y, \mu)}{\phi} = \frac{D(y, \mu)}{\sigma^2}$, where the last equality holds since for a Normal distribution the dispersion parameter equals the variance, i.e. $\phi = \sigma^2$. Therefore

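$$ D(y, \mu) = (y - \mu)^2 $$

Since the observations are independent, the log-likelihoods add, so summing over all $n$ observations gives $D = \sum_i (y_i - \mu_i)^2 = SSE$, where $\mu_i$ is the fitted mean for observation $i$.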
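The derivation can also be checked numerically. The sketch below uses only the Python standard library; the data, the fitted means and the value of $\sigma^2$ are made up for illustration, and the helper names are hypothetical.

```python
import math

def normal_loglik(y, mu, sigma2):
    """Log of the Normal density f(y; mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma2) - (y - mu) ** 2 / (2 * sigma2)

def deviance(ys, mus, sigma2):
    """Unscaled deviance: sigma^2 times the summed scaled deviance."""
    scaled = sum(
        2 * normal_loglik(y, y, sigma2)     # saturated model: mean equals y
        - 2 * normal_loglik(y, mu, sigma2)  # fitted model: mean equals mu
        for y, mu in zip(ys, mus)
    )
    return sigma2 * scaled

# Made-up observations and fitted means, for illustration only.
ys = [1.0, 2.5, 4.0, 3.5]
mus = [1.5, 2.0, 3.0, 4.0]

sse = sum((y - mu) ** 2 for y, mu in zip(ys, mus))
d = deviance(ys, mus, sigma2=2.0)
print(sse, d)
```

The two printed values agree up to floating-point error, and the agreement does not depend on the value chosen for `sigma2`, because the $\log(2\pi\sigma^2)$ terms cancel and the remaining $1/\sigma^2$ factor is removed by the scaling.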

For a good introduction to deviance, see McCullagh and Nelder (1989), Generalized Linear Models (2nd ed.), Chapman and Hall, pp. 23-25, 33-36.

Ben Bolker
mgilbert