Deviance is a measure of distance between two probability distributions. In the case of GLMs, (total) deviance is twice the difference in log-likelihood between the full model and the restricted model.
Deviance is a measure of distance between two probability distributions, $f_{\theta_1}$ and $f_{\theta_2}$, defined as:
$$D(\theta_1,\theta_2) = 2E_{\theta_1}\log\frac{f_{\theta_1}(Y)}{f_{\theta_2}(Y)} = 2 \int f_{\theta_1}(y)\log\frac{f_{\theta_1}(y)}{f_{\theta_2}(y)}dy$$
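For example, taking $f_{\theta_1}$ and $f_{\theta_2}$ to be $N(\theta_1,1)$ and $N(\theta_2,1)$ densities, the log-ratio is $(\theta_1-\theta_2)y - \tfrac{1}{2}(\theta_1^2-\theta_2^2)$, and taking the expectation under $\theta_1$ (so that $E_{\theta_1}Y=\theta_1$) gives
$$D(\theta_1,\theta_2) = 2\left[(\theta_1-\theta_2)\theta_1 - \tfrac{1}{2}(\theta_1^2-\theta_2^2)\right] = (\theta_1-\theta_2)^2,$$
so in the unit-variance normal case deviance reduces to squared-error distance.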
For members of an exponential family,
$$D(\theta_1,\theta_2) = 2[(\theta_1 - \theta_2)\mu_1 - (K(\theta_1) - K(\theta_2))]$$
where $\mu_1$ is the mean response under $f_{\theta_1}$ and $K(\cdot)$ is the cumulant generating function of the family, i.e. $\log f_\theta(y) = \theta y - K(\theta) + \log h(y)$, so that $\mu = K'(\theta)$.
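To make this concrete, consider the Poisson family: the natural parameter is $\theta = \log\lambda$ with $K(\theta) = e^\theta = \lambda$, so $\mu_1 = \lambda_1$. Substituting into the expression above,
$$D(\theta_1,\theta_2) = 2\left[\lambda_1\log\frac{\lambda_1}{\lambda_2} - (\lambda_1 - \lambda_2)\right],$$
which is the familiar unit Poisson deviance.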
Strictly speaking, deviance is not a proper distance metric because it is not symmetric: in general $D(\theta_1,\theta_2) \ne D(\theta_2,\theta_1)$. Nevertheless, it measures how close two distributions are.
Note that $\frac{D(\theta_1,\theta_2)}{2}$ is the Kullback-Leibler divergence (also called the relative entropy) of $f_{\theta_2}$ from $f_{\theta_1}$.
In the case of GLMs, (total) deviance is twice the difference in the log-likelihood between the full model and the model under consideration.
$$D(y,\mu) = 2[\log(f_y(y)) - \log(f_\mu(y))]$$
where $f_y(y)$ is the likelihood under the full (or saturated) model, which sets the fitted mean of each observation equal to its observed value $y$, and $f_\mu(y)$ is the likelihood under the fitted model with mean $\mu$.
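For instance, in a Poisson GLM with fitted means $\hat\mu_i$, the saturated model sets each mean equal to $y_i$, and summing the unit deviances derived above over the observations gives the total deviance
$$D(y,\hat\mu) = 2\sum_{i=1}^{n}\left[y_i\log\frac{y_i}{\hat\mu_i} - (y_i - \hat\mu_i)\right],$$
with the convention that $y_i\log y_i = 0$ when $y_i = 0$.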