
Show that each iteration of the Fisher scoring algorithm (also known as Iteratively Reweighted Least Squares, IRLS or IWLS) is the same as doing weighted least squares on the working responses, which are defined as:

$$z_i^{(t)} = \eta_i^{(t)} + (y_i-\mu_i^{(t)})\left.\frac{\partial\eta_i}{\partial\mu_i}\right|_{\mu_i=\mu_i^{(t)}}$$

($\eta$ is the linear predictor.)


1 Answer


The score, i.e. the gradient of the GLM log-likelihood, is $S(\beta) = X^TDV^{-1}(y-\mu)$, where $D$ is a diagonal matrix with $\frac{\partial \mu_i}{\partial \eta_i}$ on its diagonal, and $V$ is a diagonal matrix with $\text{Var}(y_i)$ on its diagonal.

The Fisher information matrix of $\beta$ is $I(\beta) = X^TWX$, where $W$ is a diagonal matrix with $(\frac{\partial\mu_i}{\partial \eta_i})^2/\text{Var}(y_i)$ on its diagonal.

Note that $DV^{-1} = WD^{-1}$, since $W = D^2V^{-1}$ and diagonal matrices commute.
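
To make these matrices concrete, here is a minimal numerical sketch; the probit model, the simulated data, and all variable names are illustrative assumptions, not from the question. A probit link is non-canonical, so $D \neq V$ and the identity $DV^{-1} = WD^{-1}$ is not trivially satisfied:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
beta = np.array([0.5, -1.0, 0.25])

# Probit GLM with Bernoulli responses (non-canonical link, so D != V)
eta = X @ beta                # linear predictor
mu = norm.cdf(eta)            # mean response: mu_i = Phi(eta_i)
D = np.diag(norm.pdf(eta))    # d mu_i / d eta_i = phi(eta_i)
V = np.diag(mu * (1 - mu))    # Var(y_i) for Bernoulli responses
W = D @ D @ np.linalg.inv(V)  # W = D^2 V^{-1}

# D V^{-1} = W D^{-1}, because diagonal matrices commute
assert np.allclose(D @ np.linalg.inv(V), W @ np.linalg.inv(D))
```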

Each iteration of Fisher scoring is:

$$\begin{aligned}
\beta^{(t+1)} &= \beta^{(t)} + I(\beta^{(t)})^{-1}S(\beta^{(t)}) \\
&= I(\beta^{(t)})^{-1}\left(I(\beta^{(t)})\beta^{(t)} + S(\beta^{(t)})\right) \\
&= (X^TW^{(t)}X)^{-1}\left(I(\beta^{(t)})\beta^{(t)} + S(\beta^{(t)})\right) \\
&= (X^TW^{(t)}X)^{-1}X^TW^{(t)}Z^{(t)}
\end{aligned}$$

For the last equality we need to use the following:

$S(\beta) = X^TDV^{-1}(y-\mu) = X^TWD^{-1}(y-\mu) = X^TW(Z-\eta) = X^TWZ - X^TW(X\beta) = X^TWZ - I(\beta)\beta$,

where the third equality uses the definition of the working responses, $Z = \eta + D^{-1}(y-\mu)$, and the fourth uses $\eta = X\beta$.

Hence $I(\beta)\beta + S(\beta) = X^TWZ$. The update $\beta^{(t+1)} = (X^TW^{(t)}X)^{-1}X^TW^{(t)}Z^{(t)}$ is therefore exactly the weighted least squares estimate obtained by regressing the working responses $Z^{(t)}$ on $X$ with weights $W^{(t)}$.
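
As a sanity check that the algebra translates into the claimed equivalence, here is a minimal sketch (again assuming a probit GLM with Bernoulli responses; the data and names are illustrative) comparing one Fisher scoring step with one weighted least squares fit on the working responses:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, p = 200, 3
X = rng.normal(size=(n, p))
y = rng.binomial(1, norm.cdf(X @ np.array([0.5, -1.0, 0.25])))

beta = np.zeros(p)             # current iterate beta^(t)
eta = X @ beta
mu = norm.cdf(eta)
d = norm.pdf(eta)              # diagonal of D: d mu_i / d eta_i
v = mu * (1 - mu)              # diagonal of V: Var(y_i)
w = d**2 / v                   # diagonal of W

# One Fisher scoring step: beta^(t) + I(beta^(t))^{-1} S(beta^(t))
S = X.T @ (d / v * (y - mu))   # score S(beta) = X^T D V^{-1} (y - mu)
I = X.T @ (w[:, None] * X)     # information I(beta) = X^T W X
beta_fisher = beta + np.linalg.solve(I, S)

# Weighted least squares on the working responses z = eta + D^{-1}(y - mu)
z = eta + (y - mu) / d
beta_wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))

assert np.allclose(beta_fisher, beta_wls)  # the two updates coincide
```

Iterating either form to convergence yields the same maximum likelihood estimate, which is why IRLS is typically implemented as a sequence of weighted least squares fits.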
