
This page describes the following iteratively reweighted least-squares (IRLS) method for solving a generalized linear model (GLM):

  1. let $x_1=0$
  2. for $j=1,2,...$ do
    2.1. linear predictor: let $\eta=Ax_j$
    2.2. Newton-method correction: let $z=\eta+\frac{b-g(\eta)}{g'(\eta)}$
    2.3. reweighting: let $W = \text{diag}(g'(\eta)^2/\text{var}(g(\eta)))$
    2.4. solve linear least-squares: let $x_{j+1}=(A^TWA)^{-1}A^TWz$
    2.5. stop if $\|x_{j+1}-x_j\|$ is sufficiently small

where $b$ is the response vector and $g$ is the inverse link function (i.e. $g^{-1}$ is the link function).

(Note that this naming is consistent with the link above but opposite to Wikipedia's convention, where $g$ is the link function and $g^{-1}$ is the inverse link function.)
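
For concreteness, here is a minimal sketch of this iteration in NumPy, specialized to a Poisson GLM with log link (so $g=\exp$, $g'=\exp$ and $\text{var}(g(\eta))=g(\eta)$); the function name `irls_poisson` and the other identifiers are illustrative, not from the linked page:

```python
import numpy as np

def irls_poisson(A, b, tol=1e-8, max_iter=100):
    """The IRLS iteration above, specialized to a Poisson GLM with log
    link: g(eta) = exp(eta), so g'(eta) = exp(eta) and var(g(eta)) = exp(eta)."""
    x = np.zeros(A.shape[1])                # step 1: x_1 = 0
    for _ in range(max_iter):
        eta = A @ x                         # 2.1: linear predictor
        mu = np.exp(eta)                    # g(eta)
        z = eta + (b - mu) / mu             # 2.2: Newton correction (g'(eta) = mu)
        w = mu                              # 2.3: g'(eta)^2 / var(g(eta)) = mu^2/mu
        Aw = A * w[:, None]                 # rows of A scaled by w, i.e. W A
        x_new = np.linalg.solve(A.T @ Aw, Aw.T @ z)  # 2.4: weighted least squares
        if np.linalg.norm(x_new - x) < tol:  # 2.5: stopping test
            return x_new
        x = x_new
    return x
```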

But why does step 2.2 (the Newton-method correction) not simply set $z=g^{-1}(b)$ instead? The algorithm would obviously still have the same stationary point, and surely reach it faster?

Edit

So the algorithm I am proposing would be:

  1. let $z=g^{-1}(b)$
  2. let $\eta=z$
  3. for $j=1,2,...$ do
    3.1. reweighting: let $W = \text{diag}(g'(\eta)^2/\text{var}(g(\eta)))$
    3.2. solve linear least-squares: let $x_{j}=(A^TWA)^{-1}A^TWz$
    3.3. linear predictor: let $\eta=Ax_j$
    3.4. stop if $\|x_{j}-x_{j-1}\|$ is sufficiently small

and now only the weights and the linear least-squares solution are updated at each iteration; we aren't trying to reverse-engineer $g$ at the same time, since $g^{-1}$ is available. Or have I broken it?
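
Here is the same kind of sketch for this proposed variant, under the same Poisson/log-link assumptions (identifiers again illustrative); note that it needs every entry of $b$ to be strictly positive for $z=g^{-1}(b)=\log b$ to be defined:

```python
def proposed_variant(A, b, tol=1e-8, max_iter=100):
    """The fixed-z variant proposed above, for the same Poisson/log-link
    case. Requires b > 0 elementwise, since z = g^{-1}(b) = log(b)."""
    z = np.log(b)                           # step 1: z = g^{-1}(b), never updated
    eta = z.copy()                          # step 2: initial linear predictor
    x = np.zeros(A.shape[1])
    for _ in range(max_iter):
        mu = np.exp(eta)
        w = mu                              # 3.1: reweighting, exactly as before
        Aw = A * w[:, None]
        x_new = np.linalg.solve(A.T @ Aw, Aw.T @ z)  # 3.2: weighted LS on fixed z
        eta = A @ x_new                     # 3.3: linear predictor
        if np.linalg.norm(x_new - x) < tol:  # 3.4: stopping test
            return x_new
        x = x_new
    return x
```

Since $z$ is held fixed, each pass only re-solves the same weighted least-squares problem with updated weights.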

Museful
  • From your reference I understand $b$ to be the response vector, and therefore $z$ is intended to be some vector *of the form $z=Ax$*. Isn't the entire issue that there usually doesn't exist any vector for which $g(Ax)=b$? How, then, could your algorithm possibly converge at all? – whuber Jan 20 '20 at 18:36
  • @whuber Yes, of course; thank you. – Museful Jan 20 '20 at 18:47

0 Answers