I am trying to implement logistic regression using the following algorithm:
- fit a simple linear model $y \sim Xb_0$
- calculate $W = \frac{e^{Xb_0}}{(1+e^{Xb_0})^2}$
- calculate the working response $z = Xb_0 + y \cdot \frac{(1+e^{Xb_0})^2}{e^{Xb_0}} - (1+e^{Xb_0})$ (the exponentials, squares, products, and divisions are all element-wise here)
- calculate the information matrix $J = X^T W X$
- solve for a new coefficient vector $b_1 = J^{-1}X^T W z$
- if the difference between $b_0$ and $b_1$ is not small enough, assign $b_1$ to $b_0$ and repeat steps 2-5
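
To make the loop concrete, here is a minimal NumPy sketch of the steps above (illustrative only, not my actual C++ code; the function name `irls_logistic` and the tolerances are just placeholders). It computes $W$ and $z$ via the sigmoid $p = e^{Xb_0}/(1+e^{Xb_0})$, which is algebraically the same as the expressions above:

```python
import numpy as np

def irls_logistic(X, y, tol=1e-8, max_iter=100):
    # Step 1: initialize b0 from the least-squares fit of y ~ X b,
    # as described above (b0 = 0 is another common starting point).
    b0, *_ = np.linalg.lstsq(X, y, rcond=None)
    for _ in range(max_iter):
        eta = X @ b0
        p = 1.0 / (1.0 + np.exp(-eta))   # e^eta / (1 + e^eta)
        w = p * (1.0 - p)                # W = e^eta / (1 + e^eta)^2
        # z = eta + (y - p)/w, which expands to the formula above:
        # eta + y*(1+e^eta)^2/e^eta - (1+e^eta)
        z = eta + (y - p) / w
        J = X.T @ (w[:, None] * X)       # information matrix X^T W X
        b1 = np.linalg.solve(J, X.T @ (w * z))
        if np.max(np.abs(b1 - b0)) < tol:
            return b1
        b0 = b1                          # not converged: repeat steps 2-5
    return b0
```

Note the update solves $J b_1 = X^T W z$ directly rather than forming $J^{-1}$ explicitly, which is cheaper and numerically safer.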
This sometimes works, but sometimes $b_0$ and $b_1$ do not converge. I suspect the initial value of $b_0$ is not being chosen well. Any suggestions? I can also post the C++ code here, but it may not be very helpful.