Logistic Regression has two possible formulations depending on how we select the target variable: $y \in \{0,1\}$ or $y \in \{-1,1\}$.

A related question discusses the derivation of the Hessian of the loss function when $y \in \{0,1\}$. What follows is about deriving the Hessian when $y \in \{-1,1\}$.

The loss function can be written as

$$\mathcal{L}(\beta) = \frac{-1}{n} \sum_{i=1}^{n} \log \sigma(y_i\beta^{T}x_i),$$

where $y_i \in \{-1, 1\}$, $x_i \in \mathbb{R}^p$, $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid function, and $n$ is the number of examples in $X$.

I'm looking to calculate the Hessian of this loss.

$\nabla \mathcal{L}(\beta)$ can be calculated as follows:

Let $l_i(\beta) = - \log \sigma(y_iz_i),$ where $z_i = \beta^{T}x_i$,

$$\frac{\partial l_i(\beta)}{\partial \beta} = \frac{-1}{\sigma(y_iz_i)}\,\sigma(y_iz_i)(1 - \sigma(y_iz_i))\,\frac{\partial (y_iz_i)}{\partial \beta}$$

$$\frac{\partial l_i(\beta)}{\partial \beta} = -(1 - \sigma(y_iz_i))\, y_ix_i$$

$$\frac{\partial l_i(\beta)}{\partial \beta} = ( \sigma(y_i\beta^{T}x_i) - 1 )\, y_ix_i$$

Averaging over all the $n$ examples:

$$ \nabla \mathcal{L}(\beta) = \frac{1}{n} \sum_{i=1}^{n} ( \sigma(y_i\beta^{T}x_i) - 1 )\, y_ix_i $$
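
As a sanity check on this gradient, a minimal NumPy sketch can compare it with a central finite-difference estimate of the loss. The function names, the random data, and the step size `eps` are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def loss(beta, X, y):
    # L(beta) = -(1/n) * sum_i log sigma(y_i * beta^T x_i), with y_i in {-1, 1}
    margins = y * (X @ beta)
    return -np.mean(np.log(sigmoid(margins)))

def grad(beta, X, y):
    # (1/n) * sum_i (sigma(y_i * beta^T x_i) - 1) * y_i * x_i
    margins = y * (X @ beta)
    coeffs = (sigmoid(margins) - 1.0) * y
    return X.T @ coeffs / X.shape[0]

def numerical_grad(f, beta, eps=1e-6):
    # Central differences, one coordinate of beta at a time
    g = np.zeros_like(beta)
    for j in range(beta.size):
        step = np.zeros_like(beta)
        step[j] = eps
        g[j] = (f(beta + step) - f(beta - step)) / (2 * eps)
    return g

# Hypothetical random data: n examples with p features each
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
y = rng.choice([-1.0, 1.0], size=n)
beta = rng.normal(size=p)

analytic = grad(beta, X, y)
numeric = numerical_grad(lambda b: loss(b, X, y), beta)
max_abs_diff = np.max(np.abs(analytic - numeric))
print(max_abs_diff)
```

If the two agree to within finite-difference error, the same kind of check could be reused for a candidate Hessian by differencing `grad` instead of `loss`.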

I'm not sure how to proceed with calculating $\nabla^{2} \mathcal{L}(\beta)$. Any pointer is appreciated.
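
One possible way to continue, as a sketch rather than a checked answer: differentiate the per-example gradient $(\sigma(y_iz_i) - 1)\, y_ix_i$ once more with respect to $\beta$, using $\sigma'(t) = \sigma(t)(1 - \sigma(t))$ and $y_i^2 = 1$. The diagonal matrix $D$ below is notation introduced here for illustration, and $X$ is assumed to hold the $x_i^{T}$ as its rows.

$$\frac{\partial^2 l_i(\beta)}{\partial \beta\, \partial \beta^{T}} = \sigma(y_iz_i)(1 - \sigma(y_iz_i))\, y_i^2\, x_ix_i^{T} = \sigma(y_iz_i)(1 - \sigma(y_iz_i))\, x_ix_i^{T}$$

$$\nabla^{2} \mathcal{L}(\beta) = \frac{1}{n} \sum_{i=1}^{n} \sigma(y_i\beta^{T}x_i)(1 - \sigma(y_i\beta^{T}x_i))\, x_ix_i^{T} = \frac{1}{n} X^{T} D X, \qquad D_{ii} = \sigma(y_i\beta^{T}x_i)(1 - \sigma(y_i\beta^{T}x_i))$$

If this is right, each $D_{ii}$ lies in $(0, \tfrac{1}{4}]$, so the Hessian would be positive semidefinite, and because $\sigma(-t) = 1 - \sigma(t)$ it would coincide with the Hessian of the $\{0,1\}$ formulation.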

akilat90