I am trying to implement logistic regression where the label space is $\{-1,+1\}$ instead of the usual $\{0,1\}$. I know I could recode the labels and solve it as a standard 0/1 problem, but I wanted to see if I can derive this variant from first principles (via MLE).
The negative log-likelihood I get (to be minimized) is: $\ell(\theta) = \sum_{i=1}^{m} \log\left(1+\exp(-y^{(i)}\theta^{T}x^{(i)})\right)$, where $\{\dots, (x^{(i)},y^{(i)}), \dots\}$ are the $m$ training examples and each $x^{(i)}$ is an $n$-dimensional vector.
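For reference, this comes from modeling the per-example probability for $y \in \{-1,+1\}$ as the logistic function of the margin and taking the negative log of the likelihood:

$$ P(y \mid x; \theta) = \frac{1}{1+\exp(-y\,\theta^{T}x)}, \qquad \ell(\theta) = -\sum_{i=1}^{m} \log P(y^{(i)} \mid x^{(i)}; \theta). $$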
So now I try to find the gradient of this and I get: $\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \frac{\mu_i\, y^{(i)} x_j^{(i)}}{1+\mu_i}$, where $j = 1,\dots,n$ indexes the features and $\mu_i = \exp(-y^{(i)}\theta^{T}x^{(i)})$.
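In vectorized form, which is what my code below implements (with $\mu$ denoting the $m$-vector of the $\mu_i$, and the multiplication and division taken element-wise), this is:

$$ \nabla_{\theta}\,\ell(\theta) = X^{T}\!\left(\frac{\mu \odot y}{1+\mu}\right). $$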
However, when I try to solve this with Matlab's fminunc, I do not get any updates on my initial weight vector. My Matlab code for the gradient is:
temp1 = exp((-y).*(X*w));        % mu_i = exp(-y_i * theta' * x_i), one entry per example
temp2 = (temp1./(1+temp1)).*y;   % per-example factor mu_i * y_i / (1 + mu_i)
grad = X'*temp2;                 % sum over the m examples
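For completeness, this is roughly how I wire it into fminunc (the function name logisticNLL, the initial point, and the option values here are illustrative, not my exact script):

function [J, grad] = logisticNLL(w, X, y)
    % Negative log-likelihood and its gradient, as derived above
    margins = y .* (X * w);            % m-vector of y_i * theta' * x_i
    J = sum(log(1 + exp(-margins)));   % objective to minimize
    temp1 = exp(-margins);             % mu_i
    temp2 = (temp1 ./ (1 + temp1)) .* y;
    grad = X' * temp2;                 % same gradient as the snippet above
end

options = optimset('GradObj', 'on', 'MaxIter', 400);
w0 = zeros(size(X, 2), 1);             % initial weight vector
[w, J] = fminunc(@(w) logisticNLL(w, X, y), w0, options);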
Can somebody point out what I am doing wrong here?