
I am studying some Machine Learning concepts. I am looking at (multiclass) logistic regression and the logistic regression classifier, and I need to learn how to change it to penalize large weights.

I have looked here: http://en.wikipedia.org/wiki/Logistic_regression but it doesn't give me the information I want. I checked Alpaydin's, Tom Mitchell's and Duda's books but couldn't find anything like that (maybe I don't know that there is another name for it). I could see some explanation in Bishop's Pattern Recognition book, but I don't have full access to the book to read it.

My question is: what is a logistic regression classifier, and how can I penalize large weights in it?

kamaci
  • argh. By weight you mean the values of the estimated coefficients (e.g. the $\beta_1$ in the definition section of wiki article)? – user603 Nov 28 '12 at 23:46
  • @user603 actually they are $\lambda$, $\sigma$, $w^2$; however, I don't know those parameters because I don't have any idea about the algorithm. – kamaci Nov 29 '12 at 06:20
  • Possible duplicate of [What is the difference between linear regression and logistic regression?](https://stats.stackexchange.com/questions/29325/what-is-the-difference-between-linear-regression-and-logistic-regression) – kjetil b halvorsen Apr 28 '17 at 22:06
  • 2
    Possible duplicate of [Logistic regression using penalized likelihood (lasso?) in Matlab/R](https://stats.stackexchange.com/questions/91174/logistic-regression-using-penalized-likelihood-lasso-in-matlab-r) – Firebug Apr 29 '17 at 15:47
  • I think this question is so unclear that I can't tell what it is a duplicate of. So, I'm voting to close for that reason. – Peter Flom Apr 30 '17 at 11:58

1 Answer


For your first question, "What is a logistic regression classifier?", I don't really know what kind of answer you want.

But if you have data whose labels fall into categories, you can use logistic regression. Its cost function is

$$ J(\Theta ) = -[\frac{1}{m}\sum_{i=1}^{m}y^{(i)} log( h_{\Theta }(x^{(i)})) + (1-y^{(i)}) log (1-h_{\Theta }(x^{(i)}))] $$

with $h_{\Theta }(x) = \frac{1}{1+\exp(-\Theta ^{T}x)}$ and with your data being $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$

You need to minimize $J(\Theta )$ using gradient descent to find good enough parameters ${\Theta }$.
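The minimization above can be sketched in a few lines of NumPy. This is a minimal illustration of batch gradient descent on the unregularized cost, not production code; the toy data, the learning rate `alpha`, and the iteration count are all arbitrary choices for the example.

```python
import numpy as np

def sigmoid(z):
    # h_theta(x) = 1 / (1 + exp(-theta^T x))
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta) = -(1/m) * sum[ y*log(h) + (1-y)*log(1-h) ]
    m = len(y)
    h = sigmoid(X @ theta)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient_descent(X, y, alpha=0.1, iters=5000):
    # Repeatedly step opposite the gradient of J(theta)
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        h = sigmoid(X @ theta)
        theta -= alpha * (X.T @ (h - y)) / m
    return theta

# Toy data: first column of ones is the intercept term
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.5]])
y = np.array([0, 0, 1, 1])

theta = gradient_descent(X, y)
preds = (sigmoid(X @ theta) >= 0.5).astype(int)
```

On this linearly separable toy set the fitted model classifies all four points correctly.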

If you want to penalize large weights, you can use the same cost function with a regularization term added.

Try to find the course from Andrew Ng on Machine Learning, and more specifically the Logistic Regression part.

(I wrote the whole thing quickly, I hope I didn't forget anything.) The regularized cost is $$ J(\Theta ) = -[\frac{1}{m}\sum_{i=1}^{m}y^{(i)} \log( h_{\Theta }(x^{(i)})) + (1-y^{(i)}) \log (1-h_{\Theta }(x^{(i)}))] + \frac{\lambda}{2m} \sum_{j=1}^{n}\Theta_{j}^{2} $$ with ${\lambda}$ controlling how much penalty you want to apply to your model.
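The extra term only changes the gradient slightly: each penalized weight picks up an additional $\frac{\lambda}{m}\Theta_j$. A sketch of one regularized update step, assuming the common convention that the intercept $\Theta_0$ is not penalized (the values of `alpha` and `lam` here are arbitrary for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_step(theta, X, y, alpha, lam):
    """One gradient-descent step on the L2-regularized cost.
    The intercept theta[0] is conventionally left unpenalized."""
    m = len(y)
    h = sigmoid(X @ theta)
    grad = (X.T @ (h - y)) / m
    # Derivative of (lambda / 2m) * sum(theta_j^2) for j >= 1
    grad[1:] += (lam / m) * theta[1:]
    return theta - alpha * grad

# Toy data: same shape as before, intercept column of ones
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.5]])
y = np.array([0, 0, 1, 1])

theta_plain = np.zeros(2)
theta_reg = np.zeros(2)
for _ in range(2000):
    theta_plain = regularized_step(theta_plain, X, y, 0.1, 0.0)
    theta_reg = regularized_step(theta_reg, X, y, 0.1, 10.0)
```

After training, the weight fitted with $\lambda = 10$ is smaller in magnitude than the unregularized one, which is exactly the "penalize large weights" effect being asked about.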

ThiS
  • Thanks for your explanation. I am new to that kind of topic. Could you explain what you mean by the Theta parameter? – kamaci Dec 04 '12 at 14:01
  • Look at this pdf, http://cs229.stanford.edu/notes/cs229-notes1.pdf Should be good to read it from the start to get a sense of the topic. And if you still have question don't hesitate. – ThiS Dec 04 '12 at 14:16
  • I am voting up your answer. Thanks for the source document I will check it and return back. – kamaci Dec 04 '12 at 14:17