
Recently, a project I'm involved in used a linear perceptron for multiple regression with 21 predictors, trained with stochastic gradient descent. How is this different from OLS linear regression?

Simon Kuang
  • The `Perceptron` class you link to is for a classifier (binary output) rather than a regressor (continuous output). Is that the actual code you used? If so, that's the difference. :) – Danica Mar 30 '15 at 03:03
  • @Dougal, it still counts among the GLMs though: http://scikit-learn.org/stable/supervised_learning.html#supervised-learning – Simon Kuang Mar 30 '15 at 03:30
  • @Dougal: suppose you had a (G)LM that you optimized to L2 using [`SGDRegressor`](http://goo.gl/QmI6bM); would this be equivalent to linear regression? – Simon Kuang Mar 30 '15 at 03:34
  • Yes, some GLMs are classifiers. If you used `SGDRegressor(loss='squared_loss', penalty='none')`, that is OLS. – Danica Mar 30 '15 at 03:57

1 Answer


scikit-learn's `Perceptron` class (equivalent to `SGDClassifier(loss="perceptron", penalty=None, learning_rate="constant", eta0=1)`) uses the following objective function: $$\frac{1}{N} \sum_{i=1}^N \max(0, - y_i w^T x_i).$$ In this case, $y_i \in \{-1, 1\}$. If $w^T x_i$ has the right sign, it doesn't incur any loss; otherwise, the loss grows linearly in $|w^T x_i|$. The perceptron in particular uses a fixed learning rate, which can also make the optimization behave oddly (for instance, the weights need not settle down when the classes are not linearly separable).
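To make the objective concrete, here is a minimal NumPy sketch of that loss together with a constant-step SGD loop. It is not scikit-learn's implementation: the function names, the toy data, the absence of an intercept, and the choice to update when the margin is exactly zero (the classical perceptron rule, needed here because the weights start at zero) are all assumptions made for illustration.

```python
import numpy as np

def perceptron_loss(w, X, y):
    """Mean of max(0, -y_i * w^T x_i) over the rows of X; y must be in {-1, +1}."""
    margins = y * (X @ w)
    return np.mean(np.maximum(0.0, -margins))

def perceptron_sgd(X, y, eta0=1.0, n_epochs=10, seed=0):
    """Constant-step SGD on the perceptron loss (no intercept, no penalty)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):
            # Update only on samples that are misclassified or sit exactly on
            # the boundary; correctly classified samples contribute no gradient.
            if y[i] * (X[i] @ w) <= 0:
                w += eta0 * y[i] * X[i]
    return w

# Hypothetical toy data, linearly separable through the origin.
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = perceptron_sgd(X, y)
print(w, perceptron_loss(w, X, y))
```

Note that $w = 0$ also gives zero loss under this objective, which is one reason the objective alone does not pin down a useful solution.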

Least squares regression, by contrast, uses $$\frac{1}{N} \sum_{i=1}^N (y_i - w^T x_i)^2.$$ Here $y_i$ can be any real number; you can give it classification targets in $\{-1, 1\}$ if you want, but it's not going to give you a very good model. You can optimize this with `SGDRegressor(loss="squared_loss", penalty=None)` if you'd like (the loss is named `"squared_error"` in scikit-learn 1.0 and later).
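As a rough check that SGD on this objective targets the same solution as OLS, the sketch below compares a hand-written SGD loop against the closed-form least squares fit from `np.linalg.lstsq`. The synthetic data, step size, epoch count, and function name are assumptions chosen for illustration, and there is no intercept, matching the formula above.

```python
import numpy as np

# Hypothetical synthetic regression problem with a known weight vector.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=n)

# Closed-form OLS solution (minimizes the sum of squared residuals).
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

def sgd_least_squares(X, y, eta=0.01, n_epochs=50, seed=1):
    """Constant-step SGD on the squared loss (no intercept, no penalty)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):
            residual = y[i] - X[i] @ w
            # Gradient of (y_i - w^T x_i)^2 with respect to w is -2 * residual * x_i.
            w += eta * 2.0 * residual * X[i]
    return w

w_sgd = sgd_least_squares(X, y)
print("OLS:", w_ols)
print("SGD:", w_sgd)
```

With a constant step size, SGD only gets close to the OLS weights rather than matching them exactly; how close depends on the step size and the noise level.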

The two define fundamentally different models: the perceptron predicts a binary class label with $\mathrm{sign}(w^T x_i)$, whereas linear regression predicts a real value with $w^T x_i$. Another answer (linked from the original post) goes into why trying to solve a classification problem with a regression algorithm can be problematic.
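The difference in prediction rules can be seen by applying both to the same weights. This is only an illustrative sketch: the function names, weights, and inputs are made up, and mapping a score of exactly zero to $+1$ is a tie-breaking assumption.

```python
import numpy as np

def predict_perceptron(w, X):
    """Binary class label in {-1, +1} from the sign of w^T x."""
    # np.where instead of np.sign so a score of exactly 0 maps to +1 (assumption).
    return np.where(X @ w >= 0, 1, -1)

def predict_linear(w, X):
    """Real-valued prediction w^T x, as in linear regression."""
    return X @ w

# Hypothetical weights and inputs.
w = np.array([0.5, -1.0])
X_new = np.array([[2.0, 0.5], [1.0, 2.0], [4.0, 1.0]])

labels = predict_perceptron(w, X_new)
values = predict_linear(w, X_new)
print(labels)
print(values)
```

The same linear score $w^T x$ feeds both models; the perceptron discards its magnitude and keeps only the sign.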

Danica