Why is applying linear regression to a classification problem often not a great idea?

Asked Jul 30 '20 at 13:35

Active Jul 30 '20 at 13:41

Viewed 47 times

I have come across two reasons why we don't use linear regression for classification.

For example, suppose we want to predict whether a tumour is malignant or benign from a single feature x (tumour size), with y = 1 for malignant and y = 0 for benign. We can threshold the regression output hθ(x) at 0.5 as follows:

If hθ(x) >= 0.5, predict y = 1.
If hθ(x) < 0.5, predict y = 0.
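To make the thresholding rule concrete, here is a minimal sketch in Python with NumPy. The tumour sizes and labels are made-up toy values, and `fit_line` and `predict` are hypothetical helper names chosen for illustration; this is just an ordinary least-squares line followed by a cut at 0.5.

```python
import numpy as np

# Hypothetical toy data: tumour sizes (arbitrary units) and labels
# (0 = benign, 1 = malignant). These values are invented for illustration.
sizes = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])


def fit_line(x, y):
    """Least-squares fit of h(x) = theta0 + theta1 * x."""
    # np.polyfit returns coefficients from highest degree to lowest.
    theta1, theta0 = np.polyfit(x, y, deg=1)
    return theta0, theta1


def predict(theta, x, threshold=0.5):
    """Predict y = 1 where h(x) >= threshold, otherwise y = 0."""
    theta0, theta1 = theta
    h = theta0 + theta1 * np.asarray(x, dtype=float)
    return (h >= threshold).astype(int)


theta = fit_line(sizes, labels)
predictions = predict(theta, sizes)
print(predictions)
```

On this small, balanced toy set the thresholded line happens to reproduce every label, which is why the approach can look reasonable at first.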

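If we apply the same technique after adding more training data, two things can go wrong:

1) It may predict the wrong class, because the fitted line, and with it the point where hθ(x) crosses 0.5, can shift.
2) It may output hθ(x) > 1 or hθ(x) < 0, but we need y to be either 0 or 1.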
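Both problems can be reproduced with a small sketch, assuming made-up toy data and one extra, very large malignant tumour (size 30) as the additional training point. The variable names are hypothetical, and the numbers only illustrate the effect; they are not taken from any real dataset.

```python
import numpy as np

# Hypothetical toy data (0 = benign, 1 = malignant), invented for illustration.
sizes = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# "More training data": one very large malignant tumour.
sizes_more = np.append(sizes, 30.0)
labels_more = np.append(labels, 1)


def line_output(x_train, y_train, x_query):
    """Fit h(x) = theta0 + theta1 * x by least squares and evaluate it."""
    theta1, theta0 = np.polyfit(x_train, y_train, deg=1)
    return theta0 + theta1 * np.asarray(x_query, dtype=float)


h_base = line_output(sizes, labels, sizes)
h_more = line_output(sizes_more, labels_more, sizes_more)

# Problem 1: the extra point flattens the line, so the 0.5 crossing moves
# to the right and the malignant tumour of size 5 is now predicted benign.
pred_more = (h_more >= 0.5).astype(int)
print("predictions after adding the point:", pred_more)
print("true labels:                       ", labels_more)

# Problem 2: the raw outputs are not confined to [0, 1].
print("smallest output, original fit:", h_base.min())
print("largest output, extended fit: ", h_more.max())
```

The single added point is classified correctly itself, yet it drags the decision boundary far enough to flip a previously correct prediction, and the line's raw outputs fall below 0 at one end and above 1 at the other.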

Are there better reasons why we don't use linear regression for a classification problem?

edited Jul 30 '20 at 13:41

asked Jul 30 '20 at 13:35

Aswin Barath

There's actually something called a linear probability model that tries to model probabilities without applying the logistic regression (or probit) link function. https://en.wikipedia.org/wiki/Linear_probability_model – Dave Jul 30 '20 at 13:49
Do you mean linear regression or logistic regression? Linear regression isn't normally used for classification, but logistic regression is a workhorse (I would probably nearly always use it as a baseline model) of classification. – Adrian Keister Jul 30 '20 at 14:19
Does this answer your question? [Why do we use GLM?](https://stats.stackexchange.com/questions/104399/why-do-we-use-glm) The answer uses the linear probability model versus logistic regression as the example. – EdM Jul 30 '20 at 14:32
@AdrianKeister I meant why we use Logistic Regression over Linear Regression – Aswin Barath Jul 30 '20 at 15:28

0 Answers