102

Since Logistic Regression is a statistical classification model dealing with categorical dependent variables, why isn't it called Logistic Classification? Shouldn't the "Regression" name be reserved for models dealing with continuous dependent variables?

Sycorax
Ismael Ghalimi
  • Logistic regression belongs to the GLM family of models. – Stéphane Laurent Dec 07 '14 at 19:04
  • You can use it to regress probabilities. – Emre Dec 07 '14 at 22:38
  • While logistic regression can certainly be used for classification by introducing a threshold on the probabilities it returns, that's hardly its only use - or even its primary use. It was developed for - and continues to be used for - regression purposes that have nothing to do with classification. I'd argue that this is still easily what it's mostly used for, but I suppose it depends on what you look at. – Glen_b Dec 08 '14 at 01:07
  • You might find [this paper](http://papers.tinbergen.nl/02119.pdf) on the development of logistic regression interesting, particularly since it does give some sense of the kinds of problems that it is used for as a regression technique. – Glen_b Dec 08 '14 at 01:14

4 Answers

128

Logistic regression is emphatically not a classification algorithm on its own. It is only a classification algorithm in combination with a decision rule that dichotomizes the predicted probabilities of the outcome. Logistic regression is a regression model because it estimates the probability of class membership as a (transformation of a) linear combination of the features.
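To make the distinction concrete, here is a minimal sketch, assuming Python with numpy and scikit-learn and using synthetic data: the fitted model only returns probabilities, and a classifier appears only once a decision rule (here an arbitrary 0.5 threshold) is bolted on.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                                    # two synthetic features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

probs = model.predict_proba(X)[:, 1]   # the regression output: estimated P(y = 1 | x)
labels = (probs >= 0.5).astype(int)    # the classifier: the model plus an arbitrary decision rule
```

Moving the threshold away from 0.5 (say, to reflect asymmetric costs) changes the classifier without refitting the model, which is one reason to keep the regression and the decision rule conceptually separate.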

Frank Harrell has posted a number of answers on this website enumerating the pitfalls of regarding logistic regression as a classification algorithm.

If I recall correctly, he once pointed me to his book on regression strategies for more elaboration on these (and more!) points, but I can't seem to find that particular post.

Sycorax
  • If that's the case, all (or most) classifiers predict the probability of belonging to a class first (as far as I know) and then transform that probability into classes. Don't they? – Outlier Dec 10 '14 at 06:51
  • @Outlier Counterexample: SVM doesn't compute class probabilities at all, it just measures the distance between an observation and a hyperplane (see the sketch after these comments). – Sycorax Dec 10 '14 at 12:59
  • @Outlier In ML these are called probabilistic classifiers; trees and random forests are not, xgboost is (at least with logloss). – seanv507 May 25 '19 at 06:46
  • @SycoraxsaysReinstateMonica So is there **any** classification algorithm in the world? SVMs compute distance from the class boundary, neural networks compute some other continuous function... – Igor F. Dec 19 '19 at 11:56
  • @IgorF. Any algorithm with a decision rule discretizing the output as classes is a classifier. Logistic regression is properly named because it's a regression of class probabilities. – Sycorax Dec 19 '19 at 13:09
  • @SycoraxsaysReinstateMonica Let me rephrase the question: can you name classification algorithms which, in the training phase, use discretized outputs for adjusting the parameters? For all others, which use continuous values (e.g. SVM or neural networks), I don't see a justification for calling them "classification" algorithms any more than I see it for logistic regression. – Igor F. Dec 19 '19 at 14:09
  • @IgorF. A decision tree adjusts its parameters (chooses feature splits) on the basis of discretized outputs (class labels) by choosing the largest information gain. – Sycorax Dec 19 '19 at 14:11
  • @SycoraxsaysReinstateMonica OK. And, I guess, a (k-)nearest-neighbors algorithm can also be called a "classifier", right? But you agree that SVM and neural networks are actually regression algorithms(?) – Igor F. Dec 19 '19 at 14:55
  • @IgorF. KNN doesn't seem to answer your question because it doesn't have parameters optimized during training. If we read your question as "are there any models that maximize the number of correct classifications directly?" then I think nothing qualifies, because the number of correct classifications is not differentiable and is a combinatorial problem: each datum could belong to any class, and an exhaustive search is expensive. Not sure about SVM and NNs. I'd be surprised if all algorithms had to be exactly one of "classifier" and "regressor." – Sycorax Dec 19 '19 at 15:37
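As a rough sketch of the SVM counterexample mentioned above (again Python with scikit-learn and synthetic data), a linear SVM exposes only a signed score proportional to an observation's distance from the separating hyperplane, whereas logistic regression exposes estimated probabilities:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

svm = SVC(kernel="linear").fit(X, y)
lr = LogisticRegression().fit(X, y)

print(svm.decision_function(X[:3]))   # signed scores, proportional to distance from the hyperplane
print(lr.predict_proba(X[:3])[:, 1])  # estimated P(y = 1 | x) for the same three observations
```
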
15

Abstractly, regression is the problem of calculating a conditional expectation $E[Y|X=x]$. The form this expectation takes depends on the assumptions about how the data were generated:

  • Assuming $(Y|X=x)$ to be normally distributed yields classical linear regression.
  • Assuming a Poisson distribution yields Poisson regression.
  • Assuming a Bernoulli distribution yields logistic regression.

The term "regression" has also been used more generally than this, including approaches like quantile regression, which estimates a given quantile of $(Y|X=x)$.
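A minimal sketch of that correspondence, assuming Python with numpy and statsmodels (the coefficients and data are made up): the same GLM machinery estimates $E[Y|X=x]$ under each of the three distributional assumptions listed above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = sm.add_constant(x)                          # design matrix with an intercept

# Normal response -> classical linear regression
y_gauss = 1.0 + 2.0 * x + rng.normal(size=100)
gauss_fit = sm.GLM(y_gauss, X, family=sm.families.Gaussian()).fit()

# Bernoulli (0/1) response -> logistic regression
p = 1 / (1 + np.exp(-(1.0 + 2.0 * x)))
y_bern = rng.binomial(1, p)
logit_fit = sm.GLM(y_bern, X, family=sm.families.Binomial()).fit()

# Count response -> Poisson regression
y_count = rng.poisson(np.exp(0.5 + 0.8 * x))
pois_fit = sm.GLM(y_count, X, family=sm.families.Poisson()).fit()

# Each fit estimates E[Y | X = x]; only the assumed distribution (and its link) differs.
print(logit_fit.params)
```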

Chad Scherrer
  • 271
  • 1
  • 2
2

> The U.S. Weather Service has always phrased rain forecasts as probabilities. I do not want a classification of “it will rain today.” There is a slight loss/disutility of carrying an umbrella, and I want to be the one to make the tradeoff.

– Dr. Frank Harrell, https://www.fharrell.com/post/classification/

Classification is when you make a concrete determination of what category something is a part of. Binary classification involves two categories, and by the law of the excluded middle, that means binary classification is for determining whether something “is” or “is not” part of a single category. There either are children playing in the park today (1), or there are not (0).

Although the variable you are targeting in logistic regression is categorical, logistic regression does not actually classify individual cases for you: it just gives you probabilities (or log-odds, on the logit scale). The only way logistic regression can actually classify anything is if you apply a rule to the probability output. For example, you may round probabilities greater than or equal to 50% to 1, and probabilities less than 50% to 0, and that’s your classification.
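As a small numpy illustration with made-up logit values (not taken from any fitted model), that 50% rule on probabilities is exactly a threshold of zero on the log-odds scale:

```python
import numpy as np

log_odds = np.array([-2.0, -0.1, 0.0, 0.3, 1.5])   # hypothetical outputs on the logit scale
probs = 1 / (1 + np.exp(-log_odds))                # inverse logit: probabilities in (0, 1)

labels_from_probs = (probs >= 0.5).astype(int)     # the 50% rounding rule
labels_from_logits = (log_odds >= 0).astype(int)   # the equivalent rule on the logit scale
assert (labels_from_probs == labels_from_logits).all()
```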

If you want to read more, this post goes into more detail: https://ryxcommar.com/2020/06/27/why-do-so-many-practicing-data-scientists-not-understand-logistic-regression/

Devious
-1

Apart from the good answers already provided, another view is that logistic regression predicts probabilities, which are continuous values ranging from 0 to 1.
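A tiny sketch of that point with hypothetical coefficients (assuming numpy): however large or small the linear predictor gets, the logistic transform maps it to a value strictly between 0 and 1.

```python
import numpy as np

beta0, beta1 = -1.0, 2.5                          # hypothetical coefficients
x = np.linspace(-3, 3, 7)                         # the linear predictor itself is unbounded
linear_predictor = beta0 + beta1 * x
probabilities = 1 / (1 + np.exp(-linear_predictor))
print(probabilities)                              # every value lies strictly inside (0, 1)
```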


krish___na
  • What happens if you insert the outcome y of your linear model into another linear model (you can think of it as an extreme case of a sigmoid with a very gentle gradient)? Will that work for classification? If not, why? – Zzy1130 Feb 05 '20 at 08:04