Questions tagged [classification]

Statistical classification is the problem of identifying the sub-population to which new observations belong, where the identity of the sub-population is unknown, on the basis of a training set of data containing observations whose sub-population is known. Therefore these classifications will show a variable behavior which can be studied by statistics.

-- Wikipedia at https://en.wikipedia.org/wiki/Statistical_classification

6303 questions

275 votes

6 answers

What does AUC stand for and what is it?

I have searched high and low but have not been able to find out what AUC, as used in relation to prediction, stands for or means.

asked Jan 09 '15 at 10:35

josh


173 votes

4 answers

Choice of K in K-fold cross-validation

I've used $K$-fold cross-validation a few times now to evaluate the performance of some learning algorithms, but I've always been puzzled as to how I should choose the value of $K$. I've often seen and used a value of $K = 10$, but this seems…

machine-learning classification cross-validation

asked May 04 '12 at 03:52

Charles Menguy


169 votes

4 answers

Cohen's kappa in plain English

I am reading a data mining book that mentions the Kappa statistic as a means of evaluating the prediction performance of classifiers. However, I just can't understand it. I also checked Wikipedia, but it didn't help either:…

classification data-mining cohens-kappa

asked Jan 13 '14 at 19:14

Jack Twain


128 votes

5 answers

How does a Support Vector Machine (SVM) work?

How does a Support Vector Machine (SVM) work, and what differentiates it from other linear classifiers, such as the Linear Perceptron, Linear Discriminant Analysis, or Logistic Regression? * (* I'm thinking in terms of the underlying motivations for…

machine-learning classification svm statistical-learning

asked Feb 16 '12 at 13:25

tdc


120 votes

6 answers

Why are neural networks becoming deeper, but not wider?

In recent years, convolutional neural networks (or perhaps deep neural networks in general) have become deeper and deeper, with state-of-the-art networks going from 7 layers (AlexNet) to 1000 layers (Residual Nets) in the space of 4 years. The…

machine-learning classification neural-networks deep-learning conv-neural-network

asked Jul 09 '16 at 06:35

Karnivaurus


110 votes

4 answers

How do you calculate precision and recall for multiclass classification using a confusion matrix?

I wonder how to compute precision and recall using a confusion matrix for a multi-class classification problem. Specifically, an observation can only be assigned to its most probable class / label. I would like to compute: Precision = TP / (TP+FP)…

machine-learning classification precision-recall multi-class

asked Mar 04 '13 at 15:56

daiyue


109 votes

4 answers

Softmax vs Sigmoid function in Logistic classifier?

What decides the choice of function (Softmax vs Sigmoid) in a Logistic classifier? Suppose there are 4 output classes. Each of the above functions gives the probabilities of each class being the correct output. So which one to take for a…

machine-learning logistic classification softmax

asked Sep 06 '16 at 15:46

mach


102 votes

4 answers

Why isn't Logistic Regression called Logistic Classification?

Since Logistic Regression is a statistical classification model dealing with categorical dependent variables, why isn't it called Logistic Classification? Shouldn't the "Regression" name be reserved for models dealing with continuous dependent…

regression machine-learning logistic classification terminology

asked Dec 07 '14 at 18:44

Ismael Ghalimi


votes

5 answers

How to calculate Area Under the Curve (AUC), or the c-statistic, by hand

I am interested in calculating area under the curve (AUC), or the c-statistic, by hand for a binary logistic regression model. For example, in the validation dataset, I have the true value for the dependent variable, retention (1 = retained; 0 = not…

regression logistic classification roc auc

asked Apr 09 '15 at 17:53

Matt Reichenbach


votes

6 answers

What is the difference between a multiclass and a multilabel problem?

What is the difference between a multiclass problem and a multilabel problem?

classification clustering terminology multi-class multilabel

asked Jun 13 '11 at 05:35

Learner


votes

8 answers

How to compute precision/recall for multiclass-multilabel classification?

I'm wondering how to calculate precision and recall measures for multiclass multilabel classification, i.e. classification where there are more than two labels, and where each instance can have multiple labels?

machine-learning classification precision-recall multi-class

asked Jan 23 '12 at 12:54

Vam


votes

6 answers

Feature selection for "final" model when performing cross-validation in machine learning

I am getting a bit confused about feature selection and machine learning and I was wondering if you could help me out. I have a microarray dataset that is classified into two groups and has 1000s of features. My aim is to get a small number of…

machine-learning classification cross-validation feature-selection genetics

asked Sep 02 '10 at 10:25

danielsbrewer


votes

4 answers

How to produce a pretty plot of the results of k-means cluster analysis?

I'm using R to do K-means clustering. I'm using 14 variables to run K-means. What is a pretty way to plot the results of K-means? Are there any existing implementations? Does having 14 variables complicate plotting the results? I found something…

data-visualization classification k-means unsupervised-learning

asked Jun 25 '12 at 17:47

JEquihua


votes

5 answers

How to plot ROC curves in multiclass classification?

In other words, instead of having a two class problem I am dealing with 4 classes and still would like to assess performance using AUC.

classification roc

asked Aug 27 '10 at 01:56

CLOCK

votes

8 answers

When is unbalanced data really a problem in Machine Learning?

We have already had multiple questions about unbalanced data when using logistic regression, SVM, decision trees, bagging, and a number of other similar questions, which makes it a very popular topic! Unfortunately, each of the questions seems to be…

machine-learning classification predictive-models unbalanced-classes

asked Jun 02 '17 at 12:08

Tim

