Questions tagged [accuracy]

Accuracy of an estimator is the degree of closeness of the estimates to the true value. For a classifier, accuracy is the proportion of correct classifications. (This second usage is not good practice. See the tag wiki for a link to further information.)

Accuracy of an estimator is the degree of closeness of the estimates to the estimand. Accuracy of a forecast rule is the degree of closeness of the forecasts to the corresponding realizations. Accuracy can be contrasted with precision; accuracy is about bias while precision is about variability.

Given a set of estimates or forecasts, the estimator or forecast rule that generated them can be said to be accurate if the average of the set is close to the estimand or the realization, respectively. Meanwhile, the estimator or forecast rule can be said to be precise if the values are close to each other (little scatter). The two concepts are independent of each other, so a particular estimator or forecast rule can be accurate, precise, both, or neither. Although the words precision and accuracy can be synonymous in colloquial use, they are deliberately contrasted in the context of the scientific method.

For example, lack of accuracy (large bias) may result from a systematic error. Eliminating the systematic error improves accuracy but does not change precision. Meanwhile, lack of precision (large variability) may result from a small sample on which the estimation or forecasting is based. Increasing the sample size alone may improve precision but not accuracy.

Statistical literature may prefer the terms bias and variability instead of accuracy and precision: bias is the amount of inaccuracy and variability is the amount of imprecision.

(Loosely based on Wikipedia's article "Accuracy and precision".)

Accuracy is not a good performance measure for classifiers.
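
A rough numerical sketch of the contrast described above (a hypothetical simulation using numpy; the estimand, bias, and sample sizes are made up): a systematic error shifts the average of the estimates away from the estimand, while a small sample size only increases their spread.

    import numpy as np

    rng = np.random.default_rng(0)
    true_mean = 10.0                     # the estimand

    def sample_means(n, bias=0.0, reps=10_000):
        # Distribution of the sample mean for samples of size n,
        # optionally contaminated by a systematic error (bias).
        draws = rng.normal(true_mean + bias, 2.0, size=(reps, n))
        return draws.mean(axis=1)

    cases = {
        "accurate & precise  (n=100, no systematic error)": sample_means(100, bias=0.0),
        "accurate, imprecise (n=5,   no systematic error)": sample_means(5, bias=0.0),
        "inaccurate, precise (n=100, systematic error)":    sample_means(100, bias=1.5),
    }
    for name, est in cases.items():
        print(f"{name}: bias = {est.mean() - true_mean:+.3f}, spread (sd) = {est.std():.3f}")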

757 questions
190 votes, 10 answers

Why is accuracy not the best measure for assessing classification models?

This is a general question that was asked indirectly multiple times in here, but it lacks a single authoritative answer. It would be great to have a detailed answer to this for the reference. Accuracy, the proportion of correct classifications among…
Tim • 108,699 • 20 • 212 • 390
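
The thread above collects the arguments in detail; the most common one can be shown with a toy sketch (hypothetical class proportions): under a 95/5 class split, a rule that ignores the inputs and always predicts the majority class still reaches 95% accuracy.

    import numpy as np

    y = np.array([0] * 950 + [1] * 50)   # imbalanced labels: 95% negatives
    y_hat = np.zeros_like(y)             # degenerate rule: always predict the majority class

    accuracy = (y == y_hat).mean()
    print(f"accuracy = {accuracy:.2%}")  # 95.00%, although no positive case is ever detected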
85 votes, 11 answers

What is the best way to remember the difference between sensitivity, specificity, precision, accuracy, and recall?

Despite having seen these terms 502847894789 times, I cannot for the life of me remember the difference between sensitivity, specificity, precision, accuracy, and recall. They're pretty simple concepts, but the names are highly unintuitive to me,…
Jessica • 1,781 • 2 • 15 • 17
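
For quick reference, a sketch of the definitions the question asks about, computed from the four cells of a binary confusion matrix (the counts below are made up):

    # Confusion-matrix cells for a binary problem (hypothetical counts)
    TP, FP, TN, FN = 40, 10, 35, 15

    sensitivity = TP / (TP + FN)                   # a.k.a. recall, true positive rate
    specificity = TN / (TN + FP)                   # true negative rate
    precision   = TP / (TP + FP)                   # positive predictive value
    accuracy    = (TP + TN) / (TP + FP + TN + FN)  # proportion of correct classifications
    recall      = sensitivity                      # the same quantity under another name

    print(sensitivity, specificity, precision, accuracy, recall)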
69 votes, 1 answer

What are the shortcomings of the Mean Absolute Percentage Error (MAPE)?

The Mean Absolute Percentage Error (MAPE) is a common accuracy or error measure for time series or other predictions, $$ \text{MAPE} = \frac{100}{n}\sum_{t=1}^n\frac{|A_t-F_t|}{A_t}\%,$$ where $A_t$ are actuals and $F_t$ corresponding forecasts or…
Stephan Kolassa • 95,027 • 13 • 197 • 357
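
A minimal sketch of the MAPE formula quoted in the excerpt (assuming numpy arrays of actuals and forecasts; the measure assumes positive actuals and is undefined when any $A_t$ is zero):

    import numpy as np

    def mape(actuals, forecasts):
        # Mean Absolute Percentage Error, in percent; actuals must be positive.
        actuals = np.asarray(actuals, dtype=float)
        forecasts = np.asarray(forecasts, dtype=float)
        return 100.0 * np.mean(np.abs(actuals - forecasts) / actuals)

    print(mape([100, 80, 120], [110, 70, 125]))   # about 8.9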
62 votes, 3 answers

F1/Dice-Score vs IoU

I was confused about the differences between the F1 score, Dice score and IoU (intersection over union). By now I found out that F1 and Dice mean the same thing (right?) and IoU has a very similar formula to the other two. F1 / Dice:…
pietz • 723 • 1 • 6 • 6
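
A small sketch (hypothetical predicted and true positive sets, e.g. pixels of a segmentation mask) showing that F1 and Dice are the same quantity and how IoU (the Jaccard index) relates to them through Dice = 2·IoU / (1 + IoU):

    # Hypothetical index sets of predicted and true positives
    pred = {1, 2, 3, 4, 5, 6}
    true = {4, 5, 6, 7, 8}

    tp = len(pred & true)                   # 3
    fp = len(pred - true)                   # 3
    fn = len(true - pred)                   # 2

    f1_dice = 2 * tp / (2 * tp + fp + fn)   # F1 and Dice share this formula: 6/11
    iou = tp / (tp + fp + fn)               # Jaccard index: 3/8

    print(f1_dice, iou, 2 * iou / (1 + iou))   # the last value equals f1_dice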
56 votes, 5 answers

Training a decision tree against unbalanced data

I'm new to data mining and I'm trying to train a decision tree against a data set which is highly unbalanced. However, I'm having problems with poor predictive accuracy. The data consists of students studying courses, and the class variable is the…
chrisb • 715 • 1 • 7 • 8
43 votes, 6 answers

Why do I get a 100% accuracy decision tree?

I'm getting a 100% accuracy for my decision tree. What am I doing wrong? This is my code: import pandas as pd import json import numpy as np import sklearn import matplotlib.pyplot as plt data =…
Nadjla • 441 • 1 • 4 • 4
33 votes, 5 answers

Is an overfitted model necessarily useless?

Assume that a model has 100% accuracy on the training data, but 70% accuracy on the test data. Is the following argument true about this model? It is obvious that this is an overfitted model. The test accuracy can be enhanced by reducing the…
Hossein • 3,170 • 1 • 16 • 32
30 votes, 2 answers

Interpretation of mean absolute scaled error (MASE)

Mean absolute scaled error (MASE) is a measure of forecast accuracy proposed by Hyndman & Koehler (2006). $$MASE=\frac{MAE}{MAE_{in-sample, \, naive}}$$ where $MAE$ is the mean absolute error produced by the actual forecast; while $MAE_{in-sample,…
Richard Hardy • 54,375 • 10 • 95 • 219
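
A minimal sketch of the MASE formula from the excerpt, for non-seasonal data: the out-of-sample MAE is scaled by the in-sample MAE of the one-step naive (random-walk) forecast (the series below is made up).

    import numpy as np

    def mase(insample, actuals, forecasts):
        # MAE of the forecasts, scaled by the in-sample MAE of the one-step naive forecast.
        insample, actuals, forecasts = map(np.asarray, (insample, actuals, forecasts))
        mae_forecast = np.mean(np.abs(actuals - forecasts))
        mae_naive = np.mean(np.abs(insample[1:] - insample[:-1]))
        return mae_forecast / mae_naive

    train = [10, 12, 11, 13, 14, 13]                            # in-sample (training) series
    print(mase(train, actuals=[15, 14], forecasts=[14, 15]))    # < 1: better than the naive benchmark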
29 votes, 4 answers

Good accuracy despite high loss value

During the training of a simple neural network binary classifier I get a high loss value, using cross-entropy. Despite this, accuracy on the validation set remains quite good. Does this have some meaning? There is not a strict correlation between…
user146655 • 291 • 1 • 4 • 3
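
A toy illustration of how the two can diverge (made-up predicted probabilities): the thresholded predictions are almost all correct, so accuracy is high, yet a single confidently wrong probability inflates the average cross-entropy.

    import numpy as np

    y_true = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0])
    # Predicted P(class 1): mostly confident and correct, one extreme and confidently wrong
    p_hat = np.array([0.9] * 9 + [0.999])

    accuracy = ((p_hat >= 0.5).astype(int) == y_true).mean()
    cross_entropy = -np.mean(y_true * np.log(p_hat) + (1 - y_true) * np.log(1 - p_hat))

    print(f"accuracy = {accuracy:.0%}, cross-entropy = {cross_entropy:.2f}")
    # 90% accuracy, yet the loss (~0.79) is worse than the ~0.69 of always predicting 0.5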
26 votes, 3 answers

Is my model any good, based on the diagnostic metric ($R^2$/ AUC/ accuracy/ RMSE etc.) value?

I've fitted my model and am trying to understand whether it's any good. I've calculated the recommended metrics to assess it ($R^2$/ AUC / accuracy / prediction error / etc) but do not know how to interpret them. In short, how do I tell if my model…
mkt • 11,770 • 9 • 51 • 125
24 votes, 1 answer

How to determine the accuracy of regression? Which measure should be used?

I have a problem with defining the unit of accuracy in a regression task. In classification tasks it is easy to calculate the sensitivity or specificity of a classifier because the output is always binary {correct classification, incorrect classification}. So I…
Mr Jedi • 373 • 1 • 2 • 7
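
There is no single "accuracy" for regression; a hedged sketch (made-up values) of the usual summaries of error size, each on a different scale:

    import numpy as np

    y = np.array([3.0, 5.0, 2.5, 7.0])        # observed values (hypothetical)
    y_pred = np.array([2.5, 5.0, 3.0, 8.0])   # model predictions

    mae = np.mean(np.abs(y - y_pred))                                  # mean absolute error, in units of y
    rmse = np.sqrt(np.mean((y - y_pred) ** 2))                         # penalizes large errors more
    r2 = 1 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)   # share of variance explained

    print(mae, rmse, r2)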
23 votes, 1 answer

Is accuracy an improper scoring rule in a binary classification setting?

I have recently been learning about proper scoring rules for probabilistic classifiers. Several threads on this website have made a point of emphasizing that accuracy is an improper scoring rule and should not be used to evaluate the quality of…
Zyzzva • 231 • 2 • 3
23 votes, 1 answer

How is the confusion matrix reported from K-fold cross-validation?

Suppose I do K-fold cross-validation with K=10 folds. There will be one confusion matrix for each fold. When reporting the results, should I calculate the average of the confusion matrices, or just sum them?
der • 231 • 1 • 2 • 3
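
One common convention, sketched below under the assumption that scikit-learn is used (the data set and model here are placeholders), is to pool the held-out predictions from all folds and build a single confusion matrix, which equals the element-wise sum of the per-fold matrices:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import cross_val_predict

    X, y = make_classification(n_samples=500, random_state=0)   # placeholder data

    # Out-of-fold predictions from 10-fold cross-validation, pooled into one confusion matrix
    y_oof = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=10)
    print(confusion_matrix(y, y_oof))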
22 votes, 2 answers

Proper scoring rule when there is a decision to make (e.g. spam vs ham email)

Among others on here, Frank Harrell is adamant about using proper scoring rules to assess classifiers. This makes sense. If we have 500 $0$s with $P(1)\in[0.45, 0.49]$ and 500 $1$s with $P(1)\in[0.51, 0.55]$, we can get a perfect classifier by…
Dave • 28,473 • 4 • 52 • 104
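
A small reproduction of the excerpt's toy example (hedged; the probability bands are taken from the excerpt) showing why perfect accuracy can hide mediocre probability estimates: thresholding at 0.5 classifies every case correctly, yet the Brier score is barely better than predicting 0.5 for everyone.

    import numpy as np

    rng = np.random.default_rng(0)
    # 500 true 0s with P(1) in [0.45, 0.49] and 500 true 1s with P(1) in [0.51, 0.55]
    y = np.repeat([0, 1], 500)
    p = np.concatenate([rng.uniform(0.45, 0.49, 500), rng.uniform(0.51, 0.55, 500)])

    accuracy = ((p > 0.5).astype(int) == y).mean()
    brier = np.mean((p - y) ** 2)   # proper scoring rule; 0.25 corresponds to always predicting 0.5

    print(f"accuracy = {accuracy:.0%}, Brier score = {brier:.3f}")   # 100% accuracy, Brier ~ 0.22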
20 votes, 3 answers

How can we judge the accuracy of Nate Silver's predictions?

Firstly, he gives probabilities of outcomes. So, for example, his prediction for the U.S. election is currently 82% Clinton vs 18% Trump. Now, even if Trump wins, how do I know that it wasn't just the 18% of the time that he should've won? The other…