On the Gini coefficient's Wikipedia page, it is defined for discrete variables as $G = 1 - \frac{\sum_{i=1}^n f(y_i)(S_{i-1}+S_i)}{S_n}$, where $S_i = \sum_{j=1}^i f(y_j)\,y_j$ and $S_0 = 0$ ($y$ being the discrete variable that takes $n$ different values), and it says that $AUC = \frac{1}{2}(1+G)$. However, I don't quite understand how I can get the AUC from that.

Let's suppose that I have response variables $y_1,\dots,y_k \in \{0,1\}$ in some test set and probabilities $\hat{p}_1,\dots,\hat{p}_k$ for these observations generated by a logistic regression. With that information, how can I obtain $G$ to subsequently calculate AUC?

gung - Reinstate Monica
nestor556

1 Answer

You need to sort $y$ and $p$ by $p$ first, then calculate the cumulative sums; from these you get $G$, which in turn gives you the AUC.
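The steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not a reference implementation: the function name `gini_and_auc` is hypothetical, and it assumes the curve behind $G$ plots the cumulative share of positives ($y=1$) against the cumulative share of negatives ($y=0$), with observations sorted by increasing $\hat{p}$. Under that reading, the share of negatives plays the role of $f$ and the cumulative share of positives plays the role of $S_i/S_n$ in the Wikipedia formula. Tied probabilities are treated as a single step on the curve.

```python
def gini_and_auc(y, p):
    """Return (G, AUC) for binary labels y and predicted probabilities p.

    Assumption: G is computed from the curve of cumulative share of
    positives against cumulative share of negatives, sorted by p ascending.
    """
    pairs = sorted(zip(p, y))  # sort observations by predicted probability
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    cum_pos = cum_neg = 0      # running counts of 1s and 0s
    prev_x = prev_s = 0.0      # previous point on the curve (S_0 = 0)
    area_sum = 0.0             # accumulates sum of f_i * (S_{i-1} + S_i)
    for i, (score, label) in enumerate(pairs):
        cum_pos += label
        cum_neg += 1 - label
        # Only add a curve point at the end of a group of tied probabilities.
        if i + 1 < len(pairs) and pairs[i + 1][0] == score:
            continue
        x = cum_neg / n_neg    # cumulative share of negatives
        s = cum_pos / n_pos    # cumulative share of positives
        area_sum += (x - prev_x) * (prev_s + s)
        prev_x, prev_s = x, s

    g = 1.0 - area_sum
    return g, 0.5 * (1.0 + g)


y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
g, auc = gini_and_auc(y, p)
print(g, auc)  # 0.5 0.75
```

In this small example, three of the four (positive, negative) pairs are ranked correctly, which agrees with an AUC of 0.75.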

Dirk Nachbar