
I have data with both categorical and continuous variables, and I need to compute the Information Value of each variable during exploratory data analysis.

Why do we calculate the Information Value for each variable at the beginning of the analysis, and what cutoff value of Information Value should be used when deciding which variables to keep?

user43247
  • Please tell us more specifically what calculation "information value" refers to: there does not seem to be a standardized quantitative meaning for that term that all readers will understand in the same way. When you edit your question, please also provide more context to help us understand what kind of analysis you are discussing and what you are using the "cutoff point" for. – whuber Apr 09 '14 at 14:31
  • @whuber: he probably intends the sense used at https://stats.stackexchange.com/questions/462052/intuition-behind-weight-of-evidence-and-information-value-formula/462445#462445 – kjetil b halvorsen May 20 '21 at 05:22

1 Answer


Generally speaking, Information Value measures how well a variable $X$ is able to distinguish between the two levels of a binary response (e.g. "good" versus "bad") in some target variable $Y$. The idea is that if a variable $X$ has a low Information Value, it may not classify the target variable well enough, and it is therefore removed as an explanatory variable.

To see how this works, let $X$ be grouped into $n$ bins. Each $x \in X$ corresponds to a $y \in Y$ that may take one of two values, say 0 or 1. Then for bins $X_i$, $1 \leq i \leq n$,

$$ IV = \sum_{i=1}^n (g_i - b_i)\ln\!\left(\frac{g_i}{b_i}\right) $$

where

$b_i = \dfrac{\#\text{ of }0\text{'s in }X_i}{\#\text{ of }0\text{'s in }X} =$ the proportion of $0$'s in bin $i$ relative to all bins,

$g_i = \dfrac{\#\text{ of }1\text{'s in }X_i}{\#\text{ of }1\text{'s in }X} =$ the proportion of $1$'s in bin $i$ relative to all bins.

$\ln(g_i/b_i)$ is also known as the Weight of Evidence (for bin $X_i$). Cutoff values vary and their selection is subjective; I often remove variables with $IV < 0.3$ (as does [1] below).
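
For concreteness, here is a minimal sketch of the calculation above in Python. The quantile binning, the 0/1 target encoding, and the synthetic example are my own illustrative choices, not part of any standard routine:

```python
import numpy as np
import pandas as pd

def information_value(x: pd.Series, y: pd.Series, n_bins: int = 10) -> float:
    """IV of predictor x against a binary 0/1 target y, using quantile bins."""
    bins = pd.qcut(x, q=n_bins, duplicates="drop")  # group x into n bins
    grouped = y.groupby(bins, observed=True)
    g = grouped.sum() / y.sum()                     # g_i: share of all 1's in bin i
    b = (grouped.count() - grouped.sum()) / (len(y) - y.sum())  # b_i: share of all 0's
    woe = np.log(g / b)                             # Weight of Evidence per bin
    return float(((g - b) * woe).sum())

# Synthetic example: x genuinely drives y, so the IV should be sizeable.
rng = np.random.default_rng(0)
x = pd.Series(rng.normal(size=5000))
y = pd.Series(rng.binomial(1, 1 / (1 + np.exp(-2 * x))))
print(information_value(x, y))
```

Note that a bin containing only $0$'s or only $1$'s makes the WoE infinite; the comments below discuss corrections for that case.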

In the context of credit scoring, these two resources should help:

[1] http://www.mwsug.org/proceedings/2013/AA/MWSUG-2013-AA14.pdf

[2] http://support.sas.com/resources/papers/proceedings12/141-2012.pdf

dmanuge
  • Do you know of any sort of correction for calculating Information Value when one of the bins is either all good or all bad? My idea is to add 1 to each column of each bin to correct for this situation. I am wondering if this is a common practice or if there are any other theoretical concerns. I am mostly considering this step out of pragmatism. – Zelazny7 Sep 30 '14 at 14:52
  • I've seen some practitioners remove the term with all goods or all bads from the summation, but I wouldn't recommend this because you'd essentially be nullifying a perfect association. Adding a constant (say $c$) is an interesting solution, but the choice of constant and the size of the bin will greatly affect your IV: as $c$ approaches 0 or the bin size approaches infinity, the IV approaches infinity. To obtain a more representative IV, you might want to consider combining adjacent bins that have all goods or all bads. – dmanuge Jan 07 '15 at 00:21
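
For illustration, here is a minimal sketch of the additive correction discussed in these comments, assuming per-bin good/bad counts are already available. The constant $c = 0.5$ and the counts are arbitrary choices, and, as noted above, the resulting IV is sensitive to both:

```python
import numpy as np

def iv_smoothed(goods: np.ndarray, bads: np.ndarray, c: float = 0.5) -> float:
    """IV from per-bin good/bad counts, adding a constant c to every count
    so that bins with only goods or only bads do not produce log(0)."""
    g = (goods + c) / (goods + c).sum()  # smoothed share of goods per bin
    b = (bads + c) / (bads + c).sum()    # smoothed share of bads per bin
    return float(((g - b) * np.log(g / b)).sum())

# The first bin has zero bads; without smoothing the sum would be infinite.
print(iv_smoothed(np.array([50, 30, 20]), np.array([0, 40, 60])))
```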