Reverse a label encoded target in test and train series?

Question

When performing a scikit-learn train/test split like so:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

with categorical target values (y from above) already label encoded:

class_le = LabelEncoder()
aDataFrame['aTarget'] = class_le.fit_transform(aDataFrame['aTarget'].values)

I can run a classification report from the result of a classification:

print (classification_report(results, y_test))

that prints out info about the precision:

         precision    recall  f1-score   support

      0       1.00      1.00      1.00        18
      1       0.40      0.25      0.31         8
      2       0.08      0.10      0.09        10

Is there a way to say what decoded category each of those results referred to?

How can I determine what the already encoded target values were before encoding? For example, if I print out the contents of the y_train, y_test variables I'll see a series like so:

    aTarget
12799   192
145162  15
140041  205

Just looking at the target of 192, how would I determine what category it originally referred to, given the original class_le label encoding object? Thanks very much for any tips!

score 0 · Accepted Answer · answered Apr 10 '18 at 18:48

From the source code:

def fit_transform(self, y):
    """Fit label encoder and return encoded labels
    Parameters
    ----------
    y : array-like of shape [n_samples]
        Target values.
    Returns
    -------
    y : array-like of shape [n_samples]
    """
    y = column_or_1d(y, warn=True)
    self.classes_, y = np.unique(y, return_inverse=True)
    return y

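So fit_transform simply calls np.unique(y, return_inverse=True): the sorted unique labels are stored in self.classes_, and the inverse indices are returned as the encoded values. Each encoded value is therefore an index into classes_, so class_le.classes_[192] or class_le.inverse_transform([192]) recovers the original category. From the numpy documentation: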

unique_inverse : ndarray, optional

The indices to reconstruct the original array from the unique array. Only provided if return_inverse is True.

For example:

import numpy as np
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
y = ['a','b','a','b','c','c','b']
# The encoded labels equal the inverse indices returned by np.unique
np.all(le.fit_transform(y) == np.unique(y, return_inverse=True)[1])
# Returns True
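
To tie this back to the question, here is a minimal sketch of decoding the values and getting the original names into the report. The labels 'cat', 'dog', 'bird' and the predictions in results are made up for illustration; only class_le, y_test and results are names taken from the question. Note that classification_report expects the true labels first and the predictions second.

```python
import numpy as np
from sklearn.metrics import classification_report
from sklearn.preprocessing import LabelEncoder

# Hypothetical raw labels standing in for aDataFrame['aTarget'].values
raw = np.array(['cat', 'dog', 'bird', 'dog', 'cat', 'bird'])

class_le = LabelEncoder()
encoded = class_le.fit_transform(raw)  # classes_ is sorted: bird=0, cat=1, dog=2

# Decode a single encoded value by indexing into classes_
single = class_le.classes_[2]

# Decode a whole array (or pandas Series) of encoded values
decoded = class_le.inverse_transform(encoded)

# Hypothetical true labels and predictions, both in encoded form
y_test = encoded
results = np.array([1, 2, 0, 2, 0, 0])

# Pass the original names so the report rows are labelled with them;
# labels= pins the row order to the encoder's class order
names = [str(c) for c in class_le.classes_]
report = classification_report(
    y_test,
    results,
    labels=np.arange(len(names)),
    target_names=names,
)
print(report)
```

With the encoder from the question, the same pattern would be class_le.inverse_transform(y_test) for the whole series, or class_le.classes_[192] for the single value.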
