
When performing a scikit-learn train/test split like so:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

with categorical target values (y above) that have already been label encoded:

class_le = LabelEncoder()
aDataFrame['aTarget'] = class_le.fit_transform(aDataFrame['aTarget'].values)  

I can run a classification report from the result of a classification:

# Note: the signature is classification_report(y_true, y_pred)
print(classification_report(results, y_test))

which prints precision, recall, F1 score, and support for each encoded class:

         precision    recall  f1-score   support

      0       1.00      1.00      1.00        18
      1       0.40      0.25      0.31         8
      2       0.08      0.10      0.09        10

Is there a way to tell which original (decoded) category each of those rows refers to?

How can I determine what the encoded target values were before encoding? For example, if I print the contents of y_train or y_test, I see a series like this:

    aTarget
12799   192
145162  15
140041  205

Taking the target value 192, how would I determine which category it originally referred to, given the original class_le label encoder object? Thanks very much for any tips!

kjetil b halvorsen
Rob

1 Answer


From the source code:

def fit_transform(self, y):
    """Fit label encoder and return encoded labels
    Parameters
    ----------
    y : array-like of shape [n_samples]
        Target values.
    Returns
    -------
    y : array-like of shape [n_samples]
    """
    y = column_or_1d(y, warn=True)
    self.classes_, y = np.unique(y, return_inverse=True)
    return y

So fit_transform stores the sorted unique labels in self.classes_ and returns the inverse indices from np.unique(y, return_inverse=True). And from the numpy documentation:

unique_inverse : ndarray, optional

The indices to reconstruct the original array from the unique array. Only provided if return_inverse is True.

For example:

import numpy as np
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
y = ['a','b','a','b','c','c','b']
# The encoded labels equal the inverse indices returned by np.unique
np.all(le.fit_transform(y) == np.unique(y, return_inverse=True)[1])
# Returns True
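
Since the encoded integers are just indices into classes_, decoding can go through either classes_ directly or the encoder's inverse_transform method. Here is a minimal sketch of both, plus passing the decoded names to classification_report via target_names. The animal labels and the placeholder predictions are made up for illustration and stand in for aDataFrame['aTarget'] and the real classifier output:

```python
import numpy as np
from sklearn.metrics import classification_report
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels standing in for aDataFrame['aTarget']
labels = np.array(['cat', 'dog', 'bird', 'dog', 'cat', 'bird'])

class_le = LabelEncoder()
encoded = class_le.fit_transform(labels)

# classes_ holds the sorted unique labels; each encoded integer indexes into it
first_label = class_le.classes_[encoded[0]]

# inverse_transform maps encoded integers back to the original labels
decoded = class_le.inverse_transform(encoded)

# Placeholder predictions (a copy of the truth), for illustration only
y_true = encoded
y_pred = encoded.copy()

# target_names replaces the integer row labels with the original categories
report = classification_report(
    y_true,
    y_pred,
    labels=np.arange(len(class_le.classes_)),
    target_names=[str(c) for c in class_le.classes_],
)
print(report)
```

With the encoder from the question, the same idea would look like class_le.inverse_transform([192]) or class_le.classes_[192], assuming class_le is the object that was fitted on the original column.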
ilanman