
I'm missing some very basic distinction between cross-validation used for parameter tuning and cross-validation used for estimating the performance (RMSE) of my algorithms.

I have two functions: one performs grid search and the other calculates cross-validated RMSE.

import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score, train_test_split

def grid_search(clf, param_grid, x_train, y_train, kf):
    # tune hyperparameters with cross-validated grid search
    grid_model = GridSearchCV(estimator=clf,
                              param_grid=param_grid,
                              cv=kf, verbose=2)
    grid_model.fit(x_train, y_train)
    return grid_model

def rmse_cv(clf, x_train, y_train, kf):
    # cross-validated RMSE of the given (already parameterised) model
    rmses_cross = np.sqrt(-cross_val_score(clf, x_train, y_train,
                                           scoring="neg_mean_squared_error", cv=kf))
    return rmses_cross

The functions are called this way:

X_train, X_test, y_train, y_test = train_test_split(dataset, Y, test_size=0.2, random_state=26)
kf = KFold(10, shuffle=True, random_state=26)

grid_search(clf, param_grid, X_train, y_train, kf)
# adjust the parameters of the regressor
rmses_cross = rmse_cv(clf, X_train, y_train, kf)

As you can see, I use the same KFold for the parameter tuning and exactly the same KFold splits for the calculation of the cross-validated RMSE.

Based on the calculated cross-validated RMSEs I choose which algorithm performs better. BUT the RMSEs are calculated on exactly the same folds on which the hyperparameter tuning was performed.

Is it incorrect to do so? I feel that during tuning the model learns from the hold-out folds, so it would be incorrect to reuse them when calculating the RMSEs. Should I choose a different KFold for the RMSE calculation?

EDIT:

Why do these two snippets produce different results? I thought cross_val_score refits the given model on each fold, so applying cross_val_score to grid_model or to the parameterised model should give the same result.

kf = KFold(10, shuffle = True, random_state = 26)

First:

grid_model = grid_search(clf, param_grid, X_train, y_train, kf)   # already fitted inside grid_search
clf = SVR(kernel='rbf', C=grid_model.best_params_['C'])           # SVR from sklearn.svm, refit with the tuned C
rmses_cross = np.sqrt(-cross_val_score(clf, X_train, y_train,
                      scoring="neg_mean_squared_error", cv=kf))

Second:

grid_model = grid_search(clf, param_grid, X_train, y_train, kf)   # already fitted inside grid_search
rmses_cross = np.sqrt(-cross_val_score(grid_model, X_train, y_train,
                      scoring="neg_mean_squared_error", cv=kf))
Alina
  • Just a reminder that **R**oot**MSE** is [**subadditive**](https://math.stackexchange.com/questions/1588776/subadditivity-of-square-root-function) and should only be calculated at the very end -- and based on *"all"* **S**quared **E**rrors. – Jim Feb 14 '18 at 17:05
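
For illustration only, a tiny numeric check of the point in the comment above, with made-up squared errors for two folds: averaging per-fold RMSEs is not the same as pooling all squared errors first.

import numpy as np

se_fold1 = np.array([1.0, 1.0])   # made-up squared errors from fold 1
se_fold2 = np.array([9.0, 9.0])   # made-up squared errors from fold 2

mean_of_fold_rmses = np.mean([np.sqrt(se_fold1.mean()), np.sqrt(se_fold2.mean())])  # 2.0
pooled_rmse = np.sqrt(np.concatenate([se_fold1, se_fold2]).mean())                  # ~2.236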

1 Answer


> The RMSEs are calculated on exactly the same folds on which the hyperparameter tuning was performed.
> Is it incorrect to do so?

Yes.

> I feel that during tuning the model learns from the hold-out folds, so it would be incorrect to reuse them when calculating the RMSEs. Should I choose a different KFold for the RMSE calculation?

Yes.


What you need to do is called nested cross validation.

I recommend treating the hyperparameter tuning as part of the model training (that's one particular point of view on what nested cross-validation, or a train/optimize/validation split (aka train/validate/test, depending on your field), does).

Briefly, you have 3 functions:

  • a bare-bones (low-level) training function: clf(training_data, hyperparameters)
  • a tuned-model (high-level) training function that internally does the hyperparameter fitting: grid_search(training_data)
  • a testing function: rmse_cv

Now, as you want to measure the performance of the ready-to-use tuned model, you call rmse_cv on the tuned model training function: rmse_cv (grid_search, dataset)
(regardless of whether or not grid_search makes internal use of rmse_cv for tuning purposes as well).
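
In scikit-learn terms, a minimal sketch of this nested setup could look like the following (the SVR regressor, the param_grid values and the second random_state are assumptions for illustration; the inner KFold is used only for tuning, the outer one only for estimating the RMSE of the whole tune-and-fit procedure):

import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVR

param_grid = {'C': [0.1, 1, 10, 100]}                  # hypothetical grid
inner_kf = KFold(10, shuffle=True, random_state=26)    # tuning folds
outer_kf = KFold(10, shuffle=True, random_state=11)    # performance-estimation folds

# the "tuned model training function": the grid search is wrapped inside the estimator
grid_model = GridSearchCV(SVR(kernel='rbf'), param_grid, cv=inner_kf)

# the "testing function": the outer CV clones and refits the whole GridSearchCV on
# each outer training fold, so the outer test folds never influence the chosen C
rmses_cross = np.sqrt(-cross_val_score(grid_model, X_train, y_train,
                                       scoring="neg_mean_squared_error", cv=outer_kf))

Here X_train and y_train are the ones from your question; giving the outer splitter its own random_state just makes explicit that the evaluation folds are not the tuning folds.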

See also here.

cbeleites unhappy with SX
  • I followed the example [here](http://scikit-learn.org/stable/auto_examples/model_selection/plot_nested_cross_validation_iris.html). They use the same KFold for the inner and outer loop, but exactly as you mentioned, they pass the grid_search classifier into the rmse_cv function. What is the difference between rmse_cv(grid_search, dataset) and taking the best_params_ from grid_search, updating the classifier, and doing rmse_cv(clf, dataset)? (I updated my question) – Alina Feb 12 '18 at 09:34