I'm performing a study using nested cross-validation with SVR: kernel type and hyperparameters are selected in an inner CV loop, and an outer 10-fold CV loop estimates performance. The output is 10 trained models and their performance measures.
My question is where to go from here. When I train a new model on the complete dataset with the selected kernel (using either the hyperparameters that gave the minimum error during the 10-fold CV, or re-tuning them on the complete dataset with that kernel), the final model is never validated against held-out data. Is it reasonable to do this and report the average error from the 10-fold CV as an "informal" performance estimate, on the grounds that the final model is trained on a slightly larger dataset? How would I word this in a journal paper? My thesis advisor, for one, is questioning it.
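To make the setup concrete, here's a rough sketch of the procedure I have in mind, written with scikit-learn's SVR and GridSearchCV. The data, parameter grid, and fold seeds are placeholders rather than my actual study settings:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Placeholder data so the sketch runs; substitute the real X, y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.normal(size=200)

# Inner loop: kernel type and hyperparameters selected by grid search
# (illustrative grid values only).
param_grid = [
    {"kernel": ["rbf"], "C": [1, 10, 100], "gamma": ["scale", 0.1, 1.0]},
    {"kernel": ["poly"], "C": [1, 10, 100], "degree": [2, 3]},
]
inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(SVR(), param_grid, cv=inner_cv,
                      scoring="neg_mean_squared_error")

# Outer loop: 10-fold CV wrapped around the whole selection procedure,
# yielding 10 fitted models and 10 error estimates.
outer_cv = KFold(n_splits=10, shuffle=True, random_state=1)
nested_mse = -cross_val_score(search, X, y, cv=outer_cv,
                              scoring="neg_mean_squared_error")
print(f"Nested CV MSE: {nested_mse.mean():.3f} +/- {nested_mse.std():.3f}")

# Final model: rerun the same search on the complete dataset; GridSearchCV
# refits the best estimator on all data (refit=True by default). The open
# question is whether the nested CV error above can be reported as this
# model's performance estimate.
final_search = GridSearchCV(SVR(), param_grid, cv=inner_cv,
                            scoring="neg_mean_squared_error").fit(X, y)
final_model = final_search.best_estimator_
print("Selected kernel/hyperparameters:", final_search.best_params_)
```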