I am new to sklearn and I am trying to learn how to use cross-validation to choose the best SVM model. I found this example: How to split the dataset for cross validation, learning curve, and final evaluation? and I tried to understand how it works. Here are some lines that I am not sure I have understood.
from sklearn.model_selection import learning_curve  # sklearn.learning_curve is deprecated
title = r'Learning Curves (SVM, linear kernel, $\gamma=%.6f$)' % classifier.best_estimator_.gamma
estimator = SVC(kernel='linear', gamma=classifier.best_estimator_.gamma)
plot_learning_curve(estimator, title, X_train, y_train, cv=cv)
plt.show()
1) What is the estimator object here? Is it a clone of the best model returned by the cross-validation? I don't think so.
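If I understand the linked example correctly, a minimal sketch of what happens would be the following (the toy data and the GridSearchCV setup are my assumptions, not from the example): estimator is a brand-new, unfitted SVC that merely copies one hyperparameter from the winning model.

```python
# Assumed setup: GridSearchCV stores the fitted winner in best_estimator_;
# the snippet in the question builds a NEW, unfitted SVC that only copies
# the tuned gamma value from it -- it is not a clone of the fitted model.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, random_state=0)
classifier = GridSearchCV(SVC(), {'gamma': [0.001, 0.01, 0.1]}).fit(X, y)

# Fresh estimator with the tuned gamma -- no fit has happened yet.
estimator = SVC(kernel='linear', gamma=classifier.best_estimator_.gamma)
print(hasattr(estimator, 'support_'))   # False: it has never been fitted
```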
2) Will this function plot_learning_curve apply the cross-validation selection again? I think yes, because it takes a cross-validation iterator.
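To check my understanding, here is a small sketch with made-up data (the ShuffleSplit splitter and sizes are my assumptions): learning_curve, which plot_learning_curve calls internally, re-fits the estimator from scratch for every combination of training size and CV fold defined by the cv splitter you pass in.

```python
# Assumed toy setup: learning_curve performs one fit+score per
# (train_size, cv_fold) pair using the supplied cv splitter.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import ShuffleSplit, learning_curve
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
cv = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)

train_sizes, train_scores, test_scores = learning_curve(
    SVC(kernel='linear'), X, y, cv=cv,
    train_sizes=np.linspace(0.1, 1.0, 5))

# Shape is (n_train_sizes, n_cv_folds): each entry is a separate fit.
print(train_scores.shape)   # (5, 5)
```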
classifier.score(X_test, y_test)
3) Which model produces this score? Is it the best model selected in section 5) of the previous link?
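My guess, sketched on made-up data (the train/test split and parameter grid are my assumptions): after GridSearchCV.fit, score() delegates to best_estimator_, which was refit on the whole training set because refit=True by default.

```python
# Assumed setup: with the default scoring, GridSearchCV.score uses
# best_estimator_.score, i.e. the winning model refit on all of X_train.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

classifier = GridSearchCV(SVC(), {'C': [0.1, 1, 10]}).fit(X_train, y_train)

# Both calls score the same refit best model:
print(classifier.score(X_test, y_test) ==
      classifier.best_estimator_.score(X_test, y_test))   # True
```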
classifier.fit(X,y)
4) What is the purpose of this operation?
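If I understand correctly, this final fit re-trains the chosen configuration on every available sample, since the train/test split was only needed for evaluation. A sketch on made-up data (the dataset and split sizes are my assumptions):

```python
# Assumed toy setup: calling fit again completely replaces the previous
# fit, so the final model has been trained on ALL the data.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, random_state=0)
clf = SVC(kernel='linear')

clf.fit(X[:100], y[:100])   # fit used during evaluation (subset only)
clf.fit(X, y)               # final refit on the full dataset

# shape_fit_ records the dimensions of the last training set seen:
print(clf.shape_fit_)   # (150, 20)
```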