I understand the purpose behind cross-validation (CV): to make sure the data is spread well across the folds, so that any skew is somewhat averaged out.
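Let's take this example:

    from sklearn import model_selection
    from sklearn.tree import DecisionTreeClassifier

    cv_cart_scores = []
    for k in myList:  # each k is a candidate value for min_samples_leaf
        cart = DecisionTreeClassifier(random_state=777, min_samples_leaf=k)
        # 10-fold CV: returns one accuracy score per fold
        cart_scores = model_selection.cross_val_score(cart, x_train, y_train, cv=10, scoring='accuracy')
        cv_cart_scores.append(cart_scores.mean())
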
Here, `cart_scores` stores the accuracy score of each fold in the CV, and the next line takes the mean of those scores.
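To make that concrete, here is a minimal sketch of a single pass through the loop. It rests on two assumptions that are not part of my real code: the iris dataset stands in for `x_train` and `y_train`, and `min_samples_leaf=5` is an arbitrary value rather than one taken from `myList`.

```python
from sklearn import model_selection
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris stands in for x_train / y_train, which are not shown here.
X, y = load_iris(return_X_y=True)

# Assumption: 5 is an arbitrary min_samples_leaf value, not one from myList.
cart = DecisionTreeClassifier(random_state=777, min_samples_leaf=5)

# cross_val_score returns a NumPy array holding one accuracy per fold.
cart_scores = model_selection.cross_val_score(cart, X, y, cv=10, scoring='accuracy')

print(cart_scores.shape)   # (10,) because cv=10
print(cart_scores.mean())  # the single number the loop appends for this k
```

So each iteration seems to reduce ten per-fold accuracies to one mean, and that mean is what ends up in `cv_cart_scores`.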
I'm not getting:
- What is the purpose of `cv_cart_scores`?
- What is the purpose of looping over `myList`?
- Should we take `cv_cart_scores.mean()` at the end to get the final score of the model?
RB (on Python 3.6)