I am trying to use repeated cross-validation to evaluate my classifier. Because my data has missing values I need imputation, and because the classes are unbalanced (88% of my data is in the positive class, 12% in the negative class) I want to downsample. My approach is the following:
r <- 10 (repetitions)
k <- 10 (folds of CV)
for each repetition in 1:r:
    assign the k folds
    for each fold in 1:k:
        split the data into a train set (k-1 folds) and a test set (held-out fold)
        impute the train set and the test set separately
        build the classifier on the downsampled train set
        predict classes on the test set and compare to the true labels
    end
    average over the k folds to get cross-validated performance measures
end
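In code, my loop would look roughly like the following Python/scikit-learn sketch (the data, mean imputation, logistic regression, and balanced accuracy are just placeholders; the point is where the imputation and downsampling happen relative to the split):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)

# Toy data: roughly 88% positive (class 1) / 12% negative (class 0),
# with ~5% of the values set to missing.
X, y = make_classification(n_samples=1000, n_features=5,
                           weights=[0.12, 0.88], random_state=0)
X[rng.random(X.shape) < 0.05] = np.nan

r, k = 10, 10
cv = RepeatedStratifiedKFold(n_splits=k, n_repeats=r, random_state=0)

scores = []
for train_idx, test_idx in cv.split(X, y):
    X_tr, X_te = X[train_idx], X[test_idx]
    y_tr, y_te = y[train_idx], y[test_idx]

    # Impute the train set and the test set separately
    # (each imputer is fitted only on its own part of the data).
    X_tr = SimpleImputer(strategy="mean").fit_transform(X_tr)
    X_te = SimpleImputer(strategy="mean").fit_transform(X_te)

    # Downsample the majority (positive) class in the train set only.
    pos = np.flatnonzero(y_tr == 1)
    neg = np.flatnonzero(y_tr == 0)
    idx = np.concatenate([rng.choice(pos, size=len(neg), replace=False), neg])

    # Build the classifier on the downsampled train set,
    # predict on the (untouched) test set, and score.
    clf = LogisticRegression(max_iter=1000).fit(X_tr[idx], y_tr[idx])
    scores.append(balanced_accuracy_score(y_te, clf.predict(X_te)))

# Average over all r * k folds to get the cross-validated performance.
print(round(float(np.mean(scores)), 3))
```

Here the test set is never downsampled, so the performance measure reflects the original class distribution.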
Is this correct?