I have a dataset of roughly 1000 subjects with their labels. The ultimate goal is to build a neural network that classifies each subject as either 0 or 1. My current strategy is 10-fold CV: I hold out a different 10% of the data for testing and train on the remaining 90%, repeating this for each fold. There are many hyperparameters (hidden layer size, network type, etc.), which I am optimizing based on the average accuracy across the 10 folds. So at the end of this, I have the hyperparameters that give the highest average accuracy.
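To make the setup concrete, here is a minimal sketch of that selection loop. It is only an illustration under stated assumptions: scikit-learn's `MLPClassifier` stands in for the network, `make_classification` generates synthetic data in place of the real subjects, and the two candidate hidden-layer sizes are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the ~1000 labeled subjects (assumption, not the real data).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# 10 folds: each fold tests on a different 10% and trains on the other 90%.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Hypothetical hyperparameter candidates (only hidden layer size is varied here).
candidate_hidden_sizes = [(8,), (32,)]

mean_accuracy = {}
for hidden in candidate_hidden_sizes:
    model = MLPClassifier(hidden_layer_sizes=hidden, max_iter=300, random_state=0)
    # cross_val_score refits a fresh copy of the model on each fold's training part.
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    mean_accuracy[hidden] = scores.mean()

# Keep the candidate with the highest average accuracy over the 10 folds.
best_hidden = max(mean_accuracy, key=mean_accuracy.get)
print(best_hidden, mean_accuracy[best_hidden])
```

Note that this loop produces ten fitted networks per candidate, none of which was trained on all 1000 subjects.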
When it is time to actually use this classifier on new data, which model do I use? If I retrain on all 1000 data points, then the hyperparameters I optimized for 900-sample training sets are not necessarily optimal for that larger training set.
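The step in question can be sketched as follows. This is a hedged illustration of the scenario, not a claim that it is the correct procedure: the data is synthetic, `MLPClassifier` is a stand-in for the network, and `chosen_params` is a hypothetical result of the earlier cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in: 1000 labeled subjects plus 5 "future" subjects (assumption).
X_all, y_all = make_classification(n_samples=1005, n_features=20, random_state=0)
X_labeled, y_labeled = X_all[:1000], y_all[:1000]
X_new = X_all[1000:]  # treated as unlabeled; their labels are never used

# Hypothetical hyperparameters selected earlier by 10-fold CV.
chosen_params = {"hidden_layer_sizes": (32,), "max_iter": 300}

# Retrain one model on all 1000 labeled subjects with those hyperparameters.
final_model = MLPClassifier(random_state=0, **chosen_params)
final_model.fit(X_labeled, y_labeled)

# Classify the new, unlabeled subjects as 0 or 1.
predictions = final_model.predict(X_new)
print(predictions)
```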
I may have a fundamental misunderstanding of k-fold CV, but if not, how do I go about this (building a classifier from labeled data that will be used on unlabeled data in the future)?