I struggle to understand $k$-fold cross validation. I understand it is a tool to estimate the generalization error of a model: it works by shuffling the data and dividing it into $k$ chunks. Then $k$ models are trained, each time using a different chunk for testing and the remaining $k-1$ chunks for training. The mean and spread of the $k$ test errors give an estimate of the generalization error.
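To check that I have the procedure right, here is a minimal sketch of my understanding using scikit-learn's `KFold`; the data and the logistic-regression model are just placeholders:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Placeholder data; in practice X, y would be the real dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(int)

# Shuffle and split into k chunks, train k models,
# each time holding out a different chunk for testing.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
errors = []
for train_idx, test_idx in kf.split(X):
    model = LogisticRegression()
    model.fit(X[train_idx], y[train_idx])
    errors.append(1 - accuracy_score(y[test_idx], model.predict(X[test_idx])))

# Mean and spread of the k test errors as the generalization estimate.
print(np.mean(errors), np.std(errors))
```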
My first question is:
1- Since $k$-fold CV is used only to check the generalization error, can the finally deployed model be trained on all of the available data?
My second question probably reveals a gap in my understanding:
2- How should one choose $k$, and how can results obtained with different $k$ be compared? With 10-fold or 5-fold cross validation, the samples used for training have different sizes $N_t$ ($N_t = \frac{9}{10}N$ and $N_t = \frac{4}{5}N$ respectively, where $N$ is the total amount of data available). From a learning curve (https://en.wikipedia.org/wiki/Learning_curve_(machine_learning)) we know that the error depends on the number of training samples. So how should we compare the results of cross validations with different $k$? It would be strange if the errors were not comparable, because then they would depend on $k$, which is not a nice property.
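To make the question concrete, here is what I mean by running the same procedure with different $k$ (again with placeholder data and model):

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression

# Same placeholder data as in the first sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(int)

# Each k-fold run trains on a different fraction (k-1)/k of the data,
# so the mean errors sit at different points of the learning curve.
for k in (5, 10):
    cv = KFold(n_splits=k, shuffle=True, random_state=0)
    acc = cross_val_score(LogisticRegression(), X, y, cv=cv)
    print(f"k={k}: mean error = {1 - acc.mean():.3f}, std = {acc.std():.3f}")
```

Should I expect the two mean errors printed here to agree, and if not, which one is "the" generalization error?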
Thanks for any insight.