Quite recently I stumbled upon several posts here on nested cross-validation, and they showed me how flawed my understanding of the procedure was. Now, trying to put all the pieces together, I still have some doubts.
One of my questions is the following: suppose I want to come up with a model for my data and task. I know that the purpose of nested cross-validation is to estimate the generalization performance of the modelling procedure (whatever the final model may be) on the data I have. I also know that if all the inner validation folds yield the same model with (approximately) the same hyperparameters, the procedure is stable and the estimated generalization performance is reasonably reliable. In this case I can re-run the inner selection procedure on the entire dataset to obtain the final model. However, if the inner folds select different best models (via an optimization procedure such as a grid search), all I can say is that the generalization estimate coming out of this nested cross-validation run is reliable, provided it is "stable" in the outer loop (negligible differences/low variance across outer folds), according to this answer. In this case I would do exactly the same as above to find the final model, since all the candidate models perform roughly equally (please correct me if I am wrong).
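For concreteness, here is a minimal sketch of what I mean by checking the stability of the inner selection, assuming scikit-learn; the SVC estimator, the grid, and the iris data are just placeholders for my actual setup:

```python
# Minimal sketch (placeholder estimator/grid/data): run the inner grid
# search inside each outer fold and record which hyperparameters win.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10]}

outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)
chosen, scores = [], []
for train_idx, test_idx in outer_cv.split(X):
    # Inner loop: model selection on the outer-training portion only
    inner = GridSearchCV(SVC(), param_grid, cv=5)
    inner.fit(X[train_idx], y[train_idx])
    chosen.append(inner.best_params_)                     # winning C in this fold
    scores.append(inner.score(X[test_idx], y[test_idx]))  # outer performance estimate

print(chosen)  # if these agree across folds, the selection is stable
print(scores)  # low variance here makes the estimate more trustworthy
```

If `chosen` contains the same parameters in every outer fold, I would call the procedure stable in the sense above.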
However, what if the selected models are different and the outer loop reports very different generalization performance estimates? According to this answer and many others, I would then need to stabilize the procedure. Is that right? In particular, what I would do is either add some regularization or increase the number of repetitions of each inner k-fold cross-validation (other ways of stabilizing the optimization procedure may be possible, I guess, though I am not fully convinced by this step). Is this reasoning correct?
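What I have in mind for the repetitions option, sketched with scikit-learn's `RepeatedKFold` as the inner splitter (Ridge and the diabetes data are, again, just placeholders):

```python
# Sketch (placeholder estimator/data): repeat the inner k-fold so that the
# hyperparameter choice is averaged over several random partitions, which
# should reduce the variance of the selection.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, RepeatedKFold, cross_val_score

X, y = load_diabetes(return_X_y=True)
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0]}  # regularization strength is in the grid

# Inner CV: 5 folds repeated 3 times -> each candidate is scored on 15 splits
inner_cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
search = GridSearchCV(Ridge(), param_grid, cv=inner_cv)

# Outer CV: generalization estimate of the whole selection procedure
outer_scores = cross_val_score(search, X, y, cv=KFold(n_splits=5, shuffle=True, random_state=1))
print(outer_scores)  # the spread of these is what tells me whether things stabilized
```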
And finally, assuming I chose to increase the number of repetitions of the inner k-fold cross-validations, when I move on to obtaining the final model (by re-running the inner cross-validation procedure on the whole dataset), should I use the same number of repeats? I would say so, but I am not sure: this answer suggests that repetitions are very useful for the outer loop, but it does not say the same about the inner loop, which seems quite counterintuitive to me.
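The final-model step I am describing, i.e. re-running the same inner procedure (same grid, same repeated splitter) on the entire dataset, would then look like this (again just a sketch with placeholder estimator and data):

```python
# Sketch (placeholder estimator/data): the final model comes from re-running
# the *same* inner selection procedure, with the same number of repeats,
# on the whole dataset; the nested-CV score estimates its performance.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, RepeatedKFold

X, y = load_diabetes(return_X_y=True)
final_search = GridSearchCV(
    Ridge(),
    {"alpha": [0.01, 0.1, 1.0, 10.0]},
    cv=RepeatedKFold(n_splits=5, n_repeats=3, random_state=0),  # same repeats as inner loop
)
final_search.fit(X, y)
final_model = final_search.best_estimator_  # refit on all data with the winning alpha
print(final_search.best_params_)
```

My question is whether `n_repeats` here should match whatever I used in the inner loop of the nested procedure.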