Cross Validation (Wikipedia) is not a technique to avoid overfitting. It is a method for judging a model: you train it on some of the data and test it on the rest, in order to measure its performance. That is all it is supposed to do.
The reason why we use Cross Validation, instead of one fixed test set, is that when doing model selection or hyperparameter tuning, you want to select the model that does best with respect to your test set. If you do that with one fixed test set, you will have fitted your model to your testing data, which you should not do if you want the estimate of your error to be meaningful. This is the same reason why Machine Learning competitions limit the number of test submissions you can make per day (see the Baidu controversy).
This does not change whether you have a small or a large dataset, and it holds for any type of learner.
After having used cross validation to build $K$ models, you test them on the $K$ test sets, which gives you $K$ values of your cost function. You can take their mean as a measure of the bias of your model (how wrong it is), and their standard deviation as a measure of the variance of your model (how much it changes if the input data changes a little); see the Bias-variance tradeoff (Wikipedia). This should guide you in the design of your model, which hyperparameters you need to change, etc.
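To make this concrete, here is a minimal sketch of the procedure (assuming Python with scikit-learn, a synthetic regression dataset, a Ridge model and $K=5$, all of which are just illustrative choices):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic data, only for illustration.
X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_errors = []

for train_idx, test_idx in kf.split(X):
    # Train a fresh model on K-1 folds, test on the held-out fold.
    model = Ridge(alpha=1.0)
    model.fit(X[train_idx], y[train_idx])
    fold_errors.append(mean_squared_error(y[test_idx], model.predict(X[test_idx])))

# Mean ~ how wrong the model is; std ~ how sensitive it is to the data it saw.
print("mean CV error:", np.mean(fold_errors))
print("std of CV error:", np.std(fold_errors))
```

A large mean says the model is systematically wrong, a large standard deviation says its performance depends a lot on which data it happened to be trained on.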
But when you are done tuning your model and no longer need an estimate of the error, you should train your model on your complete training data. You can also train $K$ sub-models and average them, but that is not Cross Validation, that is Bagging (Wikipedia).
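As an aside, scikit-learn's GridSearchCV follows exactly this pattern when refit=True (the default): cross validation is only used to pick the hyperparameters, and the winning configuration is then refit on the complete training data. A minimal sketch, with the same illustrative data as above:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)

# 5-fold CV is used only to pick alpha; the best configuration is then
# refit on all of X, y (because refit=True).
search = GridSearchCV(Ridge(), param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
                      cv=5, scoring="neg_mean_squared_error", refit=True)
search.fit(X, y)

final_model = search.best_estimator_  # trained on the complete training data
print("best alpha:", search.best_params_)
```

Note that search.best_score_ is the CV score of the winning hyperparameters, which is exactly the kind of slightly optimistic estimate discussed in the last bullet below.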
Bullet-point Q/A
Which of those K sets of learned params should I select?
None of them; they serve only to judge the performance of your model.
Do I retain the params set which gave the least error?
If you are talking about the hyperparameters, yes
Or should I now learn a new model with all training data together
Yes
(in which case, why did I do CV in the first place)?
To get an estimate of what your generalization error was.
Does CV help avoid overfitting?
It does nothing by itself. It can help you spot overfitting, and you can do something about it.
Is this true in all cases or is it true only for specific type of learners (like Decision trees)?
True for all
How does CV avoid overfitting if the above is true in general for any learner?
It does not avoid it by itself; it exposes it, by showing you both the training error and the test error.
Is CV needed only for small datasets or also for large datasets (think big data scale)?
It is valid for all scales
If the purpose of CV is to get a better estimate of the prediction error (and assuming we have a large test set), why shouldn't we just use a fixed training set, and a number of disjoint test subsets, averaging the errors across all test subsets?
Because you will tune the hyperparameters of your model to reduce that test error, which means you will have fitted your model to your test set. If you do this, the "test error" you get is no longer a test error, and it is too optimistic (see the nested cross validation sketch below).
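If you want both at once (tune the hyperparameters and still keep an honest estimate of the generalization error), the usual remedy is nested cross validation: an inner CV loop does the tuning, an outer CV loop measures the error of the whole tuning procedure. A sketch, again with scikit-learn and illustrative data:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)

# Inner loop: hyperparameter tuning by cross validation.
inner = GridSearchCV(Ridge(), param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
                     cv=5, scoring="neg_mean_squared_error")

# Outer loop: each outer test fold is never seen by the tuning procedure,
# so the averaged score estimates the error of "tune, then predict" as a whole.
outer_scores = cross_val_score(inner, X, y, cv=5, scoring="neg_mean_squared_error")
print("nested CV error estimate:", -outer_scores.mean())
```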