There is a very similar question about t-test or nonparametric with very good answers, most of which can be applied to ANOVA. See also this very relevant post: Role of central limit theorem in ANOVA (with an answer by Frank Harrell).
But the practice you are alluding to, choosing which test to apply after seeing the result of a preliminary test of normality, is strongly advised against. If you are not reasonably sure about the normality assumption, choose a test that does not depend on it, at the outset! In R that could be `kruskal.test`.
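As a minimal sketch of what that looks like, here is a Kruskal-Wallis test on the built-in `PlantGrowth` data set. The data set is only a stand-in chosen for illustration (it is not from the question); substitute your own response and grouping factor.

```r
# PlantGrowth ships with base R: plant weight under a control and two treatments.
# It is used here purely as an illustrative example data set.
data(PlantGrowth)

# Kruskal-Wallis rank sum test: no normality assumption on the response.
kw <- kruskal.test(weight ~ group, data = PlantGrowth)
kw

# The p-value can be extracted from the returned "htest" object.
kw$p.value
```

The point is that this test is chosen before looking at the data, not as a fallback after a normality test has rejected.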
That practice is called a multi-step procedure, and it does not in general have the usual properties, so you can no longer trust that the computed p-values are correct. You could of course still do normality tests or look at QQ plots, but use them to learn for the future (you will probably see similar problems again), not to choose the test for the data at hand.
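One way to check how such a multi-step procedure behaves is to simulate it under a true null hypothesis and count how often it rejects. The sketch below is only an illustration under assumptions of my own: the helper `two_stage_p`, the Shapiro-Wilk pretest on the ANOVA residuals, the group size of 10, and the exponential distribution are all hypothetical choices, not something taken from the question.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical helper: pretest normality of the residuals, then pick the test.
two_stage_p <- function(y, g, alpha_pre = 0.05) {
  fit <- aov(y ~ g)
  if (shapiro.test(residuals(fit))$p.value > alpha_pre) {
    summary(fit)[[1]][["Pr(>F)"]][1]   # normality not rejected: ANOVA F test
  } else {
    kruskal.test(y, g)$p.value         # normality rejected: Kruskal-Wallis
  }
}

n_per_group <- 10
g <- factor(rep(c("a", "b", "c"), each = n_per_group))

# The null is true: all groups share the same skewed (exponential) distribution.
p_values <- replicate(2000, two_stage_p(rexp(length(g)), g))

# Estimated type I error rate of the whole procedure at the nominal 5% level.
mean(p_values < 0.05)
```

How far the estimate lands from 0.05 depends on the distribution and the sample sizes. The point is that the level of the combined procedure has to be checked like this; it cannot be read off from the properties of either test alone.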
You could look at the R package WRS (on CRAN), which has modern nonparametric methods. See this expository article by Rand Wilcox: New statistical methods would let researchers deal with data in better, more robust ways.
Your follow-up question: If the sample size is large in all the groups, then you can use the central limit theorem to justify normal-based inference, but not to justify normality of the residuals. The CLT is about means, not about individual random variables. But then the analysis will be based only on large-sample approximations, which can only guarantee approximately correct significance levels. An alternative analysis could well give a much more powerful test, so if you are not (from theory or experience) reasonably sure about the normality assumption, it is better to plan to use tests that do not depend on it.
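To see that the CLT is about means and not about individual values, a small simulation can compare the skewness of raw observations with the skewness of sample means. This is a sketch under assumed settings: the exponential distribution, the sample size of 50, and the helper `skewness` are illustrative choices of mine.

```r
set.seed(2)  # arbitrary seed, for reproducibility only

# Hypothetical helper: moment-based sample skewness (0 for a symmetric sample).
skewness <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

n <- 50
raw   <- rexp(5000)                       # individual observations
means <- replicate(5000, mean(rexp(n)))   # means of samples of size n

# Skewness of individual values versus skewness of sample means.
c(raw = skewness(raw), means = skewness(means))
```

In theory the exponential distribution has skewness 2, while the mean of n independent draws has skewness 2/sqrt(n), about 0.28 for n = 50. The means are close to normal, but the individual observations (and hence the residuals) stay just as skewed however large the sample gets.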