How to analyze a randomized complete block design (block, plot, year) when assumptions of normality and homoscedasticity are violated?

Question

How do you analyze a randomized complete block design (6 plots within each of 4 blocks, data collected over three years) when the assumptions of normality and homoscedasticity are violated?

Is this correct? Use a non-parametric Kruskal-Wallis test when the data violate the assumption of normality but not homoscedasticity. Use Welch’s analysis of variance (ANOVA) when the data are heteroscedastic. And when both normality and homoscedasticity are violated, do the data need to be transformed prior to statistical analysis by ANOVA?

My first thought would be to choose a more suitable parametric model (before collecting the data). What's the response measuring? My second thought might be to consider a bootstrapping approach to estimating quantities of interest from some suitable pivotal or pivot-like quantity (which again depends on what you're looking at) — Glen_b, Aug 27 '21 at 03:48
My third thought would be to look at transformation, but the same considerations would apply as before (considering the nature of the variable and not choosing on the basis of the specific observations you want to use to fit your model) — Glen_b, Aug 27 '21 at 04:17
If another distribution is theoretically a good fit (e.g., binomial, inverse Gaussian, ...), an alternative might be to use a GL(M)M. — KrisBae, Aug 27 '21 at 08:00
@Glen_b we are measuring the response of fertilizer (5 treatments and 1 control) on crop yield (data collected once annually; three times total) and water quality (collected during storm events). — user333304, Aug 27 '21 at 13:54
Depending on the crop, yields might be either right-skew or left-skew or fairly symmetric. Weather events might even lead to zero-inflated mixtures. How did you assess normality, given that the distribution will be different within blocks and years? — Glen_b, Aug 27 '21 at 17:49
What values did you supply to the Shapiro-Wilk test? Did your Levene test account for both blocks and years? — Glen_b, Aug 30 '21 at 16:11
@Glen_b For example, I supplied the yields to the Shapiro-Wilk test in R: shapiro.test(Yield$Mg_ha). I also used leveneTest(Mg_ha ~ Trt, Yield) without including block and year, and then also separated the data by year and re-ran the Shapiro-Wilk and Levene tests separately for each year. — user333304, Aug 30 '21 at 17:53
Thanks. You can't use raw responses aggregated across one or both factors, since the assumption applies within both. Neither is it practical to check within each factor combination. Instead you would look at residuals from the full model. However, *testing* these assumptions is not helpful on multiple grounds. See for example https://stats.stackexchange.com/q/2492/805 — Glen_b, Aug 31 '21 at 00:00
E.g. Harvey Motulsky's answer, though there's a great deal more that could be said. Similar issues apply for checking heteroskedasticity. — Glen_b, Aug 31 '21 at 00:06
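
To make the residual-based check from the comments concrete, here is a minimal sketch in plain Python (standard library only). It assumes a balanced layout of 4 blocks x 6 treatments x 3 years and a fixed-effects model with a block term plus treatment-by-year cell means. The simulated yields and the helper name `rcbd_residuals` are made up for illustration; they are not the poster's data or code. The names `Mg_ha` and `Trt` mirror the thread, while `Block` and `Year` are assumed column names. Because the same plots are measured every year, residuals may be correlated across years, which this simple sketch ignores.

```python
import random
from collections import defaultdict
from statistics import mean, stdev


def rcbd_residuals(rows):
    """Residuals from the fixed-effects model  y ~ block + trt * year.

    For a balanced design (every block holds every treatment in every year)
    the least-squares fit reduces to
        fitted = mean(trt-year cell) + mean(block) - grand mean,
    so no matrix algebra is needed. Unbalanced data would need a real model fit.
    """
    grand = mean(r["Mg_ha"] for r in rows)
    by_block, by_cell = defaultdict(list), defaultdict(list)
    for r in rows:
        by_block[r["Block"]].append(r["Mg_ha"])
        by_cell[(r["Trt"], r["Year"])].append(r["Mg_ha"])
    block_mean = {k: mean(v) for k, v in by_block.items()}
    cell_mean = {k: mean(v) for k, v in by_cell.items()}

    out = []
    for r in rows:
        fitted = cell_mean[(r["Trt"], r["Year"])] + block_mean[r["Block"]] - grand
        out.append({**r, "fitted": fitted, "resid": r["Mg_ha"] - fitted})
    return out


# Simulated yields (ASSUMPTION: illustrative only, not the poster's data).
random.seed(1)
rows = [
    {"Block": b, "Trt": t, "Year": y,
     "Mg_ha": 5.0 + 0.3 * t + 0.5 * b + random.gauss(0, 1)}
    for b in range(1, 5) for t in range(1, 7) for y in range(1, 4)
]
fit = rcbd_residuals(rows)

# Informal look at spread: residual SD per treatment, from model residuals
# rather than from raw yields pooled over blocks and years.
resid_by_trt = defaultdict(list)
for r in fit:
    resid_by_trt[r["Trt"]].append(r["resid"])
for trt in sorted(resid_by_trt):
    print(f"Trt {trt}: residual SD = {stdev(resid_by_trt[trt]):.3f}")
```

The `fitted` and `resid` values are what one would plot against each other (and in a normal Q-Q plot) to judge spread and shape visually, in line with the comment that formally testing these assumptions is not helpful.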

0 Answers