Split plot design with pseudoreplicates

Question

I am working on a project and need to analyze data that I know is an example of a split plot design but I am having trouble setting the model up correctly.

Here's the situation: A bakery is testing cookies. They are interested in how sugar level and freshness affect taste ratings (on a scale of 1-10). They bake cookies at three sugar levels (1/2 of what the recipe calls for, the regular amount, and double what the recipe calls for). The bakery bakes 9 batches of cookies and each batch is randomly assigned to one of the 3 sugar levels. From each batch they hand out 5 cookies to 5 random customers on each of 3 days (day 1 is the day the cookies are baked, day 2 the following day, and day 3 the day after that). So we have information on 135 cookie ratings.
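To make the layout concrete, here is a minimal sketch in Python (not Minitab or SAS, just to keep it self-contained) that enumerates one row per rating. The assignment of batches 1-3, 4-6 and 7-9 to the three sugar levels is a placeholder I made up; in the experiment the assignment was random.

```python
from itertools import product

SUGAR_LEVELS = ["half", "regular", "double"]
N_BATCHES = 9
DAYS = [1, 2, 3]
CUSTOMERS_PER_DAY = 5

# Placeholder assignment (hypothetical): batches 1-3 -> half,
# 4-6 -> regular, 7-9 -> double. The real assignment was randomized.
sugar_of_batch = {b: SUGAR_LEVELS[(b - 1) // 3] for b in range(1, N_BATCHES + 1)}

# One row per rating: every batch is tasted by 5 customers on each of 3 days.
rows = [
    {"batch": b, "sugar": sugar_of_batch[b], "day": d, "customer": c}
    for b, d, c in product(
        range(1, N_BATCHES + 1), DAYS, range(1, CUSTOMERS_PER_DAY + 1)
    )
]

batch_day_cells = {(r["batch"], r["day"]) for r in rows}
print(len(rows))             # 135 ratings in total
print(len(batch_day_cells))  # 27 batch-by-day cells of 5 ratings each
```

Sugar level varies only between batches (the whole plots), while day varies within each batch (the split plots).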

I know that the 5 cookies per batch are pseudoreplicates so I averaged those to come up with 3 average taste ratings per batch - one for each freshness level (day 1, day 2, and day 3).

I am trying to analyze this using a mixed model with average taste rating as a response variable, sugar level and freshness as random factors and batch(sugar level) as a covariate. But my model isn't coming out correctly. Thus far I am working in Minitab but I could also use SAS.

You should use the raw data, not be averaging, I think. Maybe start out with some graphics. Can you post (a link to) the data? — kjetil b halvorsen, Dec 06 '19 at 04:57

kjetil b halvorsen · Answer 1 · 2019-12-18T20:58:35.430

You should model the raw data as is, not replace it with averages. Also, these are not really pseudoreplicates: the five cookies are given to five different persons, yes? And since the response is the rating, not some measured characteristic of the cookies, the variation in the ratings is what is relevant.

Before going into the split-plot model, some comments on the design.

Design of experiments for food sensory research is a very specialized field, some links. By only asking some persons to give the product a rating, how do you know they interpret/use the scale the same way? Maybe it would have been better to ask some more specific questions, or even better, to use a design where the same persons were asked to compare/evaluate different variants of the cookies ...

So you have a data file with 135*5=675 observations, in a format something like:

rating    batch   level    day
  .         1       1       1
  .         1       1       1
  .
  .
  .
  .         1       1       2 
  .

where rating is a numerical variable, the others are factors. Batch with 9 levels, level with 3 (sugar) levels, day with 3 levels.

There is a nesting structure batch/day and we model level (the focus variable) as a fixed effect. Maybe we are also interested in day as a fixed effect. I do not know about SAS but in R with package lme4 we could say:

library(lme4)
mod  <-  lmer(rating ~ level + day + (1 | batch/day), data=your_data_frame)

The notation (1 | batch/day) can be read as: within batch, and then, for each batch, within day. The 1 before the bar stands for a constant (an intercept), so this gives a set of random intercepts with that nested structure. In lme4 it is shorthand for (1 | batch) + (1 | batch:day).
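As a rough illustration of what that nesting means for the grouping levels (plain Python, not lme4, and assuming the 9 batches and 3 days described in the question), the two random-intercept terms would have these numbers of levels:

```python
BATCHES = range(1, 10)  # 9 batches, as in the question
DAYS = (1, 2, 3)

# Levels of the outer grouping factor: one random intercept per batch.
batch_levels = sorted(set(BATCHES))

# Levels of the nested grouping factor: one random intercept per
# batch-by-day combination, i.e. "day within batch".
batch_day_levels = sorted({(b, d) for b in BATCHES for d in DAYS})

print(len(batch_levels))      # 9
print(len(batch_day_levels))  # 27
```

So the nested term does not give 3 day intercepts shared across batches; day 1 of batch 1 and day 1 of batch 2 get separate random constants.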

There are some similar questions, so look at split-split plot design with unbalanced repeated measures in lme4 or nlme (SAS translation), Split plot in time mixed-effect model in R, Cheat Sheet ANOVA Alphabet Soup & Regression Equivalents

EDIT

Trying to answer the question in the comment by @MichiganWater. It seems to me that whether this is pseudoreplication depends on the goal of the analysis. If the goal is to ascertain some objective property of the cookies, then the five people trying cookies from the same batch is pseudoreplication. But the OP speaks about taste ratings without further explanation, and taste is not an objective property of the cookies, it is an interaction between the cookie and the person eating it. As an example, I, as part of my Asperger's, have sensory hypersensitivity, and for me food tastes the same if cooked without salt as with (unless really too much). That would probably make me an outlier in a sensory experiment. I don't know how large the variation in subjective taste is, but it is there. So if subjectively felt taste is the objective, then variation between persons should be relevant.

But, using the mixed model analysis I proposed, maybe this does not make a large difference. The mixed model analysis estimates the variance at each level (block(level)/day/person), and that information could be interesting itself. But in testing the effect of level, only the variance from the level below would be used. I quote from Casella "Statistical Design" (page 5)

This is an example of a nested design, where Tanks are nested in Diets and Fish are nested in Tanks. In such designs the testing is straightforward – the nested factor provides the error mean square for the factor in which it is nested. (See Section 1.5.) Of course, we can test the significance of tanks using MS(Tank)/MS(Fish), but this is wasted effort. There is typically no interest in assessing the significance of tanks; they are merely there to hold the fish!

So in using the modern mixed model formulation, maybe the question about pseudoreplication is taken care of automatically? I would like to look at this with a simulated example, but that will have to wait. Will think more about this. But anyhow, the original data should be used, and not the reduction to summaries, since they are more informative.

Hi kjetil. I disagree that the 5 cookies are not pseudoreplicates. I have given more explanation in my Answer to Megan, but I'd like to know whether you agree or disagree, given my rationale. Thanks. — MichiganWater, Dec 17 '19 at 23:42
(1.25 years later!) I just ran across this question today on a web search and thought "this seems oddly familiar"...then I saw the answers, haha! For some reason I missed that you added more in an Edit, responding to my question. Thanks for doing that. I want to spend more time thinking through what you've said, but in the meantime I noticed you said 675 observations. I figured only 135 observations (9 batch * 3 day * 5 people) so I'm wondering if there's a disconnect here regarding the design proposed by the OP? (3 batches/sugar level) — MichiganWater, Apr 04 '21 at 19:39

score 1 · Answer 2 · answered Dec 17 '19 at 23:41

I know nothing of SAS, but if you are using Minitab to try and analyze this dataset, the first thing to note is that Batch is not a covariate according to Minitab's definition (a covariate is a continuous variable). Batch is a factor. In addition, Batch will be nested within Whole Plots. I've given some code below that should generate the design with some simulated (fake) data, along with the analysis using Minitab 18's General Linear Model tool, which I assume you're using. See if that helps you to understand how to structure things. If you're not used to using session commands, let me know and I'll put more description around using the GUI.

I disagree with kjetil regarding whether the 5 cookies should be considered pseudoreplicates. I think they are pseudoreplicates. Before going any further I do want to acknowledge that this gets complicated real quick because of the 5-cookies-to-5-different-people, so it's certainly an easy mistake to make. I'm pretty sure I'm correct, as I'll explain below, but I'm certainly open to arguments the other way.

Now, on to the issue of pseudoreplication. Because the five cookies were produced by the same process at the same time, any variations in the process steps will be common to those five cookies (actually to all 15 cookies that are baked together). Let's say that the oven lost power for a short period of time while it was baking batch #4. The variation would be common to all 5 cookies, and so they are not independent of each other. Those cookies baked together do not capture the batch-to-batch variation, which would be required to consider the cookies independent. The only way to make them independent would be to bake each individual cookie by itself. That's why I see them as pseudoreplicates.
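A small arithmetic sketch of why this matters, in Python. It assumes a simple model in which each rating is a shared batch effect plus independent cookie/person noise; the two variance values are made up for illustration. Under that assumption the variance of a batch average is var_batch + var_cookie / n, so tasting more cookies from the same batch never removes the batch-to-batch part.

```python
def var_of_batch_mean(var_batch, var_cookie, n_cookies):
    """Variance of the mean of n cookies that all share one batch effect.

    Assumes rating = batch effect + independent cookie/person noise.
    """
    return var_batch + var_cookie / n_cookies

# Made-up variance components, for illustration only.
VAR_BATCH = 1.0
VAR_COOKIE = 4.0

results = {n: var_of_batch_mean(VAR_BATCH, VAR_COOKIE, n) for n in (1, 5, 50, 500)}
for n, v in results.items():
    print(n, round(v, 3))
```

The variance shrinks toward VAR_BATCH but not below it, which is why the number of batches, not the number of cookies per batch, drives the precision of the sugar comparison.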

What about the fact that each cookie is given to a different person? To me the only thing that makes sense is to consider that as simply part of the measurement noise, and it's just rolled into the uncertainty around your estimates (e.g. the confidence interval widths).

For anyone reading this who doesn't have access to Minitab, the table of factors and response should look like this:

WP  SugarHTC  Day  AvgResponse
 1       0.5    1      3.71173
 1       0.5    2      6.25209
 1       0.5    3      3.81954
 2       1.0    1      4.58830
 2       1.0    2      5.13217
 2       1.0    3      5.15029
 3       2.0    1      4.78808
 3       2.0    2      6.75095
 3       2.0    3      4.19827
 4       0.5    1      5.76470
 4       0.5    2      4.94079
 4       0.5    3      5.97733
 5       1.0    1      7.28974
 5       1.0    2      3.73834
 5       1.0    3      4.80945
 6       2.0    1      5.29424
 6       2.0    2      6.84974
 6       2.0    3      4.09316
 7       0.5    1      6.11799
 7       0.5    2      3.77689
 7       0.5    3      4.33160
 8       1.0    1      4.17665
 8       1.0    2      3.73358
 8       1.0    3      6.10543
 9       2.0    1      4.61165
 9       2.0    2      5.01459
 9       2.0    3      6.66854

Also, you could analyze the raw data, assuming you get the multilevel structure correct, but as long as the number of cookies in that lowest level (the groups of five) is identical for each treatment condition, then you can use the averages and the results of the analysis will be the same.
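Here is a minimal sketch of that equivalence in Python (standard library only, so no Minitab needed). It simulates ratings with made-up effect sizes, then computes the F ratio for sugar against batch-within-sugar twice: once from the 135 raw ratings and once from the 27 batch-by-day averages. It only checks the whole-plot test for sugar and is not a full split-plot analysis; the helper name whole_plot_f is my own.

```python
import random
from statistics import mean

random.seed(1)

SUGARS = ("half", "regular", "double")
BATCHES_PER_SUGAR, DAYS, PEOPLE = 3, 3, 5

# Simulated ratings with made-up effect sizes (not the bakery's data).
raw = []  # rows of (sugar, batch, day, rating)
batch = 0
for sugar in SUGARS:
    for _ in range(BATCHES_PER_SUGAR):
        batch += 1
        batch_effect = random.gauss(0, 0.5)  # shared by the whole batch
        for day in range(1, DAYS + 1):
            for _ in range(PEOPLE):
                raw.append((sugar, batch, day, 5 + batch_effect + random.gauss(0, 1)))


def whole_plot_f(rows):
    """F for sugar, tested against batch-within-sugar (whole-plot error)."""
    grand = mean(r[3] for r in rows)
    ss_sugar = ss_batch = 0.0
    n_batches = 0
    for sugar in SUGARS:
        sugar_rows = [r for r in rows if r[0] == sugar]
        sugar_mean = mean(r[3] for r in sugar_rows)
        ss_sugar += len(sugar_rows) * (sugar_mean - grand) ** 2
        for b in sorted({r[1] for r in sugar_rows}):
            batch_vals = [r[3] for r in sugar_rows if r[1] == b]
            ss_batch += len(batch_vals) * (mean(batch_vals) - sugar_mean) ** 2
            n_batches += 1
    df_sugar = len(SUGARS) - 1
    df_batch = n_batches - len(SUGARS)
    return (ss_sugar / df_sugar) / (ss_batch / df_batch)


# Collapse each batch-by-day cell of 5 ratings to its average (27 rows).
cells = {}
for sugar, b, day, y in raw:
    cells.setdefault((sugar, b, day), []).append(y)
averaged = [(s, b, d, mean(ys)) for (s, b, d), ys in cells.items()]

f_raw = whole_plot_f(raw)
f_avg = whole_plot_f(averaged)
print(len(raw), len(averaged))
print(f_raw, f_avg)  # equal up to floating-point rounding
```

Both sums of squares shrink by the same factor of 5 when averaging, so the ratio is unchanged. If the cells had unequal numbers of ratings, that cancellation would no longer hold.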

Stephen Senn recently covered this tangentially in this article:

Now consider a second equivalent analysis of this. This just uses the average at baseline and outcome per hall. In other words, it is based on 20 pairs of values (baseline and outcome) not 200. This analysis produces the table in Figure 2. Note that the result is exactly as before, showing the irrelevance of the variances and covariances within halls. That is to say, that although the mean squares change, because now based on averages of 10 students per hall, the ratio of the term for treatment to its residual is the same and so are all inferences. The equivalence of summary measure approaches to more complex models for certain balanced cases is well known (Senn, S. J. et al., 2000).

https://errorstatistics.com/2018/11/22/stephen-senn-on-the-level-why-block-structure-matters-and-its-relevance-to-lords-paradox-guest-post/

Here is the code for running Minitab's session commands:

name c1 "WP" c2 "SugarHTC" c3 "Day" c4 "AvgResponse"

Set 'WP'
  1( 1 : 9 / 1 )3
  End.
Set 'SugarHTC'
  3( 0.5 1 2 )3
  End.
Set 'Day'
  9( 1 2 3 )1
  End.

Random 27  'AvgResponse';
  Normal 5 1.0.

GLM;
  Response 'AvgResponse';
  Nodefault;
  Categorical 'WP' 'SugarHTC' 'Day';
  Nest 'WP'('SugarHTC');
  Random 'WP';
  Terms 'WP' 'SugarHTC' 'Day' 'SugarHTC'*'Day';
  TMethod;
  TAnova;
  TSummary;
  TCoefficients;
  TEquation;
  TFactor;
  TEMS;
  TVariance;
  TDiagnostics 0;
 GFOURPACK.

One final thing to consider is the distribution of the response (at each treatment condition). Both the raw data and the averages will be bounded between the upper and lower limits of the scale used for the ratings, and that could introduce problems with assuming Normality for the analysis. For cookies made at the normal sugar level, on Day 1, are they almost all at the highest rating, or at least very close to it? Or for 2x sugar and 3 days later, are they all near the bottom of the ratings? If so, you might have issues with floor and ceiling effects. If that's the case, I'd recommend opening a new Question to elicit responses from others who are more familiar with how to deal with that and give advice specific to your situation.
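A quick way to look for this is to tabulate, per sugar-by-day condition, the share of ratings sitting exactly at the ends of the scale. The sketch below is in Python and uses a handful of invented ratings purely to show the bookkeeping; the 1-10 limits come from the question, and the function name share_at_limits is my own.

```python
from collections import defaultdict

SCALE_MIN, SCALE_MAX = 1, 10  # rating scale from the question

# Invented ratings for illustration: (sugar, day, rating).
ratings = [
    ("regular", 1, 10), ("regular", 1, 10), ("regular", 1, 9),
    ("regular", 1, 10), ("regular", 1, 8),
    ("double", 3, 1), ("double", 3, 2), ("double", 3, 1),
    ("double", 3, 1), ("double", 3, 3),
    ("half", 2, 5), ("half", 2, 6), ("half", 2, 4),
    ("half", 2, 7), ("half", 2, 5),
]


def share_at_limits(rows):
    """Per (sugar, day) cell: fraction of ratings at the floor and at the ceiling."""
    by_cell = defaultdict(list)
    for sugar, day, rating in rows:
        by_cell[(sugar, day)].append(rating)
    return {
        cell: (
            sum(r == SCALE_MIN for r in vals) / len(vals),
            sum(r == SCALE_MAX for r in vals) / len(vals),
        )
        for cell, vals in by_cell.items()
    }


shares = share_at_limits(ratings)
for cell, (floor_share, ceiling_share) in shares.items():
    print(cell, floor_share, ceiling_share)
```

Conditions where a large share of ratings pile up at either limit are the ones where a Normal-theory analysis is most likely to be misleading.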
