
This sounds like an easy setup, and I'm sure it's not too complicated, but somehow I can't figure out how to approach it.

The setting is as follows: there are 6 treatments with 16 subjects in each treatment. Weight is measured for every subject before and after treatment.

The hypothesis to be tested: is there a difference between treatments in effect (the change in weight from before to after treatment)?

Very simple repeated measures design, but the thing is that the subjects are not identified. You only know that they are the same subjects, but you can't fit a repeated measures model because there is no ID variable.

Up till now, I constructed the mean difference as the difference between the means at t=0 and t=1 for every tank. Calculating the SE for each of these differences allows me to do a series of t-tests and apply a Dunn-Sidak correction afterwards. But this is cumbersome and doesn't feel right.
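
A minimal sketch of that t-test approach, with simulated data standing in for the real tanks (the data frame `d` and all column names are hypothetical, and the SE uses the independent-samples formula since the before/after weights can't be paired):

```r
# Hypothetical data: columns weight, treatment (6 levels), time (0/1)
set.seed(1)
d <- expand.grid(fish = 1:16, treatment = factor(1:6), time = c(0, 1))
d$weight <- 50 + 2 * d$time * as.numeric(d$treatment) + rnorm(nrow(d), sd = 3)

# Mean difference and its SE per treatment
stats <- do.call(rbind, lapply(split(d, d$treatment), function(g) {
  w0 <- g$weight[g$time == 0]; w1 <- g$weight[g$time == 1]
  data.frame(diff = mean(w1) - mean(w0),
             se   = sqrt(var(w0)/length(w0) + var(w1)/length(w1)))
}))

# Pairwise tests between treatments (normal approximation), Sidak-corrected
k <- nrow(stats); pairs <- combn(k, 2)
p <- apply(pairs, 2, function(ij) {
  i <- ij[1]; j <- ij[2]
  z <- (stats$diff[i] - stats$diff[j]) / sqrt(stats$se[i]^2 + stats$se[j]^2)
  2 * pnorm(-abs(z))
})
p.sidak <- 1 - (1 - p)^length(p)  # Dunn-Sidak correction
```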

Alternatively, I can subtract the mean at t=0 for each treatment and do a classical one-way ANOVA. But I am not sure how I should correct the variances to account for the fact that I'm not subtracting a fixed number, but a sample mean.
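
A sketch of this centering approach, again with simulated data (the data frame `d` and its columns are hypothetical):

```r
# Hypothetical data: columns weight, treatment (6 levels), time (0/1)
set.seed(1)
d <- expand.grid(fish = 1:16, treatment = factor(1:6), time = c(0, 1))
d$weight <- 50 + 2 * d$time * as.numeric(d$treatment) + rnorm(nrow(d), sd = 3)

# Subtract each treatment's t=0 mean from its t=1 weights
base <- tapply(d$weight[d$time == 0], d$treatment[d$time == 0], mean)
post <- subset(d, time == 1)
post$centered <- post$weight - base[post$treatment]

# One-way ANOVA on the centered weights; note the residual variance
# treats the t=0 means as fixed, which is exactly the correction at issue
fit <- aov(centered ~ treatment, data = post)
summary(fit)
```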

What is the appropriate model to use here, and if possible, where do I find that function in R?

It is NOT aov(weight ~ treatment*time), as this will also make comparisons between, e.g., t=0/treatment=1 and t=1/treatment=3. That's not what I want.

Joris Meys
  • Can you distinguish what pre-treatment weight is assigned to each different treatment group? (i.e. I know these 16 observations are pre-treatment weights for treatment A). Also were treatments randomly assigned? – Andy W Oct 07 '10 at 13:03
  • The 16 observations are 16 fish in a tank. The treatment is given to the tank, so the 16 fish in the tank were weighed before and after treatment. But as the fish weren't marked, it's impossible to link the weights to a specific individual. – Joris Meys Oct 07 '10 at 13:08
  • Your issues with independence of observations go beyond repeated measurements. Your 16 fish within a tank are not independent, and the treatment was applied to the tank. You do not have 16 independent pre and post observations. In fact, you have 6 observations assigned to one of 6 treatments. Any analysis that uses the fish as the analytical unit will grossly overestimate your degrees of freedom. At the very least, you need to account for the nested structure and the intra-tank correlations. Needless to say, you need to increase your N, which is 6 right now (see pseudo-replication). – Brett Oct 07 '10 at 14:41
  • @Brett: I am aware that my observations aren't independent, but I don't agree with your statement that N=6. That would mean a clinical study involving 4 hospitals would have N=4, which obviously doesn't make sense. It's impossible to give fish in the same tank different treatments, as the treatment is in the water. The 16 fish are independent, but the measurements at t=0 and t=1 aren't. I correct for this by calculating the df using Satterthwaite and then dividing them by 2, which gives a df of approx. 14 for the t-tests. – Joris Meys Oct 07 '10 at 15:09
  • In your hospital example, it depends on how the patients are randomized. If the hospitals are assigned to the treatments, then your N is the number of hospitals. If patients are randomly assigned within hospitals, then your N is the number of patients. Again, I refer you to the literature on pseudoreplication which you'll find easily. Once you say "the treatment is given to the tank" you are done. Level of treatment implies appropriate level of analysis. Here is the original article http://www.masterenbiodiversidad.org/docs/asig3/Hurlbert_1984_Pseudoreplication.pdf There's plenty more. – Brett Oct 07 '10 at 15:31
  • @Brett Magill - I completely disagree with the statement "Level of treatment implies appropriate level of analysis". Your units of analysis will be determined both by the question at hand and restrictions on your data. Now I agree that this design causes nesting complications, but it doesn't make it unreasonable to make assumptions about the independence of units within treatment groups and make adjustments to degrees of freedom accordingly. Your statement is essentially refuting the methodology of multi-level models and that entire body of work (as well as any non-experimental study.) – Andy W Oct 07 '10 at 16:45
  • Notice, I said implies rather than dictates or controls. Also notice the mention that you need to at least account for intra-tank correlation--a nod to a multi-level approach. In fact, in this case, this is exactly what multi-level models will do--penalize the d.f. according to the degree of intra-class correlation, effectively adjusting the N downward to account for the structure. By the way, I've got slides from George Casella's Experimental Design session at ASA that talk about this specific problem--randomizing tanks rather than fish--as an example of pseudo-replication. – Brett Oct 07 '10 at 17:00
  • If you don't want to read the original article and the ensuing literature that I cited previously, take a look at this 1.5 page writeup from the statistical consulting unit at Cornell. It's pretty easy to follow. http://cscu.cornell.edu/news/statnews/stnews75.pdf – Brett Oct 07 '10 at 17:24
  • @Brett: I did read the article and I understand what you're getting at, but this is not an ecological experiment. Following Hurlbert, each fish should have its own tank, and the water in these tanks should not come from the same source. But the experiment done here is not equivalent to the ecological studies he describes. Plus, this approach is impossible if you can't collect water from 192 different lakes... Furthermore, his paper is not a statistical result, but a logical (and in a number of cases valid) argument that not everybody agrees upon. – Joris Meys Oct 07 '10 at 20:11
  • Ok, how about Casella in his Statistical Design of Experiments book. Here's a link to the relevant page in Google Books. http://books.google.com/books?id=sqnbSUtryVAC&pg=PA5&lpg=PA5&dq=casella+pseudoreplication&source=bl&ots=634YgjM_h7&sig=hf8C_vWd03hELis6bC-nxyltTWA&hl=en&ei=3ymuTOeDJdT-nAfis4DuBQ&sa=X&oi=book_result&ct=result&resnum=2&ved=0CBYQ6AEwAQ#v=onepage&q&f=false – Brett Oct 07 '10 at 20:14
  • @Brett: As I said, it is a point of view. One can easily argue that every fish has its own physiology. Biologically speaking, adding the weights of all fish in one tank doesn't even make sense, as you have males and females, different age groups and the like. If I had done the experiment, I would have marked the fish so it would be a truly repeated-measures design. But having 4 separate tanks filled with exactly the same water and given exactly the same treatment does not mean to me that I suddenly have more degrees of freedom. It merely means I have split one tank into 4. – Joris Meys Oct 07 '10 at 20:19
  • @Brett: The main idea behind Hurlbert is that, theoretically speaking, you can't distinguish between the effect of "tank" and the effect of "treatment". Practically, it has been shown in the lab that the procedures used do not cause a "tank" effect beyond the natural variation between fish. Hence... Edit: for the record, I do appreciate your input; it makes me think about the setup, and it gives me something to tell the lab people so they never repeat this setup again :-) – Joris Meys Oct 07 '10 at 20:31
  • What sort of variance correction were you thinking about? Any error in estimate as a consequence of using the sample mean will result in increased variance in the difference scores. So isn't the variance already adjusted for? If anything it is going to be high. – russellpierce Oct 15 '10 at 16:05

2 Answers


First, since you do not know which subjects are which, I think it makes more sense to treat the two groups of observations (before treatment and after treatment) as separate samples (i.e. as if you had randomly assigned treatment to 96 out of 192 subjects). I know this seems obvious, but for heuristic purposes I think it helps clarify the question at hand. While this isn't optimal (regression to the mean), I would say you are still better off than many observational studies (assuming no self-selection into treatment).

My initial thought was that if treatments are randomly assigned, you might as well treat all the pre-treatment observations as one big control group. You could then use an OLS framework with dummy variables to estimate the treatment effects (with pre-treatment as the reference category). If you know which pre-treatment observations go with each treatment, you could run an ANOVA to see if any non-negligible differences exist between the mean weights of the pre-treatment groups. If there are differences in the pre-treatment groups that are not ignorable, you should be able to use a multilevel framework.

Since the fish share tanks, a multilevel framework may be appropriate anyway.
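
A sketch of the dummy-variable OLS idea above, using simulated data (the data frame `d` and all names are hypothetical):

```r
# Hypothetical data: columns weight, treatment (6 levels), time (0/1)
set.seed(1)
d <- expand.grid(fish = 1:16, treatment = factor(1:6), time = c(0, 1))
d$weight <- 50 + 2 * d$time * as.numeric(d$treatment) + rnorm(nrow(d), sd = 3)

# Pool all 96 pre-treatment fish into one reference group "pre"
d$group <- factor(ifelse(d$time == 0, "pre", as.character(d$treatment)),
                  levels = c("pre", levels(d$treatment)))

# Each coefficient is a treatment mean minus the pooled pre-treatment mean
fit <- lm(weight ~ group, data = d)
summary(fit)$coefficients
```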

Andy W
  • I've been thinking about that as well, but the initial mean weight differs significantly between tanks, and using all t=0 observations as one control makes for a very unbalanced design. On top of that, the hypothesis is formulated in difference between treatments, which cannot be formally tested using a control as reference group. The multi-level framework doesn't help me either for the same reason. – Joris Meys Oct 07 '10 at 13:55
  • I don't see why the unbalanced design is a problem, but the mean weight difference is obviously a problem with my suggestion. I would still think the multi-level framework would allow you to assess differences between treatments somehow (I do not know how offhand though). The only other thing I can think of would be to simply graph each control/comparison like this response did (minus the connecting lines) http://stats.stackexchange.com/questions/2067/follow-up-in-a-mixed-within-between-anova-plot-estimated-ses-or-actual-ses/2138#2138 , although I imagine this is not entirely satisfactory. – Andy W Oct 07 '10 at 14:10

The one-way ANOVA approach you mention sounds fine to me. Sure, the individual change scores aren't going to be the "true change" by any means, but they are better than nothing. If anything, the resulting error variance in the model should be overestimated as a consequence of this procedure.

In R the easiest way to do ANOVA in simple designs (IMO) is to use ezANOVA (package ez). E.g. ezANOVA(data, dv=.(WeightChange), wid=.(PseudoSubjID), between=.(Treatment))

I can't say much about the implementation, but another approach might be to find the optimal set of paired scores such that the difference between scores is minimized, and then treat that pairing as if it were the true pairing. I want to say that approach should be conservative, minimizing the differences between t0 and t1, but your mileage may vary.
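
One way to construct such a pairing: sorting both weight vectors and matching by rank minimizes the sum of squared within-pair differences (a rearrangement-inequality result). A sketch with hypothetical weight vectors:

```r
# Hypothetical before/after weights for one tank (no individual IDs)
set.seed(1)
w0 <- rnorm(16, mean = 50, sd = 3)  # before treatment
w1 <- rnorm(16, mean = 54, sd = 3)  # after treatment

# Rank-matched pseudo-pairing: pair smallest with smallest, etc.
change <- sort(w1) - sort(w0)

# Then treat the pseudo-pairs as if they were true pairs
t.test(change)
```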

russellpierce