1

Given a set of samples that are measured before and after treatment.

The size of the treatment effect depends on the measurement value before treatment.

However, the number of measurements per sample, and the number of measurements before and after treatment vary in the set.

How do I estimate the coefficients $a$ and $b$ in the relationship $y = ax^b$ in this case?

(I'm not really asking about a power function regression analysis, more how to handle the odd number of data points before and after)

Here is some R code to exemplify the problem:


set.seed( 100 )

x.design <- factor(c(
    1,1,1, 2, 3,3,  4,4,4,4
)) # each sample measured different number of times before

## generate a random profile for 4 samples
x0 <- runif(nlevels(x.design),max=10)

## add 10% CV to each measurment and expanding per the design
x <- unlist( sapply( seq_along(x0), function(i){
    nn <- summary(x.design)[i]
    xi <- x0[i]
    rnorm( nn, mean=xi, sd=.1*xi )
}))

## two parameters to estimate in a y = a * x^b relationship:
a <- rnorm(1,mean=1, sd=.2)
b <- rnorm(1,mean=1.5, sd=.2)

## generate some y's along the same lines as we did with x:
y0 <- a*(x0**b)

y.design <- factor(c(
    1, 2,2,2,2,2, 3,3,3,  4,4
))

## add noise in the same way
y <- unlist( sapply( seq_along(x0), function(i){
    nn <- summary(y.design)[i]
    yi <- y0[i]
    rnorm( nn, mean=yi, sd=.1*yi )
}))

## graph the relationship using sample means:
plot( by( x, x.design, mean ), by( y, y.design, mean ) )
curve( a*x^b, add=T, col="blue" )

## data combine in a data.frame
rbind.data.frame(
    data.frame(
        Sample = x.design,
        Value = x,
        Treatment = "before"
    ),
    data.frame(
        Sample = y.design,
        Value = y,
        Treatment = "after"
    )
)

EDIT: Added data.frame to combine both x and y

Now I could just use the mean per sample per treatment, and it wouldn't be very wrong, but it would be nice to take into account the power each sample lends to the estimation, seeing how they have different number of measurements.

I suspect this kind of problem must have been presented before, but I haven't been able to search for the right terms I guess.

Robert Long
  • 53,316
  • 10
  • 84
  • 148
Sirius
  • 126
  • 4
  • If you fit a mixed effects model then it doesn't matter if the data are unbalanced. – Robert Long Mar 05 '21 at 21:38
  • Then how would you set this up? Remember these are the same samples being measured before and after treatment – Sirius Mar 05 '21 at 22:18
  • random intercepts for the sample ID (or whatever grouping factor you have repeated measures for) – Robert Long Mar 05 '21 at 22:40
  • I agree, that would make sense, I still can't see how to set it up to solve for $a$ and $b$ in my case – Sirius Mar 07 '21 at 08:22
  • You specify the sample ID as a grouping variable. For example in `lme4` and some other R packages you would use `(1 | sampleID)` – Robert Long Mar 07 '21 at 12:04
  • I know, could you please show how to set it up in my specific case. Samples with measurements before and after is in my code. – Sirius Mar 07 '21 at 12:23
  • I don't follow your code very well and I don't see a data frame. If you can adapt your code to output a data frame, with each row as one measurement I will try to help. – Robert Long Mar 07 '21 at 12:31
  • I updated the code to combine both x and y data to a single data.frame – Sirius Mar 07 '21 at 14:43

1 Answers1

1

I am assuming that you have much more data in your actual dataset than these simulations, otherwise this approach is unlikely to work.

Using your code, I would slightly adapt the data as follows:

rbind.data.frame(
    data.frame(
        Sample = x.design,
        Value = x,
        Treatment = "before"
    ),
    data.frame(
        Sample = y.design,
        Value = y,
        Treatment = "after"
    )
) -> dt

dt$trt <- c(rep(-1,10), rep(1,11))

Here we are centering the measurement occasion variable.

The we can fit a model with random intercepts for sample and random slopes for measurement occasion

lmer(Value ~ trt + (trt | Sample), data = dt )

the fixed effect for trt estimates the treatment effect, while the correlation between random intercepts and random slopes handles the association between baseline and treatment effect.

See my answer below for details of why we specify the model this way, along with some references:
Analyzing impact of baseline value on change score using lmm

Robert Long
  • 53,316
  • 10
  • 84
  • 148
  • Thank you for this. If I measure a new "sample 5" before treatment - how could I have used this model to estimate what $Value$ that sample would have had after treatment? – Sirius Mar 07 '21 at 23:57
  • Just use the `predict` function – Robert Long Mar 13 '21 at 17:19
  • Thank you for the answer. I still don't get it. I'm likely slow , or more likely communicating poorly. Sample 5 comes in and measures $12.345$ after treatment. How can I find what value this sample would have had before treatment, using this predict with this model? – Sirius Mar 13 '21 at 18:16