You're basically right about data organization. If you have cases organized like this:
ID M1 M2 M3 EVENT
You will likely want to reorganize the data so that it looks like this:
ID TIME EVENT
1 1 0
1 2 1
1 3 1
2 1 0
2 2 0
. . .
. . .
I call this a conversion from a wide format to a long format. It is done easily in R using the reshape() function, or even more easily with the reshape2 package.
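As a rough illustration of that conversion, here is a minimal sketch using base R's reshape(). The data frame `wide` and its values are made up, and I am assuming that M1 to M3 hold the 0/1 event status at three successive occasions; if your columns mean something else, the `varying` and `v.names` arguments would need adjusting.

```r
# Hypothetical wide data: one row per subject, one column per occasion.
# Assumption: M1..M3 are the 0/1 event indicators at times 1..3.
wide <- data.frame(ID = 1:2,
                   M1 = c(0, 0),
                   M2 = c(1, 0),
                   M3 = c(1, 0))

# Stack M1..M3 into a single EVENT column, with TIME recording the occasion.
long <- reshape(wide,
                direction = "long",
                varying   = c("M1", "M2", "M3"),
                v.names   = "EVENT",
                timevar   = "TIME",
                times     = 1:3,
                idvar     = "ID")

# reshape() stacks by occasion; sort so each subject's rows sit together.
long <- long[order(long$ID, long$TIME), ]
rownames(long) <- NULL
long
```

The result has one row per subject per occasion, with columns ID, TIME and EVENT, which is the layout the models below expect.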
I personally would keep the ID field for its potential use in identifying a source of variation in a mixed effects model. But this is not necessary (as pointed out by @BerndWeiss). The following assumes you would want to do so. If not, fit a similar model with glm(...,family=binomial) without the random effect terms.
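For the simpler route without random effects, a sketch along these lines should work. The data are simulated purely for illustration (the sample size and coefficients are invented), and `fit_glm` is just a name I picked.

```r
set.seed(1)

# Hypothetical long-format data: 50 subjects observed at 4 time points.
df <- expand.grid(ID = 1:50, TIME = 1:4)

# Simulated 0/1 outcome whose log-odds rise with TIME (made-up coefficients).
df$EVENT <- rbinom(nrow(df), size = 1, prob = plogis(-2 + 0.5 * df$TIME))

# Ordinary logistic regression: ID is ignored, so there are no random effects.
fit_glm <- glm(EVENT ~ TIME, data = df, family = binomial)
summary(fit_glm)$coefficients

# Fitted probability of EVENT (the discrete-time hazard) at each time point.
predict(fit_glm, newdata = data.frame(TIME = 1:4), type = "response")
```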
The lme4 package in R will fit a mixed effects logistic regression model similar to the one you're talking about, except with a random effect or two to account for variability in the coefficients across subjects (ID). The following is example code for fitting such a model if your data are stored in a data frame called df.
library(lme4)
# Random intercept and random TIME slope for each ID
ans <- glmer(EVENT ~ TIME + (1 + TIME | ID), data = df, family = binomial)
This particular model allows the TIME and intercept coefficients to vary randomly across ID. In other words, this is a hierarchical generalized linear mixed model (a logistic one, given family=binomial) of measurements nested in individuals.
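Written out, the model that formula specifies looks roughly like this. The notation is mine rather than anything lme4 prints: $i$ indexes subjects (ID), $j$ indexes measurement occasions, and the logit link is the default for family=binomial.

$$
\operatorname{logit}\,\Pr(\text{EVENT}_{ij} = 1) = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\,\text{TIME}_{ij},
\qquad
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim N(\mathbf{0}, \Sigma)
$$

Here $\beta_0$ and $\beta_1$ are the fixed intercept and TIME coefficients, while $b_{0i}$ and $b_{1i}$ are the subject-specific deviations from them, with covariance matrix $\Sigma$ estimated from the data.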
An alternative form of the discrete-time event history model breaks TIME into a set of dummy variables and estimates a separate parameter for each. This is essentially the discrete-time analogue of the Cox PH model, because the hazard curve is not restricted to being linear (or quadratic, or whatever transformation of time you can imagine). If there are many time points, however, you may wish to group TIME into a small, manageable set of discrete periods.
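A sketch of the dummy-variable version, again on simulated data with invented coefficients. I use glm() here to keep the example dependency-free, and the grouping into three periods is an arbitrary choice for illustration.

```r
set.seed(2)

# Hypothetical long-format data: 100 subjects, 6 time points.
df <- expand.grid(ID = 1:100, TIME = 1:6)
df$EVENT <- rbinom(nrow(df), size = 1, prob = plogis(-2 + 0.3 * df$TIME))

# factor(TIME) creates one dummy per time point. Dropping the intercept ("- 1")
# makes each coefficient the logit of the hazard in that period.
fit_dummy <- glm(EVENT ~ factor(TIME) - 1, data = df, family = binomial)
plogis(coef(fit_dummy))  # estimated hazard at each time point

# With many time points, collapse them into a few coarser periods first.
df$PERIOD <- cut(df$TIME, breaks = c(0, 2, 4, 6),
                 labels = c("1-2", "3-4", "5-6"))
fit_grouped <- glm(EVENT ~ PERIOD - 1, data = df, family = binomial)
plogis(coef(fit_grouped))  # estimated hazard in each grouped period
```

Because the first model has a free parameter for every time point, its fitted hazards are simply the observed proportions of events at each TIME. If you keep ID in the model, the same `factor(TIME)` term can be used in a glmer() formula alongside a random-effect term such as `(1 | ID)`.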
Further alternatives involve transforming time to get the shape of your hazard curve right. The dummy-variable approach relieves you of having to do this, but it is less parsimonious than a transformation (or the linear specification I posed originally), because you may have many time points and thus many nuisance parameters.
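To make the parsimony trade-off concrete, here is a sketch that fits a few specifications of time to the same simulated data and counts their parameters. The data, the coefficients and the particular transformations (quadratic, log) are illustrative assumptions, not recommendations.

```r
set.seed(3)

# Hypothetical long-format data: 100 subjects, 6 time points.
df <- expand.grid(ID = 1:100, TIME = 1:6)
df$EVENT <- rbinom(nrow(df), size = 1, prob = plogis(-2 + 0.3 * df$TIME))

fits <- list(
  linear    = glm(EVENT ~ TIME,          data = df, family = binomial),
  quadratic = glm(EVENT ~ poly(TIME, 2), data = df, family = binomial),
  log       = glm(EVENT ~ log(TIME),     data = df, family = binomial),
  dummies   = glm(EVENT ~ factor(TIME),  data = df, family = binomial)
)

# Number of estimated coefficients under each specification of time.
n_par <- sapply(fits, function(m) length(coef(m)))
n_par

# AIC penalizes the extra parameters of the dummy specification.
sapply(fits, AIC)
```

With six time points the dummy specification already needs six coefficients, against two or three for the transformed versions, and the gap widens as the number of time points grows.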
An excellent reference on this topic is Judith Singer and John Willett's Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence.