We are interested in determining whether there's an association between frequency of screening visits and cancer outcomes, and whether that association differs by race. We have Medicare data to analyze this. Typically, the knee-jerk choice for modeling survival times is a Cox proportional hazards model, but the problem is that we're directly interested in modeling the effect of time itself, since follow-up time determines the number of screening visits that a person would be eligible for.
The outcome is binary: whether or not a person had died by the end of a study interval. Study intervals technically differ from subject to subject. At most, we have 5 years' worth of screening data for eligible subjects, though some subjects are censored because they enroll in other health care plans and we can't determine what their further COAs are for screening, diagnosis, or treatment.
In my mind, these data are cross-sectional, not prospective, since we use visit data from the future to make inferences about proportions that are homogeneous in the past. Yet subjects differ in weight from one another due to the length of available follow-up.
I'm struggling to write a correct linear model for these data and propose a valid analysis plan for them. How does one write a linear model with an offset?
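To make the question concrete, here is a rough sketch of the kind of model I have in mind. It may well not be the right one. All symbols are my own assumptions: $Y_i$ is the death indicator for subject $i$, $p_i = P(Y_i = 1)$, $t_i$ is the length of available follow-up, $x_i$ is the number of screening visits per year of follow-up, and $r_i$ is a race indicator. Assuming a constant hazard $\lambda_i = \exp(\eta_i)$ over the interval, the probability of dying by $t_i$ is $1 - \exp(-\lambda_i t_i)$, which gives a binomial model with a complementary log-log link:

$$
\log\bigl(-\log(1 - p_i)\bigr) = \beta_0 + \beta_1 x_i + \beta_2 r_i + \beta_3 x_i r_i + \log t_i
$$

Here $\log t_i$ is the offset: it enters the linear predictor with its coefficient fixed at 1 instead of being estimated, and $\beta_3$ would capture whether the screening association differs by race. Whether the constant-hazard assumption is defensible for these data is part of what I'm unsure about.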