Logistic Regression with dependent observations

Question

I have a dataset that contains 100 different patients over 5 year’s period. Every patient is examined each month with regard to particular illness and marked as healthy or ill (0 or 1). Every person appears 60 times in my sample (5 * 12 = 60).

Every month patient provides A = Average blood pressure in that month, B = Average daily exercise hours and C = Average number of Cigarettes smoked in that month.

The layout of the dataset is as follows:

ID (Unique Patient Identifier)
Month (1 to 60)
A (Average blood pressure in that month)
B (Average daily exercise hours)
C (Average number of Cigarettes smoked in that month)
Ill (Yes, No)

I was thinking of using Logistic Regression which uses information from last three months and gives a probability for patient to be flagged as Ill in next 2 months.

My problem is that logistic regression assumes that observations are independent whereas in my case they are obviously not.

What should I do? Should I use something like GEE or GLMM or something else?

Related: http://stats.stackexchange.com/questions/16390/when-to-use-generalized-estimating-equations-vs-mixed-effects-models — boscovich, Dec 04 '12 at 19:35
Thanks... Any practical advices i.e. what would you do if you were to analyse above data? — user13467, Dec 05 '12 at 17:56

score 1 · Answer 1 · edited Jul 17 '15 at 16:20

1

You could always come up with a set of transformed variables that aggregate the data from 3 months into one observation for each patient (e.g., average blood pressure across the prior 3 months, 3-month exercise hours/cigarette, etc.). Then you have independent observations (1 per patient), and you could build the model. This kind of defeats the purpose of having monthly granularity, but you could evaluate a logistic regression like this against a legitimate longitudinal method.

edited Jul 17 '15 at 16:20

gung - Reinstate Monica

132,789
81
357
650

answered Jul 17 '15 at 15:43

Chris

11
1

Welcome to the site, @Chris. It is fine to answer a question years after it was asked. The actual purpose here is to build up a repository of information for posterity, not quite to help the OP themselves (that's just a byproduct), as counter-intuitive as that may seem. – gung - Reinstate Monica Jul 17 '15 at 16:22

Logistic Regression with dependent observations

1 Answers1