I have a dataset in which I'm trying to predict when an individual will develop a particular disease from a set of biomarkers. I can find a model that fits fairly well, but its residuals are strongly heteroskedastic. This heteroskedasticity is expected, though: it makes sense that the residuals get smaller as the individual nears diagnosis. I started thinking about various "fixes," but I'm not sure whether I should fix it at all. Any thoughts on this?
-
There are a number of ways to address heteroscedasticity. I demonstrate many of them in my answer here: [Alternatives to one-way ANOVA for heteroscedastic data](http://stats.stackexchange.com/a/91881/7290). – gung - Reinstate Monica May 21 '14 at 16:24
-
Is your dataset cross-sectional? – Sergio May 21 '14 at 16:24
-
@gung--these are great suggestions, but I'm wondering whether they need to be fixed at all. – dfife May 21 '14 at 16:28
-
@Sergio--these data are actually longitudinal. – dfife May 21 '14 at 16:29
-
@gung (again)...I'm rethinking the weighted least squares. It seems to make intuitive sense: we weight more heavily those observations closer to diagnosis than those further away. And if it fixes heteroskedasticity, that's a bonus. – dfife May 21 '14 at 16:46
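For anyone who wants to see what the weighted least squares idea in the comment above amounts to, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data are simulated, there is a single hypothetical biomarker `x`, the outcome `y` is years until diagnosis, and the residual SD is assumed to be proportional to the expected time to diagnosis. The weights are built from first-pass fitted values rather than from the observed outcome, which is a choice made for this sketch and not something stated in the thread.

```python
import random


def weighted_least_squares(x, y, w):
    """Fit y = a + b*x by minimizing sum(w_i * (y_i - a - b*x_i)**2)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def two_step_wls(x, y):
    """Two-step fit: OLS first, then WLS with weights 1 / fitted**2."""
    # Step 1: ordinary least squares is WLS with equal weights.
    a0, b0 = weighted_least_squares(x, y, [1.0] * len(x))
    # Step 2 (assumption): residual SD is proportional to the fitted value,
    # so the variance is proportional to fitted**2 and the weight is its inverse.
    fitted = [max(a0 + b0 * xi, 1e-6) for xi in x]  # guard against fitted <= 0
    weights = [1.0 / f ** 2 for f in fitted]
    return weighted_least_squares(x, y, weights)


def simulate(n=500, seed=1):
    """Simulated data: noise SD grows with expected years until diagnosis."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 10.0) for _ in range(n)]  # hypothetical biomarker
    y = [(0.5 + 1.0 * xi) + rng.gauss(0.0, 0.2 * (0.5 + 1.0 * xi)) for xi in x]
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    print("OLS (intercept, slope):", weighted_least_squares(x, y, [1.0] * len(x)))
    print("WLS (intercept, slope):", two_step_wls(x, y))
```

Under this setup both fits aim at the same coefficients; the weighting mainly changes how much each observation contributes, so observations with less noise (closer to diagnosis) count for more.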
-
If your data are longitudinal (multiple measures on multiple patients) we're playing a different ballgame. You need to differentiate between changing residual variance for each individual's data vs diverging individual trends over time (which is common). You can fit a mixed effects model w/ random intercepts, slopes & a correlation b/t them, get predicted trends for each patient & their individual residuals. The key question is: does that variance change over time? – gung - Reinstate Monica May 21 '14 at 17:21
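A rough way to look at the question in the comment above (does the within-patient residual variance change over time?) is sketched below. Note that this is not a mixed effects model: it fits a separate straight line to each patient, which is a crude stand-in with no shrinkage and no modeled correlation between intercepts and slopes. The data are simulated, and all names (`simulate_patients`, `variance_ratio`, the cutoff of 5 years before diagnosis) are hypothetical.

```python
import random


def fit_line(t, y):
    """Ordinary least squares line y = a + b*t for one patient."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sxx
    return y_bar - slope * t_bar, slope


def residuals_by_time(patients):
    """patients: dict of id -> list of (time, value). Returns (time, residual) pairs."""
    out = []
    for visits in patients.values():
        t = [v[0] for v in visits]
        y = [v[1] for v in visits]
        a, b = fit_line(t, y)  # each patient gets their own trend
        out.extend((ti, yi - (a + b * ti)) for ti, yi in zip(t, y))
    return out


def variance_ratio(time_resid, cutoff):
    """Mean squared residual before the cutoff divided by that at or after it."""
    early = [r ** 2 for t, r in time_resid if t < cutoff]
    late = [r ** 2 for t, r in time_resid if t >= cutoff]
    return (sum(early) / len(early)) / (sum(late) / len(late))


def simulate_patients(n_patients=200, seed=0):
    """Simulated data: individual trends differ, and noise shrinks toward diagnosis."""
    rng = random.Random(seed)
    patients = {}
    for pid in range(n_patients):
        intercept = rng.gauss(50.0, 5.0)  # patient-specific level
        slope = rng.gauss(2.0, 0.5)       # patient-specific trend
        visits = []
        for t in range(-10, 0):           # years relative to diagnosis (t = 0)
            noise_sd = 0.3 * abs(t)       # assumption: smaller noise near diagnosis
            visits.append((t, intercept + slope * t + rng.gauss(0.0, noise_sd)))
        patients[pid] = visits
    return patients


if __name__ == "__main__":
    pairs = residuals_by_time(simulate_patients())
    print("early/late residual variance ratio:", variance_ratio(pairs, cutoff=-5))
```

A ratio well above 1 would suggest that the residual variance itself shrinks toward diagnosis, as opposed to the spread coming only from patients' trends diverging. A proper mixed model would be the better tool for real data.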
-
Ahh...good point @gung. I hadn't thought of that. I'll take a look at it. Thanks! – dfife May 21 '14 at 19:33
-
Okay...finally getting to the point of looking at it. Although the data are longitudinal, I'm actually only looking at it cross-sectionally (for a good reason). I think I like the idea of weighted least squares--it makes sense to give more weight to observations closer to diagnosis. – dfife May 23 '14 at 13:57
-
Look into gamlss – kjetil b halvorsen May 17 '21 at 02:40