0

I am wondering whether predicting the time until a patient is readmitted to the hospital is a regression problem. I have the necessary variables, but I am unsure whether a model that predicts the number of days until a patient returns to the hospital counts as a regression model.

EdM
StoryMay

2 Answers

4

This can be approached as a regression model, but it should be considered as a time-to-event ("survival") regression model, not an ordinary least squares regression. In your situation, readmission is considered the "event," and you are modeling the time from some starting point (e.g., time of prior hospital discharge) until the readmission "event" occurs. There are a couple of reasons why ordinary least squares regression doesn't work well on such data.

First, this type of situation typically involves individuals who have not had an "event" (readmission in your case) before data collection ended. For example, you might have a patient for whom your last information is that there hasn't been a readmission through 20 days. That patient might, however, end up being readmitted at 26 days, or never being readmitted at all. You just don't have data beyond 20 days for that patient. If you only have data through 20 days without a readmission, all you can say is that the time to readmission for that patient is at least 20 days, a lower limit. Times-to-event with only a lower limit known are called "right censored" event times.

If you were to do ordinary linear regression, how would you handle a case like that? If you count the time as 20 days, you know that you are probably underestimating the actual time to readmission; the individual might never even be readmitted! If you omit that case, you might be biasing your results in other ways. Survival analysis has built-in ways to handle censored time-to-event data without bias.
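As a rough illustration of how censoring can be handled, here is a minimal Python sketch of the Kaplan-Meier estimator, one standard survival-analysis method. It is an illustrative choice on my part, not something the links below prescribe. The six patients and the function name `kaplan_meier` are made up for the example. A censored patient stays in the risk set up to the last known contact time and then drops out without being counted as a readmission.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  -- follow-up time in days for each patient
    events -- 1 if readmission was observed at that time, 0 if right censored
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    # Walk through the distinct follow-up times in increasing order.
    for t in sorted(set(times)):
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        c = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 0)
        if d > 0:
            # Fraction of those still at risk who are not readmitted at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Readmitted and censored patients leave the risk set after t.
        at_risk -= d + c
    return curve


# Hypothetical data: days since discharge for six patients.
times = [5, 12, 20, 20, 26, 30]
events = [1, 1, 0, 1, 1, 0]  # 0 = no readmission yet at last contact

km = kaplan_meier(times, events)

# For comparison: the naive mean that wrongly treats censored times
# as if they were actual readmission times.
naive_mean = sum(times) / len(times)

for t, s in km:
    print(f"day {t}: estimated probability of no readmission yet = {s:.3f}")
```

The naive mean is computed only to show what treating "20 days, not yet readmitted" as "readmitted at 20 days" would look like. The Kaplan-Meier curve uses the same patient only as evidence that the time to readmission exceeds 20 days.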

Second, the errors around model predictions for survival models often aren't normally distributed, as is assumed when you do inference with ordinary least squares regression. In fact, there are several different forms of survival models based on different assumptions about the error distributions.

There are many links on this site and on the web to survival models. The Wikipedia page is one place to start. This handout summarizes the general principles nicely, with some limited mathematical formalism. This handout explains the assumptions about error terms implicit in different types of survival models, although with a fair amount of math. The R survival package is one good place to start with data analysis itself.

EdM
  • What do you mean by the lower limit? And can you tell me what the problems are with using an ordinal regression model here? I am trying to learn this and hope you can explain more plainly. – StoryMay Jan 28 '21 at 15:16
  • @StoryMay I elaborated somewhat in the revised answer. This is a large topic; start with the links provided and with a web search on survival analysis. – EdM Jan 28 '21 at 18:01
0

I definitely recommend a survival model. Intuitively, linear regression builds on a constant slope, whereas survival models build on a constant proportion surviving. If you work in 1-month blocks and the probability of return in any one month is p, then the probability of not having returned after 3 months is (1-p)^3. I would suggest a discrete-time survival model that predicts return in each month (or week, etc.). You can use logistic regression to predict return in each month, given no return before that month, and then multiply together the survival predictions for each month to get the survival curve.

Assuming your original data has 1 row per person, you convert it to 1 row per person-period (e.g., person x, 2 months after); see the UCLA link below for R code and an example.
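A minimal Python sketch of that person-level to person-period conversion, for readers who do not use R. The field names `id`, `months`, and `readmitted` and the function `to_person_period` are hypothetical. It assumes follow-up is recorded in whole months and that a readmission, if any, happens in the last recorded month.

```python
def to_person_period(rows):
    """Expand one row per person into one row per person-month.

    rows -- list of dicts with hypothetical keys:
            'id'         -- patient identifier
            'months'     -- months of follow-up (>= 1)
            'readmitted' -- 1 if readmitted in the final month, 0 if censored
    """
    out = []
    for r in rows:
        for m in range(1, r["months"] + 1):
            out.append({
                "id": r["id"],
                "month": m,
                # The event indicator is 1 only in the month of readmission.
                "event": 1 if (m == r["months"] and r["readmitted"] == 1) else 0,
            })
    return out


# Hypothetical input: patient "a" is readmitted in month 3,
# patient "b" is censored after 2 months without readmission.
people = [
    {"id": "a", "months": 3, "readmitted": 1},
    {"id": "b", "months": 2, "readmitted": 0},
]

long_rows = to_person_period(people)
for row in long_rows:
    print(row)
```

Each output row is one observation for the logistic regression, with `event` as the response and `month` (plus any other predictors you carry along) as covariates.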

You can add any additional predictors you want, e.g. p(survive_month n | survive up to month n-1) ~ elapsed_months + age + gender + ....

So, to get the survival curve $p(\text{survive up to month } n)$, use the formula $p(\text{survive up to month } n) = p(\text{survive month } 1 \mid \text{survive up to month } 0) \times \cdots \times p(\text{survive month } n \mid \text{survive up to month } n-1)$ and plug in the predictions for each separate month from your logistic regression.
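The product formula can be sketched in a few lines of Python. The monthly probabilities below are invented for illustration; in practice they would be the logistic regression's predicted probabilities of return in each month, given no return before that month. The conditional probability of surviving a month is 1 minus that predicted return probability. The function name `survival_curve` is hypothetical.

```python
def survival_curve(monthly_return_probs):
    """Cumulative probability of no return through each month.

    monthly_return_probs -- predicted probability of return in month k,
                            given no return before month k, for k = 1..n
    """
    curve = []
    surv = 1.0
    for p in monthly_return_probs:
        # p(survive month k | survived up to month k-1) = 1 - p
        surv *= 1.0 - p
        curve.append(surv)
    return curve


# Hypothetical predictions: a constant 10% chance of return each month.
probs = [0.10, 0.10, 0.10]
curve = survival_curve(probs)

for month, s in enumerate(curve, start=1):
    print(f"month {month}: p(no return yet) = {s:.3f}")
```

With a constant monthly probability p, the last value reduces to (1-p)^3, matching the constant-proportion intuition at the start of this answer.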

https://stats.idre.ucla.edu/r/faq/how-can-i-convert-from-person-level-to-person-period/

See, e.g., https://stats.stackexchange.com/a/57196/

seanv507
  • I understand logistic regression, but how should I use it to predict each month and then multiply together the survival predictions for each month? Could you please add details to your answer to make it more understandable? – StoryMay Jan 29 '21 at 01:18
  • One-month periods are too coarse for this problem. Use time to readmission in days and a model like the Cox proportional hazards model. Patients not readmitted at the last known contact time are right censored at that point. – Frank Harrell Jan 29 '21 at 12:22