
I have a hospital-based dataset which contains patient information: everything from their visits, drugs, diagnoses, and lab tests to death information.

Now I would like to compute each patient's follow-up time, from the date of the first visit to the date of the last visit (when they visited the hospital for the last time).

How can I do this? I couldn't find any tutorials online. I did find one resource, but I am not sure how this can be implemented in Python.

There should be some ready-made packages or tools which could do this, but I am unable to locate them.

I am trying to calculate something as shown in table 4 in this paper

Can anyone guide me with this?

The Great
  • The question seems to say you want to compute the median time elapsed between the first and last visits. In that case, just take the sample median. But then, why mention other measurements like lab tests, etc--is there some connection? Also, table 4 in the linked paper compares various measurements between the first and last visits. I don't see any mention there of median time elapsed between visits. Could you clarify what you're asking? – user20160 Mar 04 '21 at 14:23
  • In the table, we have `median follow-up` at the last line – The Great Mar 04 '21 at 14:25
  • How can I compute the median time elapsed (follow-up)? You mean get the first and last day of visit for a patient, compute their difference (in years/months etc.), repeat the same for the other patients, and finally find the median of those `difference` values? – The Great Mar 04 '21 at 14:27
  • Yes, given your description of follow-up time, that seems to be exactly what one might do. Of course, this ignores censoring; if the data is censored, you'd use some kind of survival analysis instead. – user20160 Mar 04 '21 at 14:40
  • It doesn't seem to make sense to talk of "follow-up" if we are only considering the first and the last visit. If someone visits five times, then the follow-up should be the second visit (as a follow-up to the first one). And possibly later ones as well if the patient presented with a new complaint at, say, the third visit. – Stephan Kolassa Mar 04 '21 at 14:58
  • @StephanKolassa - Yes, agreed. That seems to make more sense. So, to compute based on the exact follow-up definition that you suggested, we take the number of days between the 1st and 2nd visit, the 2nd and 3rd visit, and so on. Finally we take the mean of those gaps to find the average follow-up time this patient had between visits? – The Great Mar 05 '21 at 02:19
  • That sounds more reasonable to me (though you would need to be clear on whether you want the *mean* or the *median*). Then again, what is the definition of "follow-up" if a patient initially presents with a condition and then needs to come in *multiple* times for treatment? Would "follow-up" only be the second visit? Or the last for that specific treatment? Or, as you write, each interval separately? Depending on your situation, you may or may not be able to discuss these with whoever provided the dataset, or who is interested in your results. Subject matter knowledge is always good. – Stephan Kolassa Mar 05 '21 at 07:16
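The interval-based definition discussed in the comments above (days between consecutive visits for one patient, then a mean or median of those gaps) could be sketched as follows. This is only an illustration using the standard library; the visit dates and the helper name `visit_gaps` are made up and do not come from the question's dataset.

```python
from datetime import date
from statistics import mean, median

# Hypothetical visit dates for a single patient (illustrative only).
visit_dates = [date(2020, 3, 15), date(2020, 1, 1), date(2020, 1, 15)]

def visit_gaps(dates):
    """Return the number of days between each pair of consecutive visits."""
    ordered = sorted(dates)  # visits may not be stored in chronological order
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]

gaps = visit_gaps(visit_dates)
mean_gap = mean(gaps)      # average interval between visits
median_gap = median(gaps)  # less sensitive to one unusually long interval
```

Whether the mean or the median of these gaps is the right summary depends on the definition of "follow-up" agreed with whoever provided the data.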

1 Answer


Since the paper doesn't explain how it was calculated, I would assume they used the time until the last known follow-up for each participant, then calculated the usual median and the first and third quartiles. In other words, they ignored the reason why that was the person's last follow-up time. So, if a person died on day 5, it's counted as 5; if they were lost to follow-up on day 30, that person is counted as 30; if they were still being followed at the time the data was analyzed, which was day 1000, then it is 1000; etc.

They end up with 12,242 numbers. Sort them. The median is the average of the 6,121st and 6,122nd smallest numbers. The first and third quartiles are roughly equal to the 3,060th and 9,181st numbers sorted from smallest to largest. There are different conventions for exactly how to define the first and third quartiles.

There are several other methods that incorporate the reasons for loss to follow-up, for example "reverse Kaplan-Meier", and the pros and cons of some of them are described here.
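The simple approach described at the start of this answer (time from first to last recorded visit per patient, ignoring why follow-up ended, then the ordinary median and quartiles) could be sketched in plain Python roughly as below. The records, patient IDs, and the helper name `follow_up_days` are hypothetical; a real dataset would presumably have one row per visit with a patient identifier and a visit date. The quartile convention used here (`method="inclusive"`) is just one of the several conventions mentioned above.

```python
from datetime import date
from statistics import median, quantiles

# Hypothetical visit records: (patient_id, visit_date). Illustrative only.
visits = [
    ("p1", date(2015, 1, 10)),
    ("p1", date(2015, 6, 1)),
    ("p1", date(2018, 1, 10)),
    ("p2", date(2016, 3, 5)),
    ("p2", date(2016, 3, 25)),
    ("p3", date(2014, 7, 1)),
    ("p3", date(2019, 7, 1)),
    ("p4", date(2017, 2, 1)),  # a single visit gives a follow-up of 0 days
]

def follow_up_days(records):
    """Days between the first and last recorded visit for each patient."""
    first, last = {}, {}
    for pid, visit_date in records:
        first[pid] = min(visit_date, first.get(pid, visit_date))
        last[pid] = max(visit_date, last.get(pid, visit_date))
    return {pid: (last[pid] - first[pid]).days for pid in first}

durations = follow_up_days(visits)
values = sorted(durations.values())

med = median(values)
# quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1, _, q3 = quantiles(values, n=4, method="inclusive")

print(f"median follow-up: {med} days (IQR {q1} to {q3})")
```

With a pandas DataFrame, the same per-patient first and last dates can be obtained by grouping on the patient identifier and taking the minimum and maximum of the date column. Note that this sketch does not handle censoring; it only reproduces the naive calculation.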

John L