I have BMI divided into quartiles. I fitted two Cox regression models that include BMI as a categorical and as a continuous (linear) variable, respectively. Now I want to plot the hazard ratios from the categorical and the linear BMI models in the same plot, to see whether the linear BMI curve runs within the 95% confidence interval of the categorical BMI estimates and so decide which form of BMI to use.

I'm not sure how to plot the categorical BMI. Should I plot it at the mean or median of each BMI quartile, or as a step function?

kjetil b halvorsen
Zhoufeng

2 Answers

If you have a continuous plot of hazard ratio (HR) versus BMI, then the closest correspondence for a categorized BMI would be a stepwise function. For each category, show a horizontal line (with CI) running over the range of continuous BMI that the category covers.
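
A minimal sketch of such a step plot, using simulated data rather than the asker's. The data-generating values, the names `dat`, `steps`, and `line`, and the choice of the median BMI of the lowest quartile as the reference point for the linear model are all assumptions made for illustration. The two models only share a y-axis once they share a reference, so that choice matters.

```r
library(survival)
library(ggplot2)

set.seed(1)

# Simulated data (assumption): log-hazard rises linearly with BMI
n <- 500
BMI <- rnorm(n, 27, 4)
t_event <- rexp(n, rate = 0.1 * exp(0.05 * (BMI - 27)))
t_cens <- runif(n, 0, 30)
dat <- data.frame(time = pmin(t_event, t_cens),
                  event = as.integer(t_event <= t_cens),
                  BMI = BMI)

# Quartile groups; the lowest quartile is the reference level
cuts <- unname(quantile(dat$BMI, probs = seq(0, 1, 0.25)))
dat$bmi_q <- cut(dat$BMI, breaks = cuts, include.lowest = TRUE)

fit_cat <- coxph(Surv(time, event) ~ bmi_q, data = dat)
ci_cat <- unname(confint(fit_cat))  # 95% CI on the log-hazard scale

# One horizontal segment per quartile, spanning the BMI range it covers.
# The reference quartile has HR = 1 by construction, so its CI is degenerate.
steps <- data.frame(
  xstart = cuts[1:4],
  xend   = cuts[2:5],
  hr     = c(1, exp(unname(coef(fit_cat)))),
  lower  = c(1, exp(ci_cat[, 1])),
  upper  = c(1, exp(ci_cat[, 2]))
)

# Linear model, expressed relative to an assumed reference BMI
fit_lin <- coxph(Surv(time, event) ~ BMI, data = dat)
ref_bmi <- median(dat$BMI[dat$bmi_q == levels(dat$bmi_q)[1]])
line <- data.frame(BMI = seq(cuts[1], cuts[5], length.out = 100))
line$hr <- exp(unname(coef(fit_lin)) * (line$BMI - ref_bmi))

p <- ggplot() +
  geom_rect(data = steps,
            aes(xmin = xstart, xmax = xend, ymin = lower, ymax = upper),
            alpha = 0.3) +
  geom_segment(data = steps,
               aes(x = xstart, xend = xend, y = hr, yend = hr)) +
  geom_line(data = line, aes(BMI, hr), linetype = "dashed") +
  labs(x = "BMI", y = "Hazard ratio (vs. lowest quartile)")
print(p)
```

The shaded boxes are the category-wise confidence intervals and the dashed curve is the linear fit. Picking a different reference BMI shifts the dashed curve up or down, so the visual comparison depends on that choice.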

I worry, however, when you say:

"Now I want to plot the hazard ratios from the categorical and the linear BMI models in the same plot, to see whether the linear BMI curve runs within the 95% confidence interval of the categorical BMI estimates and so decide which form of BMI to use."

If you have enough data to subcategorize BMI values into quartiles, you also have enough data to fit a flexible continuous BMI model, for example with restricted cubic splines. There's a good chance that such a flexible continuous model will be superior to both the linear and the categorized BMI models.
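
As a rough sketch of that comparison, the code below fits linear, quartile, and spline versions of BMI to simulated data and compares them. The U-shaped data-generating model and the object names are assumptions for illustration. It uses `splines::ns()`, a natural cubic spline basis (linear beyond the boundary knots, the same idea as a restricted cubic spline); `rms::rcs()` is a common alternative that is not shown here.

```r
library(survival)
library(splines)

set.seed(2)

# Simulated data (assumption): U-shaped log-hazard in BMI
n <- 1000
BMI <- rnorm(n, 27, 4)
t_event <- rexp(n, rate = 0.05 * exp(0.02 * (BMI - 27)^2))
t_cens <- runif(n, 0, 40)
dat <- data.frame(time = pmin(t_event, t_cens),
                  event = as.integer(t_event <= t_cens),
                  BMI = BMI)
dat$bmi_q <- cut(dat$BMI,
                 breaks = quantile(dat$BMI, probs = seq(0, 1, 0.25)),
                 include.lowest = TRUE)

fit_lin <- coxph(Surv(time, event) ~ BMI, data = dat)             # linear
fit_cat <- coxph(Surv(time, event) ~ bmi_q, data = dat)           # quartiles
fit_spl <- coxph(Surv(time, event) ~ ns(BMI, df = 3), data = dat) # spline

# Information criterion across all three forms (lower is better)
aic_tab <- AIC(fit_lin, fit_cat, fit_spl)
print(aic_tab)

# The linear model is nested in the spline model, so a likelihood-ratio
# test of the nonlinear terms is available
print(anova(fit_lin, fit_spl))

# Predicted hazard ratio from the spline model over a BMI grid
# (type = "risk" is relative to the sample mean of the predictors)
grid <- data.frame(BMI = seq(18, 38, by = 0.5))
grid$hr_spl <- predict(fit_spl, newdata = grid, type = "risk")
```

With real data the number of spline degrees of freedom is a judgment call; 3 to 5 is a common starting range, chosen before looking at the outcome.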

Also, the quote sounds like you might be using the categorized BMI model as the default against which to compare the linear continuous model. It's not a good idea in general to categorize continuous predictors. Say that one of your cutoffs is at a BMI value of 28. Do you really think that there is a big change in HR between BMI values of 27.999 and 28.001? That's what you imply when you categorize.

So I urge you to expand your approach to start with flexible continuous modeling of BMI* as your reference model, which you might then choose to simplify a bit for practical considerations.


*There's also a strong argument that modeling with BMI can be inferior to modeling with both height and weight (and perhaps an interaction between them) as predictors instead. BMI, being weight divided by height squared, fixes in advance how weight and height combine in their association with outcome. It's quite possible that the relative associations of height and weight with outcome differ depending on the type of outcome under study. That's beyond the scope of this question, however.
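
To make the footnote's constraint concrete, here is a short derivation under the assumption that BMI enters the model on the log scale (the symbols $W$ for weight, $H$ for height, and the coefficients are introduced here for illustration):

```latex
\begin{align*}
\mathrm{BMI} &= \frac{W}{H^{2}}
  \quad\Longrightarrow\quad
  \log(\mathrm{BMI}) = \log(W) - 2\log(H), \\
\beta \log(\mathrm{BMI}) &= \beta \log(W) - 2\beta \log(H), \\
\text{general model: } & \beta_W \log(W) + \beta_H \log(H),
  \qquad \text{BMI model: } \beta_H = -2\,\beta_W .
\end{align*}
```

A model with separate terms for $\log(W)$ and $\log(H)$ lets the data estimate $\beta_H$ freely, and the BMI form is the special case where it is forced to equal $-2\beta_W$.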

EdM

I agree with EdM's points. If you want to visualize the effect of BMI groups, you could create a graph using the following approach.

library(dplyr)
library(survival)
library(ggplot2)

set.seed(42)

# Simulated data, for illustration only
event <- rbinom(100, 1, 0.5)
t1 <- rpois(100, 3)
BMI <- rnorm(100, 25, 5) + t1

df <- data.frame(event, t1, BMI)

# Categorized BMI (not used in the plot below, which shades fixed 5-unit bands)
df <- df %>% mutate(bmi.group = cut(BMI, breaks = c(0, 25, 35, Inf), labels = 0:2))

# Cox model with BMI as a linear continuous predictor
bmi.model <- coxph(Surv(t1, event) ~ BMI, data = df)

# Predicted hazard ratio (relative to the sample mean BMI) over a grid of BMI values
test.dat <- data.frame(BMI = seq(15, 45, 1))
test.dat$pred <- predict(bmi.model, newdata = test.dat, type = "risk")

# Background bands marking 5-unit BMI intervals
rects <- data.frame(xstart = seq(15, 40, 5), xend = seq(20, 45, 5), cols = as.factor(1:6))

ggplot() +
  geom_rect(data = rects, aes(xmin = xstart, xmax = xend, ymin = -Inf, ymax = Inf, fill = cols)) +
  # The predictions are stored in test.dat, not df
  geom_line(data = test.dat, aes(BMI, pred), linewidth = 1) +
  scale_x_continuous(breaks = seq(15, 45, 5)) +
  ylab("Hazard ratio") +
  guides(fill = guide_legend(title = "BMI groups"))
Todd D