I have a question about calculating prevalence using predicted probabilities from a survey-weighted generalized linear model.

Say my goal is to calculate the prevalence of a binary outcome using the predicted probabilities of that outcome over some characteristic (like age or sex). Suppose I fit a generalized linear model to survey-weighted data, and then use that model to predict the probability of the outcome for each individual in the same data set I used to fit the model.

If I want to calculate the prevalence of the outcome in the population that the weighted data represent, do I need to calculate the weighted average of the predicted probabilities over the other characteristic (age, sex), or is the simple average of the probabilities sufficient, since weighting was already taken into account in the model?

Molls

1 Answer

The technical documentation for your software would be the authoritative source here, but here is a rough way to check:

Compute the mean of the binary outcome with and without weighting. In most cases, the two should differ slightly.

Then, fit an intercept-only regression (i.e., one with no predictors) using the same binary variable as the outcome:

$$\operatorname{logit} P(Y=1) = \beta_0$$

with and without weighting. For each model, compute the predicted probabilities and check if their means agree with the results you got in the first step.
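
As a rough illustration of this check, here is a minimal sketch in plain Python. The data and weights are made up, and `fit_intercept_only` is a hypothetical helper that maximizes a weighted log-likelihood by Newton-Raphson; it only reproduces point estimates and is not a substitute for proper survey software, which also handles design-based standard errors.

```python
import math

# Hypothetical toy data (made up for illustration): binary outcome and survey weights.
y = [1, 0, 0, 0, 0, 0, 1, 1, 1, 0]
w = [1, 2, 1, 3, 1, 2, 5, 4, 6, 5]
ones = [1.0] * len(y)  # equal weights, i.e. the unweighted fit


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_intercept_only(y, w, n_iter=50):
    """Newton-Raphson for logit P(Y=1) = beta_0 under a weighted log-likelihood."""
    beta0 = 0.0
    for _ in range(n_iter):
        p = sigmoid(beta0)
        score = sum(wi * (yi - p) for yi, wi in zip(y, w))  # weighted gradient
        info = sum(wi * p * (1.0 - p) for wi in w)          # weighted information
        beta0 += score / info
    return beta0


# Step 1: raw means of the outcome.
mean_unweighted = sum(y) / len(y)
mean_weighted = sum(wi * yi for yi, wi in zip(y, w)) / sum(w)

# Step 2: predicted probability from each intercept-only model.
# Every individual gets the same prediction, so its mean is just that value.
p_unweighted = sigmoid(fit_intercept_only(y, ones))
p_weighted = sigmoid(fit_intercept_only(y, w))

print("unweighted mean:", mean_unweighted, "model:", p_unweighted)
print("weighted mean:  ", mean_weighted, "model:", p_weighted)
```

In this sketch each model's prediction agrees with the matching raw mean. Because an intercept-only model assigns the same probability to everyone, it cannot show whether the predictions themselves need to be weighted when averaged.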

Penguin_Knight
  • Thanks, but I don't think this will get at the question completely. I definitely need to do a weighted regression. My question is: after the weighted regression, once predicted probabilities are assigned to the data set, do I need to take the weighted mean of these probabilities to get the population probability, or just the unweighted mean? The problem with an intercept-only model is that, since everyone is assigned the same probability, the weighted and unweighted means will be the same. – Molls Dec 19 '19 at 13:44
  • UPDATE: I did it with a single binary predictor instead of an intercept-only model. It appears that to get the population-level prevalence, the individual predicted probabilities DO need to be weighted. Thanks! – Molls Dec 19 '19 at 14:07
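
The single-binary-predictor check described in the last comment can be sketched as follows. This is an illustration under assumptions, not the commenter's actual code: the data, the weights, and the helper `fit_weighted_logit` are all hypothetical, and the fit is a plain Newton-Raphson maximization of the weighted log-likelihood rather than a call to any survey package.

```python
import math

# Hypothetical toy data (made up): outcome y, binary predictor x, survey weights w.
y = [1, 0, 0, 0, 0, 0, 1, 1, 1, 0]
x = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
w = [1, 2, 1, 3, 1, 2, 5, 4, 6, 5]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logit(y, x, w, n_iter=50):
    """Newton-Raphson for logit P(Y=1) = b0 + b1*x under a weighted log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Weighted score (gradient) for each coefficient.
        g0 = sum(wi * (yi - pi) for yi, pi, wi in zip(y, p, w))
        g1 = sum(wi * xi * (yi - pi) for yi, xi, pi, wi in zip(y, x, p, w))
        # Weighted information matrix (2 x 2, symmetric).
        v = [wi * pi * (1.0 - pi) for pi, wi in zip(p, w)]
        h00 = sum(v)
        h01 = sum(vi * xi for vi, xi in zip(v, x))
        h11 = sum(vi * xi * xi for vi, xi in zip(v, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system H * delta = g by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_weighted_logit(y, x, w)
pred = [sigmoid(b0 + b1 * xi) for xi in x]

weighted_prevalence = sum(wi * yi for yi, wi in zip(y, w)) / sum(w)
weighted_mean_pred = sum(wi * pi for pi, wi in zip(pred, w)) / sum(w)
unweighted_mean_pred = sum(pred) / len(pred)

print("weighted prevalence of y:       ", weighted_prevalence)
print("weighted mean of predictions:   ", weighted_mean_pred)
print("unweighted mean of predictions: ", unweighted_mean_pred)
```

In this toy example the weighted mean of the predictions matches the weighted prevalence, while the unweighted mean does not. That is what one would expect for a logit link with an intercept, because the weighted score equation for the intercept forces $\sum_i w_i (y_i - \hat{p}_i) = 0$ at the fitted values.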