Power analysis for binomial regression (success/failure)

Question

I've collected data from 161 people for a study whose original power analysis was based on correlations, but I've now realised that binomial regression would be better. I can find a lot of articles on comparing binomial proportions (45/60 vs 48/60) and on linear regressions, but nothing about a regression predicting a proportion (specifically one with success/fail counts, 48/60, rather than proportion success, 80%).

I have a variable, Unex (ordinal, 0-30), predicting performance in each block of 60 trials. My research question is: does Unex predict performance?

`res <- glm(cbind(B1_success, (60 - B1_success)) ~  Unex, data = Df, family = "binomial")`

I'm not sure what the effect size would be to use for this either, as glm does not give overall effect sizes except for beta coefficients(?). Overall, I wanted to work out power for 77%, 78%, 82%, 83% etc. (H1) compared to 80% (H0), or small, medium, strong effect size? And post-hoc achieved power.

Any help would be appreciated! Sorry if this is unclear, my first post.

I've seen "Power analysis for binomial data when the null hypothesis is that $p = 0$", but I'm not sure it's relevant to me.

Edit: Thanks for the help. I've added an R function that others might find useful:

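powerBinom <- function(beta, N, outcomeVariance, predictorSD) {
    # outcomeVariance is p * (1 - p), so it cannot exceed 0.25
    if (outcomeVariance > 0.25) {
        stop("Maximum 0.25 for binomial outcome")
    }
    # Normal-approximation power at a two-sided 0.05 significance level
    power <- 1 - pnorm(1.96 - abs(beta) * predictorSD * sqrt(N * outcomeVariance))
    print(power)
    invisible(power)
}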

score 1 · Accepted Answer · answered Apr 03 '19 at 14:59


First off, post hoc power is a load of crap, so let's just get that out of the way.

If you want to know power for detecting an effect, you need to know a few things first.

1. The marginal variance of the outcome.
2. A minimal detectable effect per one standard deviation increase in the predictor. I say one standard deviation because it gives us one less thing to estimate in the power calculations.

The power achieved by having these things and a sample size of $n = 161$ at a 0.05 significance level is $$ \gamma = 1 - {\bf{\Phi}}\left(1.96-\vert\beta\vert\sigma_x \sqrt{np(1-p)} \right)$$

Here $\bf{\Phi}$ is the CDF of the standard normal distribution and $\sigma_x$ is the standard deviation of your predictor. If it is 1, as I suggest, then $\beta$ is the change in log odds per one standard deviation change in the predictor. Also, $p(1-p)$ is the marginal variance of the outcome, regardless of Unex.
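To make the formula concrete, here is a minimal R sketch that evaluates it over a few effect sizes. The function name `approx_power`, the marginal success probability of 0.8, and the candidate values of `beta` are illustrative assumptions, not estimates from the study; the predictor is assumed to be standardised so that its standard deviation is 1.

```r
# Normal-approximation power for a two-sided test, following the formula above.
# approx_power() is a hypothetical helper name; all inputs below are assumptions.
approx_power <- function(beta, n, p, sigma_x = 1, alpha = 0.05) {
  stopifnot(p > 0, p < 1, n > 0, sigma_x > 0)
  z_crit <- qnorm(1 - alpha / 2)   # about 1.96 when alpha = 0.05
  1 - pnorm(z_crit - abs(beta) * sigma_x * sqrt(n * p * (1 - p)))
}

# Assumed inputs: n = 161, marginal success probability 0.8, sigma_x = 1.
betas <- c(0.1, 0.2, 0.3, 0.5)    # change in log odds per 1 SD of the predictor
power <- sapply(betas, approx_power, n = 161, p = 0.8)
print(data.frame(beta = betas, power = round(power, 3)))
```

Because the effect enters through `abs(beta)`, the sign of the coefficient does not matter, and power rises monotonically with the size of the effect.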

Does that help answer your question?
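As a rough cross-check on the closed-form approximation, power can also be estimated by simulating data and refitting the model from the question many times. The sketch below is only that: the function name `simulate_power`, the uniform distribution of Unex scores, the baseline success rate of 0.8, and the assumption that each person's 60 trials are independent (no overdispersion) are all assumptions made for illustration. Because it counts every trial as an independent observation, its results need not match the formula evaluated at $n = 161$.

```r
set.seed(42)

# Simulation-based power for a binomial GLM with success/failure counts.
# simulate_power() is a hypothetical helper; every default is an assumption.
simulate_power <- function(beta, n_people = 161, n_trials = 60,
                           p0 = 0.8, n_sims = 200, alpha = 0.05) {
  rejections <- replicate(n_sims, {
    unex <- sample(0:30, n_people, replace = TRUE)  # assumed predictor distribution
    x <- as.numeric(scale(unex))                    # standardise so that sd = 1
    p <- plogis(qlogis(p0) + beta * x)              # success probability per person
    y <- rbinom(n_people, n_trials, p)              # successes out of n_trials
    fit <- glm(cbind(y, n_trials - y) ~ x, family = binomial)
    coef(summary(fit))["x", "Pr(>|z|)"] < alpha     # Wald test on the slope
  })
  mean(rejections)                                  # proportion of significant fits
}

# Estimated power for an assumed effect of 0.05 log odds per 1 SD of the predictor.
print(simulate_power(beta = 0.05))
```

Setting `beta = 0` gives a quick sanity check: the rejection rate should then sit close to the nominal `alpha`.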


Demetri Pananos


Hi, thanks for helping out! I've not heard of marginal variance before. I calculated the proportion of responses at each level of the DV (e.g. 0.2 for level 1, 0.3 for level 2, etc.), then calculated the variance the usual way (squared deviations from the mean, summed, divided by n-1): 6.685216e-05 for me. I've tried to translate the equation into R code, as that's how my brain works (beta as 1, my predictor SD is 6.2): `power = 1 - dnorm(1.96 - 1 * 6.2 * sqrt(161 * 6.685216e-05))` = 0.83. Not sure if I followed you correctly, as if I increase my beta to 2 the power is reduced to 0.68? – Christopher Dawes Apr 03 '19 at 17:11
@ChristopherDawes Marginal variance is just the variance of the outcome without taking covariates into account. If I were to ask "What is the variance of weight for people who are 6 foot tall?" that would be conditional variance (because I am asking about people who are 6 foot tall). If I were to ask "What is the variance of weight for *people*?" that would be marginal variance, because I don't care about height. – Demetri Pananos Apr 03 '19 at 20:04
Got it - I'm just not sure how to apply this? For example, in R I've now tried beta = 1, sd = 6.2, marginal variance = 274: `1 - dnorm(1.96 - beta * sd * sqrt(161 * marginalVariance))`, which comes out as 1, as the CDF returns 0 with those values, unless I'm using the wrong function? Sorry if this is really obvious, I've tried to calculate power outside of gPower! – Christopher Dawes Apr 03 '19 at 23:34
Marginal variance can't be larger than 0.25 for a binomial outcome. Also, dnorm is the density of the normal, not the cdf. You want to use pnorm. – Demetri Pananos Apr 03 '19 at 23:53

Got it! I'd converted my outcome to % (0-100) which is why it wasn't working. Thank you for taking the time to explain this to me, 0.001% of the PhD done. I'll copy my code into the question for others. Will also read the power articles you cited. Thanks again! – Christopher Dawes Apr 04 '19 at 16:59
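The two pitfalls from this thread (variance computed on a 0-100 percentage scale, and `dnorm` used in place of `pnorm`) can be reproduced with a small R sketch. The data here are simulated, and the Unex distribution, the success probabilities, and the value `beta = 0.02` are all assumptions chosen only to make the contrast visible.

```r
set.seed(1)
n_people <- 161
n_trials <- 60
unex <- sample(0:30, n_people, replace = TRUE)   # hypothetical predictor scores
p_true <- plogis(1.0 + 0.03 * (unex - 15))       # assumed success probabilities
successes <- rbinom(n_people, n_trials, p_true)  # successes per block of 60 trials

# Marginal variance: pool all trials and ignore Unex entirely.
p_hat <- sum(successes) / (n_people * n_trials)
marginal_var <- p_hat * (1 - p_hat)              # always between 0 and 0.25

# Pitfall 1: on a 0-100 percentage scale the variance is not bounded by 0.25.
var_percent <- var(100 * successes / n_trials)

# Pitfall 2: dnorm() is the density; the formula needs the CDF, pnorm().
beta <- 0.02                                     # assumed log odds per unit of Unex
z <- 1.96 - abs(beta) * sd(unex) * sqrt(n_people * marginal_var)
power_wrong <- 1 - dnorm(z)                      # not a power: uses the density
power_right <- 1 - pnorm(z)                      # normal-approximation power

print(c(marginal_var = marginal_var, var_percent = var_percent,
        power_wrong = power_wrong, power_right = power_right))
```

Here `beta` is expressed per raw unit of Unex, so it is multiplied by `sd(unex)`; with a standardised predictor that factor would be 1.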
