
Problem

I would like to measure the effects of different treatments on the distribution of subjects between the two groups (if you need more details: the subjects are cells, and I want to measure the effect of different gene knockouts on cell differentiation). I fit a generalized linear model of the following form:

\begin{align} Y &\sim {\rm Binom}(p|N) \\ {\rm logit}(p) &\sim N(\beta X^T, \Sigma) \end{align}

  • Here $Y$ represents the observed number of subjects in each of the two groups for the different treatments, and the design matrix $X$ represents the dummy-encoded categorical treatment variable.

  • Note that the columns of $X$ are orthogonal, as each observed subject receives only one treatment.

  • When generating $X$ I drop the level that corresponds to non-treated subjects, so that the intercept represents the background distribution between the two groups in the absence of treatment. The coefficients $\beta$ therefore represent the effect of each treatment compared to the non-treated control (a minimal fitting sketch follows this list).
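
For concreteness, here is a minimal sketch of how such a model could be fit in R; the data frame `d`, the count columns `n_group1`/`n_group2`, and the reference level `"NonTreated"` are assumed names used only for illustration, not taken from the actual analysis:

    ## Make the non-treated level the reference so it is absorbed into the intercept
    d$Treatment <- relevel(factor(d$Treatment), ref = "NonTreated")

    ## Binomial GLM with logit link; the response is the pair of group counts per observation
    fit <- glm(cbind(n_group1, n_group2) ~ Treatment,
               family = binomial(link = "logit"), data = d)
    summary(fit)   # produces a coefficient table like the one quoted below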

Question

When interpreting the output from summary.glm() (below), do I need to correct the p-values (i.e., Pr(>|z|)) if I want to select the treatments that have significant effects? Will the answer change if I observe the same population before and after treatment?

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -3.2600440  0.2522047 -12.926  < 2e-16 ***
TreatmentA     0.8582053  0.3762970   2.281  0.02257 *  
TreatmentB     0.1369642  0.4346106   0.315  0.75265    
TreatmentC    -0.0083802  0.4549547  -0.018  0.98530    
TreatmentD    -0.4489033  0.4786030  -0.938  0.34827    
TreatmentE    -0.0910970  0.4330876  -0.210  0.83340    
...

I found a related question on CrossValidated: How to test the statistical significance for categorical variable in linear regression? However, that question deals with the overall significance of the categorical variable rather than the significance of its individual levels.

EDIT:

The model is a logistic regression and is formulated as follows:

\begin{align} Y &\sim {\rm Binom}(p|N) \\ {\rm logit}(p) &= \beta X^T + \epsilon \\ \epsilon &\sim Logistic(0, S) \end{align}

perlusha
  • How many treatments/gene knockouts are involved? – EdM Feb 24 '21 at 14:15
  • About a 100 knockouts. – perlusha Feb 24 '21 at 14:30
    There's a confusion in the model you believe you've fit. It makes no sense to say the logit of p is normally distributed. If you think the probabilities arise from a latent normal, you would fit a probit model, if you model the logit of p, you are fitting a logistic regression. It may help to read my answer here: [Difference between logit and probit models](https://stats.stackexchange.com/a/30909/7290). – gung - Reinstate Monica Feb 24 '21 at 15:36
  • Thank you for your comment! I'm a little confused though. I am fitting a logistic regression with logit as a "canonical link for binary response data (more specifically, the binomial distribution)" as you mention in your post, so $\log \frac{p}{1-p} = \beta X^T + \epsilon$, where $\epsilon \sim N(0, \Sigma)$, no? How would you formulate it? – perlusha Feb 25 '21 at 08:57
  • Actually, never mind. I think I got it. Logistic regression: ${\rm logit}(p) = \beta X^T + \epsilon$, where $\epsilon \sim {\rm Logistic}(0, s)$; probit regression: ${\rm probit}(p) = \beta X^T + \epsilon$, where $\epsilon \sim N(0, \sigma)$. – perlusha Feb 25 '21 at 09:25

1 Answer


First, it's good practice to evaluate the overall significance of a predictor before moving on to individual differences. The page you link to discusses how to do that. If the model contains only this one categorical predictor, with no covariates, those tests are the same as the overall significance tests of the model (likelihood-ratio, score, and Wald tests). If the overall model isn't significant, be wary of proceeding to tests of individual differences, as you have no evidence that your model differs significantly from a null model.
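
As a hedged sketch of how that overall test might look in R (assuming `fit` is the binomial GLM sketched in the question, fit with `glm()`):

    ## Likelihood-ratio test of the whole treatment factor against a null model
    fit0 <- update(fit, . ~ 1)            # intercept-only null model
    anova(fit0, fit, test = "Chisq")      # tests all knockout coefficients jointly

    ## With a single predictor, drop1() gives the same likelihood-ratio test
    drop1(fit, test = "Chisq")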

Second, you have to decide what type of multiple-comparison correction you want. Controlling the family-wise error rate (FWER) means you are trying to control the risk of any false-positive finding. Controlling the false discovery rate (FDR) means you are trying to control the fraction of your positive findings that are false. In an early-stage screen like this one, FDR is probably the better choice: you presumably will follow up with more detailed studies of the genes you identify, so you don't want to control so strictly that you miss potentially important true positives, as you might with FWER control.
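
In R, the adjustment itself is a one-liner with `p.adjust()`; the sketch below again assumes `fit` is the binomial GLM from the question, and the 5% cutoff is only for illustration:

    ## Per-knockout p-values, dropping the intercept row
    p_raw <- summary(fit)$coefficients[-1, "Pr(>|z|)"]

    p_fdr  <- p.adjust(p_raw, method = "BH")    # Benjamini-Hochberg FDR control
    p_fwer <- p.adjust(p_raw, method = "holm")  # Holm FWER control, for comparison

    names(p_fdr)[p_fdr < 0.05]                  # knockouts passing a 5% FDR cutoff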

Third,

Will the answer change if I observe the same population before and after treatment?

The principles of multiple-comparison control don't change; the question becomes which coefficients to compare. The model will now be different, potentially including pre/post knockout status as a predictor alongside the knockouts, possibly with interaction terms between pre/post status and the knockouts. How to structure the model, and which coefficients to test, depends on your understanding of the subject matter. Again, if the overall model isn't significant, you probably shouldn't be doing individual tests.
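
One possible formulation, purely as a sketch (the `Time` factor with levels "pre" and "post" is an assumption about how such data might be coded):

    ## Knockout effects allowed to differ before vs. after treatment
    fit_pp <- glm(cbind(n_group1, n_group2) ~ Treatment * Time,
                  family = binomial(link = "logit"), data = d)
    summary(fit_pp)   # Treatment:Time terms capture how each knockout's effect changes post-treatment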

EdM