How do covariates influence the sample sizes for survival and binary outcomes?

Question

I am currently trying to wrap my head around how covariates influence non-normal outcomes such as survival and binary outcomes. I know that in linear models the unexplained variance shrinks according to the squared Pearson correlation, which in turn influences the test statistic when testing a coefficient. I understand this well enough. I am now trying to understand the same concept for logistic regression with binary outcomes or Cox regression with survival outcomes. Can someone explain how the commonly used test statistics are affected, or suggest some papers/books that explain this?

score 0 · Answer 1 · answered Sep 28 '21 at 15:47

For logistic regression, a covariate shifts the log-odds of the outcome. For a Cox regression model, a covariate shifts the log-hazard. It might help to read this.
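Written out for a single covariate $x$, the two statements above correspond to the standard model forms below. The symbols $\beta_0$, $\beta_1$ and the baseline hazard $h_0(t)$ are notation introduced here for illustration, not taken from the linked material.

$$
\log\frac{P(Y=1\mid x)}{1-P(Y=1\mid x)} = \beta_0 + \beta_1 x,
\qquad
\log h(t\mid x) = \log h_0(t) + \beta_1 x
$$

In both cases a one-unit increase in $x$ adds $\beta_1$ on the log scale, so $e^{\beta_1}$ is an odds ratio in the logistic model and a hazard ratio in the Cox model.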

John L

score 0 · Answer 2 · answered Sep 28 '21 at 18:23

Survival and binomial models are fit by finding model parameter values that maximize the likelihood of the data (technically, the partial likelihood for a Cox survival model). The test statistics are based on the properties of the estimated (partial) likelihood of the data as a function of the parameter values. That's true for a wide range of models, including many types of generalized linear models beyond binomial models, used when errors around the model predictions have specific types of non-normal distributions.

The UCLA IDRE website has a nice summary of maximum-likelihood methods and the 3 standard tests based on them: score, Wald, and likelihood-ratio tests. If the data are such that they are only likely over a narrow range of parameter values, then the test statistics are 'significant' and error estimates of parameter values are correspondingly narrow.
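As a rough illustration of how the three tests relate, here is a minimal Python sketch for the simplest case: a logistic regression with one binary covariate, where the data reduce to a 2x2 table. In that special case the Wald statistic comes from the log odds ratio and its standard error, the score statistic equals the Pearson chi-square statistic without continuity correction, and the likelihood-ratio statistic equals the G statistic. The function names and the example counts are hypothetical, and the formulas require every cell to be non-zero.

```python
import math

def three_tests_2x2(a, b, c, d):
    """Wald, score and likelihood-ratio statistics for the slope of a logistic
    regression with one binary covariate, computed from a 2x2 table:
      covariate = 1: a events, b non-events
      covariate = 0: c events, d non-events
    All four counts must be > 0. Each statistic is ~ chi-square(1) under H0."""
    n = a + b + c + d

    # Wald: squared log odds ratio divided by its estimated variance.
    log_or = math.log((a * d) / (b * c))
    wald = log_or ** 2 / (1 / a + 1 / b + 1 / c + 1 / d)

    # Score: Pearson chi-square (no continuity correction).
    score = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Likelihood ratio: G statistic, 2 * sum(observed * log(observed / expected)).
    observed = [a, b, c, d]
    expected = [
        (a + b) * (a + c) / n, (a + b) * (b + d) / n,
        (c + d) * (a + c) / n, (c + d) * (b + d) / n,
    ]
    lr = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
    return wald, score, lr

def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square(1) statistic: P(|Z| > sqrt(stat))."""
    return math.erfc(math.sqrt(stat / 2))

# Made-up counts, for illustration only.
for name, stat in zip(("Wald", "score", "LR"), three_tests_2x2(30, 70, 15, 85)):
    print(f"{name}: statistic = {stat:.3f}, p = {chi2_1df_pvalue(stat):.4f}")
```

The three statistics differ slightly in finite samples but agree asymptotically, which is why they usually lead to the same conclusion when the sample is reasonably large.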

The Wald test on coefficients is probably closest to what you're familiar with from linear regression. It's based on an assumption that the distribution of coefficient estimates is multivariate normal. Coefficient variance estimates thus tend to scale inversely with the number of observations $n$, and the corresponding standard errors inversely with $\sqrt n$, as you might expect.
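The scaling with $n$ can be checked with a small, deliberately artificial Python sketch. It fits a one-covariate logistic regression by Newton-Raphson and takes the Wald standard error from the inverse Fisher information. Stacking four copies of the same data leaves the estimate unchanged, multiplies the information by 4, and so halves the standard error. The toy data and the function names are assumptions made up for this illustration.

```python
import math

def score_and_information(b0, b1, x, y):
    """Score vector and Fisher information for logit P(y=1) = b0 + b1 * x."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (g0, g1), (i00, i01, i11)

def fit_logistic(x, y, iters=25):
    """Maximum-likelihood fit by Newton-Raphson.
    Returns (b0, b1, se_b1), where se_b1 is the Wald standard error of b1."""
    b0 = b1 = 0.0
    for _ in range(iters):
        (g0, g1), (i00, i01, i11) = score_and_information(b0, b1, x, y)
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + I^{-1} * score (2x2 inverse written out).
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (-i01 * g0 + i00 * g1) / det
    # Variance of b1 is the (2,2) element of the inverse information.
    _, (i00, i01, i11) = score_and_information(b0, b1, x, y)
    se_b1 = math.sqrt(i00 / (i00 * i11 - i01 * i01))
    return b0, b1, se_b1

# Toy data with overlapping outcomes (no separation), for illustration only.
x = [-2, -1, -1, 0, 0, 0, 1, 1, 2, 2]
y = [0, 0, 1, 0, 1, 0, 1, 0, 1, 1]

_, b1_small, se_small = fit_logistic(x, y)          # n = 10
_, b1_large, se_large = fit_logistic(x * 4, y * 4)  # same data stacked 4 times, n = 40

print("slope:", b1_small, b1_large)
print("SE ratio (n vs 4n):", se_small / se_large)
```

With real data the larger sample would not be an exact copy of the smaller one, so the ratio would only be close to 2 rather than equal to it.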

Standard linear-regression least-squares fitting is equivalent to a maximum-likelihood fit based on an assumption of a linear model with homoscedastic, normally distributed errors. See this page, among others on this site or on the wider web. As you move farther along in studying statistics, it might make more sense to develop a primary intuition based on maximum likelihood, as that can be extended to many situations in ways that intuition based on correlations often can't.
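The equivalence of least squares and Gaussian maximum likelihood can be sketched in a few lines of Python. For a fixed error standard deviation, the Gaussian log-likelihood is a decreasing function of the residual sum of squares, so the least-squares coefficients also maximize the likelihood. The data, the fixed `sigma`, and the function names below are hypothetical choices for illustration.

```python
import math

def ols(x, y):
    """Closed-form least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def gaussian_loglik(b0, b1, sigma, x, y):
    """Log-likelihood of a linear model with i.i.d. N(0, sigma^2) errors."""
    n = len(x)
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - rss / (2 * sigma ** 2)

# Made-up data, for illustration only.
x = [0, 1, 2, 3, 4, 5]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
sigma = 1.0  # held fixed; the maximizing (b0, b1) does not depend on it

b0_hat, b1_hat = ols(x, y)
ll_at_ols = gaussian_loglik(b0_hat, b1_hat, sigma, x, y)

# Moving the coefficients away from the least-squares values lowers the likelihood.
for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05), (0.1, -0.05)]:
    ll_moved = gaussian_loglik(b0_hat + d0, b1_hat + d1, sigma, x, y)
    print(f"shift ({d0:+.2f}, {d1:+.2f}): log-likelihood drops by {ll_at_ols - ll_moved:.4f}")
```

For logistic and Cox models there is no such closed form, which is why they are fit by iterative maximization of the (partial) likelihood instead.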

If you found this answer helpful, then please consider [upvoting](https://stats.stackexchange.com/help/why-vote) and/or [accepting](https://stats.stackexchange.com/help/accepted-answer) it. — kjetil b halvorsen, Oct 09 '21 at 13:52
