Questions tagged [generalized-linear-model]

A generalization of linear regression allowing for nonlinear relationships via a "link function" and for the variance of the response to depend on the predicted value. (Not to be confused with "general linear model" which extends the ordinary linear model to general covariance structure and multivariate response.)

A generalized linear model extends regression models by allowing a more general (conditional) distribution for the observations, a variance function related to the mean, and a non-linear relationship between the mean and the linear predictor, $X\beta$.

A generalized linear model consists of three components:

Systematic part: $\eta_i = X_i'\beta$. This is the linear predictor.
Random part: independent random variables $Y_1, Y_2, \ldots, Y_n$ with $$ Y_i \sim D(\mu_i), \quad \mu_i = E(Y_i), $$ where $D$ is an exponential family distribution. More generally, there can be an additional dispersion parameter $\phi$ that controls the dispersion of $Y_i$.
Link function: an invertible function $g$ such that $\eta_i = g(\mu_i)$, or equivalently, $E(Y_i) = \mu_i = g^{-1}(\eta_i) = g^{-1}(X_i'\beta)$.
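As a rough illustration of how the three components fit together, here is a minimal sketch of a Poisson GLM with a log link and a single covariate, fitted by iteratively reweighted least squares (IRLS). The choice of Poisson response, log link, and IRLS is an assumption made for the example, the function name `fit_poisson_irls` is hypothetical, and the data are made up. It uses only the Python standard library and is not meant as a replacement for a real GLM routine.

```python
import math

def fit_poisson_irls(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by iteratively reweighted least squares."""
    # Start from an intercept-only fit: mu_i = mean(y) for every i.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        # Systematic part: linear predictor eta_i = b0 + b1 * x_i.
        eta = [b0 + b1 * xi for xi in x]
        # Link function: log link, so mu_i = g^{-1}(eta_i) = exp(eta_i).
        mu = [math.exp(e) for e in eta]
        # Random part: a Poisson response has Var(Y_i) = mu_i, which gives
        # IRLS weights w_i = mu_i and the working response z_i below.
        w = mu
        z = [e + (yi - mi) / mi for e, yi, mi in zip(eta, y, mu)]
        # Solve the 2x2 weighted least-squares normal equations directly.
        s_w = sum(w)
        s_wx = sum(wi * xi for wi, xi in zip(w, x))
        s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        s_wz = sum(wi * zi for wi, zi in zip(w, z))
        s_wxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = s_w * s_wxx - s_wx ** 2
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = max(abs(new_b0 - b0), abs(new_b1 - b1)) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Made-up illustrative data (not taken from any question on this page).
x_data = [0, 1, 2, 3, 4]
y_data = [1, 2, 4, 7, 12]
b0_hat, b1_hat = fit_poisson_irls(x_data, y_data)
fitted = [math.exp(b0_hat + b1_hat * xi) for xi in x_data]
print("intercept:", b0_hat, "slope:", b1_hat)
print("fitted means:", fitted)
```

With the canonical log link, the fitted means should satisfy the score equations $\sum_i (y_i - \hat\mu_i) = 0$ and $\sum_i x_i (y_i - \hat\mu_i) = 0$, which is a quick way to check that the iteration has converged.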

The similar term "general linear model" is often confused with generalized linear models (both are typically abbreviated GLM). A general linear model is the standard multiple regression setting $Y = X\beta + \varepsilon$ (for a "design matrix" $X$, parameters $\beta$, and "error term" $\varepsilon$). Use the multiple-regression or linear-model tags for such cases (see discussion).

3987 questions

354

votes

12 answers

Difference between logit and probit models

What is the difference between the logit and probit models? I'm more interested here in knowing when to use logistic regression and when to use probit. If there is any literature that illustrates this using R, that would be helpful as well.

r generalized-linear-model logistic probit link-function

asked Jan 03 '12 at 07:20

Beta


119

votes

4 answers

When to use gamma GLMs?

The gamma distribution can take on a pretty wide range of shapes, and given the link between the mean and the variance through its two parameters, it seems suited to dealing with heteroskedasticity in non-negative data, in a way that log-transformed…

generalized-linear-model gamma-distribution

asked Aug 16 '13 at 08:13

generic_user


103

votes

5 answers

Diagnostic plots for count regression

What diagnostic plots (and perhaps formal tests) do you find most informative for regressions where the outcome is a count variable? I'm especially interested in Poisson and negative binomial models, as well as zero-inflated and hurdle counterparts…

generalized-linear-model residuals negative-binomial-distribution zero-inflation poisson-regression

asked Sep 20 '13 at 01:17

half-pass


votes

4 answers

What is the difference between a "link function" and a "canonical link function" for GLM

What's the difference between the terms 'link function' and 'canonical link function'? Also, are there any (theoretical) advantages of using one over the other? For example, a binary response variable can be modeled using many link functions such as…

logistic generalized-linear-model link-function

asked Oct 21 '12 at 14:17

steadyfish


votes

5 answers

What are modern, easily used alternatives to stepwise regression?

I have a dataset with around 30 independent variables and would like to construct a generalized linear model (GLM) to explore the relationship between them and the dependent variable. I am aware that the method I was taught for this situation,…

regression generalized-linear-model model-selection stepwise-regression

asked Jul 31 '11 at 23:45

fmark


votes

5 answers

What do the residuals in a logistic regression mean?

In answering this question, John Christie suggested that the fit of logistic regression models should be assessed by evaluating the residuals. I'm familiar with how to interpret residuals in OLS: they are on the same scale as the DV and very clearly…

r logistic generalized-linear-model residuals aic

asked Aug 09 '10 at 07:32

russellpierce


votes

1 answer

How to interpret coefficients in a Poisson regression?

How can I interpret the main effects (coefficients for dummy-coded factor) in a Poisson regression? Assume the following example: treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2), …

r generalized-linear-model interpretation poisson-distribution regression-coefficients

asked May 21 '11 at 15:10

user734124

votes

1 answer

Why is the square root transformation recommended for count data?

It is often recommended to take the square root when you have count data. (For some examples on CV, see @HarveyMotulsky's answer here, or @whuber's answer here.) On the other hand, when fitting a generalized linear model with a response variable…

generalized-linear-model data-transformation poisson-distribution count-data variance-stabilizing

asked Dec 22 '12 at 03:11

gung - Reinstate Monica


votes

9 answers

Advanced statistics books recommendation

There are several threads on this site for book recommendations on introductory statistics and machine learning but I am looking for a text on advanced statistics including, in order of priority: maximum likelihood, generalized linear models,…

generalized-linear-model pca maximum-likelihood references saddlepoint-approximation

asked Jul 27 '12 at 16:15

Robert Kubrick


votes

3 answers

Interpreting Residual and Null Deviance in GLM R

How to interpret the Null and Residual Deviance in GLM in R? Like, we say that smaller AIC is better. Is there any similar and quick interpretation for the deviances also? Null deviance: 1146.1 on 1077 degrees of freedom Residual deviance: 4589.4…

generalized-linear-model deviance

asked Jul 23 '14 at 10:18

Anjali

votes

4 answers

How are regression, the t-test, and the ANOVA all versions of the general linear model?

How are they all versions of the same basic statistical method?

regression self-study anova generalized-linear-model t-test

asked May 15 '13 at 00:46

Amahabirsingh

votes

3 answers

Linear model with log-transformed response vs. generalized linear model with log link

In this paper titled "CHOOSING AMONG GENERALIZED LINEAR MODELS APPLIED TO MEDICAL DATA" the authors write: In a generalized linear model, the mean is transformed, by the link function, instead of transforming the response itself. The two methods …

generalized-linear-model model-selection lognormal-distribution

asked Jan 16 '13 at 10:01

miura


votes

4 answers

Choosing between LM and GLM for a log-transformed response variable

I'm trying to understand the philosophy behind using a Generalized Linear Model (GLM) vs a Linear Model (LM). I've created an example data set below where: $$\log(y) = x + \varepsilon $$ The example does not have the error $\varepsilon$ as a…

r generalized-linear-model linear-model gamma-distribution link-function

asked Nov 19 '12 at 13:28

Marc in the box


votes

4 answers

Regression for an outcome (ratio or fraction) between 0 and 1

I am thinking of building a model predicting a ratio $a/b$, where $a \le b$ and $a > 0$ and $b > 0$. So, the ratio would be between $0$ and $1$. I could use linear regression, although it doesn't naturally limit to 0..1. I have no reason to believe…

regression logistic generalized-linear-model beta-distribution beta-regression

asked May 23 '12 at 22:13

dfrankow


votes

2 answers

How to simulate artificial data for logistic regression?

I know I'm missing something in my understanding of logistic regression, and would really appreciate any help. As far as I understand it, the logistic regression assumes that the probability of a '1' outcome given the inputs, is a linear combination…

r regression logistic generalized-linear-model simulation

asked Dec 25 '12 at 14:59

zorbar
