Questions tagged [link-function]

A transformation of a parameter governing a response distribution, used in the generalized linear model to map that parameter's range (e.g., from 0 to 1, or only positive values) onto the real number line $(-\infty, +\infty)$.

Link functions are a central part of the generalized linear model (GLM). Many non-normal response distributions (e.g., binomial, Poisson) are governed by parameters whose range is restricted. For example, the mean of the Bernoulli distribution is $\pi$, the probability of 'success', which must lie between 0 and 1, and the mean of a Poisson distribution must be positive. However, the structural part of a model, $\beta_0 + \beta_1X$, can take any value in $(-\infty, +\infty)$. The link function allows the parameter to be equated to the structural part by transforming it so that the transformed parameter can also take any value in $(-\infty, +\infty)$.
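As a rough illustration of the idea, the following minimal Python sketch shows two common links (logit for a probability, log for a positive mean) and their inverses. The coefficient values `beta0` and `beta1` and all function names are made up for illustration; nothing here is fitted to data.

```python
import math

def logit(p):
    """Logit link: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse logit (logistic function): maps the real line back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def log_link(mu):
    """Log link: maps a positive mean to the real line."""
    return math.log(mu)

def inv_log_link(eta):
    """Inverse log link: maps the real line back to positive values."""
    return math.exp(eta)

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -1.0, 0.5

def linear_predictor(x):
    """Structural part of the model: beta0 + beta1 * x, unbounded in x."""
    return beta0 + beta1 * x

for x in (-10.0, 0.0, 10.0):
    eta = linear_predictor(x)
    # eta can be any real number, yet the back-transformed parameters
    # stay inside their valid ranges.
    print(f"x={x:6.1f}  eta={eta:6.2f}  "
          f"pi={inv_logit(eta):.4f}  mu={inv_log_link(eta):.4f}")
```

Whatever value the linear predictor takes, the inverse link returns a parameter inside its valid range, which is the point of using a link function.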

Wikipedia (https://en.wikipedia.org/wiki/Generalized_linear_model#Link_function) has more information and references.

184 questions
354 votes, 12 answers

Difference between logit and probit models

What is the difference between the logit and probit models? I'm more interested here in knowing when to use logistic regression and when to use probit. If there is any literature that explains it using R, that would be helpful as well.
Beta
87 votes, 4 answers

What is the difference between a "link function" and a "canonical link function" for GLM

What's the difference between terms 'link function' and 'canonical link function'? Also, are there any (theoretical) advantages of using one over the other? For example, a binary response variable can be modeled using many link functions such as…
steadyfish
59 votes, 4 answers

Choosing between LM and GLM for a log-transformed response variable

I'm trying to understand the philosophy behind using a Generalized Linear Model (GLM) vs a Linear Model (LM). I've created an example data set below where: $$\log(y) = x + \varepsilon $$ The example does not have the error $\varepsilon$ as a…
40 votes, 2 answers

Purpose of the link function in generalized linear model

What is the purpose of the link function as a component of the generalized linear model? Why do we need it? Wikipedia states: "It can be convenient to match the domain of the link function to the range of the distribution function's mean". What's the…
Chris
34 votes, 3 answers

How to decide which glm family to use?

I have fish density data that I am trying to compare between several different collection techniques. The data has lots of zeros, and the histogram looks vaguely appropriate for a Poisson distribution except that, as densities, it is not integer…
29 votes, 1 answer

Nonlinear vs. generalized linear model: How do you refer to logistic, Poisson, etc. regression?

I have a question about semantics that I would like fellow statisticians' opinions on. We know models such as logistic, Poisson, etc. fall under the umbrella of generalized linear models. The model includes nonlinear functions of the parameters,…
21 votes, 5 answers

Do statisticians assume one can't over-water a plant, or am I just using the wrong search terms for curvilinear regression?

Almost everything I read about linear regression and GLM boils down to this: $y = f(x,\beta)$ where $f(x,\beta)$ is a non-increasing or non-decreasing function of $x$ and $\beta$ is the parameter you estimate and test hypotheses about. There are…
19 votes, 1 answer

Can you give a simple intuitive explanation of IRLS method to find the MLE of a GLM?

Background: I'm trying to follow Princeton's review of MLE estimation for GLM. I understand the basics of MLE estimation: likelihood, score, observed and expected Fisher information and the Fisher scoring technique. And I know how to justify simple…
19 votes, 2 answers

GLM: verifying a choice of distribution and link function

I have a generalized linear model that adopts a Gaussian distribution and log link function. After fitting the model, I check the residuals: QQ plot, residuals vs predicted values, histogram of residuals (acknowledging that due caution is needed).…
17 votes, 1 answer

Log-linked Gamma GLM vs log-linked Gaussian GLM vs log-transformed LM

From my results, it appears that GLM Gamma meets most assumptions, but is it a worthwhile improvement over the log-transformed LM? Most literature I've found deals with Poisson or Binomial GLMs. I found the article EVALUATION OF GENERALIZED LINEAR…
15 votes, 4 answers

Is the logit function always the best for regression modeling of binary data?

I've been thinking about this problem. The usual logistic function for modeling binary data is: $$ \log\left(\frac{p}{1-p}\right)=\beta_0+\beta_1X_1+\beta_2X_2+\ldots $$ However is the logit function, which is an S-shaped curve, always the best for…
Glen
15 votes, 2 answers

Pros and Cons of Log Link Versus Identity Link for Poisson Regression

I am carrying out a Poisson regression with the end goal of comparing (and taking the difference of) the predicted mean counts between two factor levels in my model: $\hat{\mu}_1-\hat{\mu}_2$, while holding other model covariates (which are all…
StatsStudent
12 votes, 2 answers

Problem with comparing GLM models having a different link function

Given the same set of covariates and distribution family, how can I compare models having different link functions? I think the correct answer here is "AIC/BIC", but I am not 100% sure. Is it possible to have nested models if they have a different…
Davide
12 votes, 1 answer

Calculation of canonical link function in GLM

I thought that the canonical link function $g(\cdot)$ comes from the natural parameter of exponential family. Say, consider the family $$ f(y,\theta,\psi)=\exp\left\{\frac{y\theta-b(\theta)}{a(\psi)}-c(y,\psi)\right\} $$ then $\theta=\theta(\mu)$…
ziyuang