
This post discusses why we need to transform $Y$ before estimating the predictors' exponents, in order to reduce the problem to a linear fit. The example there assumes a log-normal $Y$. In the case of a GLM, we can run a Box-Cox likelihood estimate on the predictors against $\log(Y)$, then use a logit function in the GLM.

How can we estimate the predictor transformation in a GLM when $Y$ is not log-normal? For example, with a Poisson-distributed $Y$ and a normal $X$, how do we linearize $Y \sim X$?

Robert Kubrick
  • How does a Pareto distribution imply nonlinearity of Y|X? – Hong Ooi Sep 25 '14 at 15:55
  • Well, at the tail of the distribution, the exponent (say, $X^3$) will not fit the model as well as at the center of the $Y, X$ observations, where $Y = \beta_0 + \beta_1X^3 + \epsilon$ is more stable. – Robert Kubrick Sep 25 '14 at 16:54
  • Tail of the distribution of what? – Hong Ooi Sep 25 '14 at 17:05
  • Tails of the $Y$ distribution. That is why we need to specify a family and a link function in a GLM: to indicate the kind of $Y$ distribution, so that each predictor is "stretched" to fit $Y$. – Robert Kubrick Sep 25 '14 at 17:17
  • How does the tail of the Y distribution imply nonlinearity of Y|X? – Hong Ooi Sep 25 '14 at 17:19
  • So if $Y$ is lognormal and $X$ is normal (like in the link I posted), it doesn't imply nonlinearity of $Y|X$. Only heteroskedasticity? I guess that makes sense... – Robert Kubrick Sep 25 '14 at 17:31
  • So, to go back to the original question: in this case we have $Y$ Pareto and $X$ normal. How do we transform the predictors? Don't we have to transform $Y$ before linearizing the predictors with a Box-Cox? In this case a log transform of $Y$ won't do it... – Robert Kubrick Sep 25 '14 at 17:35
  • You don't transform $Y$ "because it's lognormal". The marginal distribution of $Y$ might be almost anything, without any issue at all. The assumptions relate to the conditional distribution of $Y$ (its distribution at each given value of $X$ -- and then primarily when you're trying to use that assumption in hypothesis testing or constructing intervals). This issue (that the distributional assumption applies to the conditional distribution of $Y|X$, not the raw distribution of $Y$) is discussed in many posts. – Glen_b Jan 20 '17 at 22:24
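The distinction drawn in the last comment can be checked with a small simulation. The sketch below is illustrative only and is not taken from the thread: the coefficients `TRUE_B0` and `TRUE_B1`, the sample size, and the helper names `sample_poisson` and `fit_poisson_log_link` are all assumptions made for the example. It draws $X$ from a standard normal, draws $Y|X$ from a Poisson with $\log E[Y|X] = \beta_0 + \beta_1 X$, and fits that model by Newton's method (equivalent to Fisher scoring here, since the log link is canonical) using only the standard library. $Y$ itself is never transformed; the log link makes the model linear in $X$ on the scale of the linear predictor.

```python
import math
import random

# Hypothetical "true" parameters, chosen only for illustration.
TRUE_B0, TRUE_B1 = 0.5, 0.8
N = 5000

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's method; fine for moderate lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def fit_poisson_log_link(x, y, n_iter=25):
    """Fit log E[Y|X] = b0 + b1*X by Newton's method for a single predictor."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix [[h00, h01], [h01, h11]].
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton update: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [sample_poisson(math.exp(TRUE_B0 + TRUE_B1 * xi), rng) for xi in x]

b0_hat, b1_hat = fit_poisson_log_link(x, y)

# Marginal moments of Y: conditionally Poisson, but marginally overdispersed.
mean_y = sum(y) / N
var_y = sum((yi - mean_y) ** 2 for yi in y) / (N - 1)

print("estimated (b0, b1):", b0_hat, b1_hat)
print("marginal mean and variance of Y:", mean_y, var_y)
```

In this setup the marginal variance of $Y$ is well above its marginal mean, so the raw $Y$ does not look Poisson at all, yet the fit recovers the coefficients because only the conditional distribution $Y|X$ enters the likelihood. Whether the predictors themselves need a power transformation is a separate question about the form of the linear predictor.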
