This post discusses why we need to transform $Y$ before estimating the predictors' exponents, so that the problem reduces to a linear fit. The example there assumes a log-normal $Y$. In the GLM case, we can run a Box-Cox likelihood estimate on the predictors against $\log(Y)$, then use a log link function in the GLM.
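To make the log-normal case concrete, here is a minimal sketch of the kind of estimate described above, assuming a single positive predictor and synthetic data. The helper names (`box_cox`, `rss_of_linear_fit`, `estimate_exponent`), the simulated model and the grid are all illustrative assumptions, not taken from the linked post. Because the transform is applied to $X$ rather than to $Y$, maximizing the Gaussian profile likelihood over the exponent is the same as minimizing the residual sum of squares of the linear fit against $\log(Y)$.

```python
import math
import random

def box_cox(x, lam):
    """Box-Cox transform of a positive value x with exponent lam."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam

def rss_of_linear_fit(u, v):
    """Residual sum of squares of the least-squares line v ~ a + b*u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    svv = sum((vi - mv) ** 2 for vi in v)
    return svv - suv ** 2 / suu

def estimate_exponent(x, log_y, grid):
    """Grid value of lam with the smallest RSS (largest Gaussian profile likelihood)."""
    return min(
        grid,
        key=lambda lam: rss_of_linear_fit([box_cox(xi, lam) for xi in x], log_y),
    )

# Synthetic data (assumption): log(Y) = 1 + 0.5 * X^2 + Gaussian noise
rng = random.Random(0)
x = [rng.uniform(0.5, 3.0) for _ in range(500)]
log_y = [1.0 + 0.5 * xi ** 2 + rng.gauss(0.0, 0.2) for xi in x]

grid = [k / 10 for k in range(-20, 41)]  # candidate exponents from -2.0 to 4.0
lam_hat = estimate_exponent(x, log_y, grid)
print("estimated exponent:", lam_hat)
```

With data simulated this way, the estimate should land near the true exponent of 2; a finer grid or a numerical optimizer could replace the coarse search.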
How can we estimate the predictor transformation in a GLM when $Y$ is not log-normal? For example, if $Y$ is Poisson-distributed and $X$ is normal, how do we linearize $Y \sim X$?