
I'm asked to indicate whether a model is linear and, if not, to find a suitable transformation. The model seems strange to me in that I can't imagine a situation where it would arise.

The model is,

$$Y_i = \ln(\beta_1 X_{i1}) + \beta_2 X_{i2} + \epsilon_i$$

The thing I don't understand is how a model like this would even come about. To me it seems like a data analyst had a linear model and made a mistake in the transformation. But, assuming I'm wrong and this model could somehow be derived without explicitly taking the natural log of the first term, is it possible to adequately transform back? Is it proper to simply exponentiate the first term, or would we need to exponentiate everything?

In short: Is it possible to linearize this model? If so, how? Also, can a model like this appear without manually taking the natural log of the first term?

Nicklovn
    Hint: log of a product is equivalent to the sum of each term logged. – dimitriy Oct 28 '18 at 18:25
  • You ask several different questions: one is whether the model is linear; another is to find the transformation; another is about transforming back; and there are others. Could you clarify what you're trying to ask? – whuber Oct 28 '18 at 18:26

1 Answer


To explore whether a model is linear in a particular element, such as $\beta_1$ (for example), let's adopt a notation that focuses on this element. Write

$$f(\theta) = \log(\theta X_{i1}) + \beta_2 X_{i2} + \epsilon_i,\tag{1}$$

thereby suppressing all other variables and parameters in the definition of $f$ and generically referring to the parameter $\beta_1$ as "$\theta.$" This ought to make it obvious how the following approach applies to any parameter for any model.

Let's be clear about the objective.

Linearizing $f$ means there is a one-to-one transformation $u,$ which converts a new parameter $\gamma$ into the original parameter $\theta=u(\gamma),$ for which $g(\gamma)=f(u(\gamma))$ is a linear function of $\gamma$ wherever it is defined.

A fairly general meaning of "$g$ is linear" is that $g$ is differentiable on an open set containing its domain and has a constant derivative $g^\prime.$ (This generality really is needed, because many statistical models constrain some of their parameters. For instance, the parameter $\sigma$ in the familiar Normal$(\mu,\sigma^2)$ family of distributions is usually constrained to $\sigma\gt 0.$)

This approach allows us to use the machinery of Calculus to decide whether $f$ can be linearized. To apply it, we usually assume the reparameterization is itself differentiable. The Chain Rule implies $g$ is differentiable. Let $C$ name the (as yet unknown) constant derivative of $g.$ The Chain Rule also supplies a formula for the derivative,

$$C=g^\prime(\gamma) = f^\prime(u(\gamma))\, u^\prime(\gamma).\tag{2}$$

(Bear in mind that the model is undefined if $X_{i1}=0,$ so we may assume these numbers are nonzero.)

The original formula $(1)$ provides the information needed to differentiate $f,$

$$f^\prime(\theta) = \frac{X_{i1}}{\theta X_{i1}} = \frac{1}{\theta}.$$
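(A quick numerical sanity check of this derivative, not part of the original derivation: a central finite difference of $\theta \mapsto \log(\theta X_{i1})$ should agree with $1/\theta$ regardless of the value of $X_{i1}$, since $X_{i1}$ only contributes an additive constant $\log$ term. The specific numbers below are arbitrary.)

```python
import math

# f(theta) = log(theta * x1); its derivative in theta should be 1/theta,
# independent of x1, because log(theta * x1) = log(theta) + log(x1).
def f(theta, x1):
    return math.log(theta * x1)

theta, x1, h = 2.5, 7.0, 1e-6
numeric = (f(theta + h, x1) - f(theta - h, x1)) / (2 * h)  # central difference
assert abs(numeric - 1 / theta) < 1e-5
```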

Combine this with $(2)$ to obtain an equation for the unknown reparameterization $u:$

$$u^\prime(\gamma) = \frac{C}{f^\prime(u(\gamma))} = C u(\gamma),\tag{3}$$

whence

$$\frac{du}{u} = C\,d\gamma$$

with general solution

$$\log\, \lvert \theta \rvert = \log\, \lvert u(\gamma) \rvert = C \gamma + C_0,$$

equivalent to

$$\theta = \pm e^{C\gamma + C_0}$$

for some additional constant $C_0.$ The sign to choose depends on whether all the $X_{i1}$ are positive or all are negative. (The model is undefined if the signs of these variables vary.)
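(As a sketch of verifying the solution by substitution rather than by solving the ODE: with arbitrary illustrative constants $C$ and $C_0$, the function $u(\gamma)=e^{C\gamma+C_0}$ should satisfy $u^\prime(\gamma)=C\,u(\gamma)$ from $(3)$; the negative branch $-e^{C\gamma+C_0}$ works the same way.)

```python
import math

# Check by substitution that u(gamma) = exp(C*gamma + C0) solves the
# ODE u'(gamma) = C * u(gamma) of equation (3). C, C0, g are arbitrary.
C, C0 = 1.7, -0.4

def u(g):
    return math.exp(C * g + C0)

g, h = 0.3, 1e-6
u_prime = (u(g + h) - u(g - h)) / (2 * h)  # central difference
assert abs(u_prime - C * u(g)) < 1e-4
```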

To be quite explicit, in the case all the $X_{i1}$ are positive, we may choose $C=1$ and $C_0=0$ so that the model is

$$Y_i = \gamma + \log(X_{i1}) + \beta_2 X_{i2} + \epsilon_i;\quad \beta_1 = e^\gamma$$

and when all the $X_{i1}$ are negative, the model can be written

$$Y_i = \gamma + \log(-X_{i1}) + \beta_2 X_{i2} + \epsilon_i;\quad \beta_1=-e^\gamma.$$
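(A small simulation sketch, my own illustration rather than part of the answer, assuming positive $X_{i1}$ and Normal errors: since $\log(\beta_1 X_{i1}) = \log\beta_1 + \log X_{i1} = \gamma + \log X_{i1}$, the linearized model says $Y_i - \log X_{i1}$ is linear in $X_{i2}$ with intercept $\gamma$, so ordinary least squares recovers $\beta_1 = e^\gamma$. All numbers below are arbitrary choices.)

```python
import numpy as np

# Simulate from the original model Y = log(beta1*X1) + beta2*X2 + eps
# with all X1 positive, then fit the linearized form by OLS:
#   Y - log(X1) = gamma + beta2*X2 + eps,  with beta1 = exp(gamma).
rng = np.random.default_rng(0)
n = 50_000
beta1, beta2 = 3.0, -1.5                 # true parameters (arbitrary)
X1 = rng.uniform(0.5, 4.0, size=n)       # all positive, as required
X2 = rng.normal(size=n)
Y = np.log(beta1 * X1) + beta2 * X2 + rng.normal(scale=0.1, size=n)

# OLS of (Y - log(X1)) on [1, X2]: the intercept estimates gamma.
A = np.column_stack([np.ones(n), X2])
gamma_hat, beta2_hat = np.linalg.lstsq(A, Y - np.log(X1), rcond=None)[0]
beta1_hat = np.exp(gamma_hat)            # transform back: beta1 = exp(gamma)
```

With this many observations the estimates land very close to the true values, which illustrates that the reparameterized model is an ordinary linear regression in $(\gamma, \beta_2)$.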

However, being able to find an explicit solution to $(3)$ is not important for us: the mere demonstration that there exists some solution suffices to show that $f$ is linearizable. Note, too, that you didn't need to know much about logarithms to arrive at $(3),$ nor was it necessary to be clever or have some kind of insight. The work was purely mechanical.


For more on the many different meanings of "linear" model, please see my post at https://stats.stackexchange.com/a/148713/919.

whuber