Suppose I have the following two models fitted with constrained linear least squares:
Model 1: $Y = \beta_1X_1 + \ldots + \beta_kX_k + \varepsilon$
Model 2: $\log_{10}(Y) = \beta_1\log_{10}(X_1) + \ldots + \beta_k\log_{10}(X_k) + \varepsilon$
Both models are subject to $\sum_{i=1}^k \beta_i = 1$ and $0 \leq \beta_i \leq 1$ for all $i$. I want to compare the two models in terms of their RMSE and $R^2$. Calculating these for the first model is straightforward. For the second model, however, my understanding is that in order to make the comparison valid:
- According to this answer, before computing the RMSE I should apply the smearing transformation to the fitted values of the second model: $\hat{Y}_j = 10^{\widehat{\log_{10} Y_j}} \cdot \frac{1}{N}\sum_{i=1}^N 10^{\hat{\varepsilon}_i}$, where the $\hat{\varepsilon}_i$ are the residuals on the $\log_{10}$ scale.
- It is not valid to compare the $R^2$ of model 1 directly with that of model 2, because the variances of the two response variables ($Y$ vs. $\log_{10} Y$) differ. So I should back-transform the fitted values of the second model (as above), fit a linear model $Y \sim \hat{Y}$, and take the $R^2$ from that fit instead of the original $R^2$ of model 2 (both steps appear in the code sketch after this list).
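For concreteness, here is a minimal, self-contained sketch of the procedure I have in mind (Python with NumPy/SciPy; the data, the SLSQP-based constrained fit, and all names are hypothetical placeholders, not my actual setup):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data, only to make the sketch runnable; multiplicative noise
# keeps Y strictly positive so the log10 model can be fitted as well.
rng = np.random.default_rng(0)
X = rng.lognormal(size=(200, 3))
true_beta = np.array([0.5, 0.3, 0.2])
Y = (X @ true_beta) * 10 ** rng.normal(scale=0.05, size=200)

def sse(beta, X, Y):
    """Sum of squared residuals for a given coefficient vector."""
    r = Y - X @ beta
    return r @ r

def constrained_fit(X, Y):
    """Least squares with sum(beta) = 1 and 0 <= beta_i <= 1 (via SLSQP)."""
    k = X.shape[1]
    res = minimize(
        sse, x0=np.full(k, 1.0 / k), args=(X, Y),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,
        constraints={"type": "eq", "fun": lambda b: b.sum() - 1.0},
    )
    return res.x

# Model 1: fit on the original scale.
beta1 = constrained_fit(X, Y)
Y_hat1 = X @ beta1
rmse1 = np.sqrt(np.mean((Y - Y_hat1) ** 2))
r2_1 = 1.0 - np.sum((Y - Y_hat1) ** 2) / np.sum((Y - Y.mean()) ** 2)

# Model 2: fit on the log10 scale.
logX, logY = np.log10(X), np.log10(Y)
beta2 = constrained_fit(logX, logY)
log_Y_hat2 = logX @ beta2
resid_log = logY - log_Y_hat2                 # residuals on the log10 scale

# Smearing back-transformation (base 10, matching the model).
smear = np.mean(10.0 ** resid_log)
Y_hat2 = (10.0 ** log_Y_hat2) * smear

# RMSE on the original scale, directly comparable with rmse1.
rmse2 = np.sqrt(np.mean((Y - Y_hat2) ** 2))

# R^2 from the auxiliary regression Y ~ Y_hat2 (equivalently, the squared
# Pearson correlation between Y and the back-transformed predictions).
r2_2 = np.corrcoef(Y, Y_hat2)[0, 1] ** 2

print(rmse1, r2_1)
print(rmse2, r2_2)
```

To be clear, `constrained_fit` is just one way of imposing the constraints; my question is about the comparison between the two models, not about the optimizer.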
Two questions:
- Is the understanding above correct?
- Is the complete set of assumptions for linear regression (OLS) also necessary for constrained linear least squares?