Suppose I have the following two models fitted with constrained linear least squares:
Model 1: $Y = \beta_1X_1 + \ldots + \beta_kX_k + \varepsilon$
Model 2: $\log_{10}(Y) = \beta_1\log_{10}(X_1) + \ldots + \beta_k\log_{10}(X_k) + \varepsilon$
Both models are subject to $\sum_{i=1}^k \beta_i = 1$ and $0 \leq \beta_i \leq 1$ for all $i$. I want to compare the two models in terms of their RMSE and $R^2$. Calculating these for the first model is straightforward. For the second model, however, my understanding is that in order to make the comparison valid:
- According to this answer, before computing the RMSE I should apply the smearing transformation to the fitted values of the second model: $\hat{Y}_j = 10^{\widehat{\log_{10} Y_j}} \cdot \frac{1}{N}\sum_{i=1}^N 10^{\hat{\varepsilon}_i}$, where the $\hat{\varepsilon}_i$ are the residuals on the $\log_{10}$ scale.
- It is not valid to compare the $R^2$ of model 1 directly with that of model 2, because the variances of the two response variables ($Y$ vs. $\log_{10} Y$) differ. So I should back-transform the fitted values of the second model (as above), fit a linear model $Y \sim \hat{Y}$, and take the $R^2$ from that fit instead of the original $R^2$ of model 2 (both steps appear in the code sketch after this list).
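For concreteness, here is a minimal, self-contained sketch of the procedure I have in mind (Python with NumPy/SciPy; the data, the SLSQP-based constrained fit, and all names are hypothetical placeholders, not my actual setup):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data, only to make the sketch runnable; multiplicative noise
# keeps Y strictly positive so the log10 model can be fitted as well.
rng = np.random.default_rng(0)
X = rng.lognormal(size=(200, 3))
true_beta = np.array([0.5, 0.3, 0.2])
Y = (X @ true_beta) * 10 ** rng.normal(scale=0.05, size=200)

def sse(beta, X, Y):
    """Sum of squared residuals for a given coefficient vector."""
    r = Y - X @ beta
    return r @ r

def constrained_fit(X, Y):
    """Least squares with sum(beta) = 1 and 0 <= beta_i <= 1 (via SLSQP)."""
    k = X.shape[1]
    res = minimize(
        sse, x0=np.full(k, 1.0 / k), args=(X, Y),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,
        constraints={"type": "eq", "fun": lambda b: b.sum() - 1.0},
    )
    return res.x

# Model 1: fit on the original scale.
beta1 = constrained_fit(X, Y)
Y_hat1 = X @ beta1
rmse1 = np.sqrt(np.mean((Y - Y_hat1) ** 2))
r2_1 = 1.0 - np.sum((Y - Y_hat1) ** 2) / np.sum((Y - Y.mean()) ** 2)

# Model 2: fit on the log10 scale.
logX, logY = np.log10(X), np.log10(Y)
beta2 = constrained_fit(logX, logY)
log_Y_hat2 = logX @ beta2
resid_log = logY - log_Y_hat2                 # residuals on the log10 scale

# Smearing back-transformation (base 10, matching the model).
smear = np.mean(10.0 ** resid_log)
Y_hat2 = (10.0 ** log_Y_hat2) * smear

# RMSE on the original scale, directly comparable with rmse1.
rmse2 = np.sqrt(np.mean((Y - Y_hat2) ** 2))

# R^2 from the auxiliary regression Y ~ Y_hat2 (equivalently, the squared
# Pearson correlation between Y and the back-transformed predictions).
r2_2 = np.corrcoef(Y, Y_hat2)[0, 1] ** 2

print(rmse1, r2_1)
print(rmse2, r2_2)
```

To be clear, `constrained_fit` is just one way of imposing the constraints; my question is about the comparison between the two models, not about the optimizer.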
Two questions:
- Is the understanding above correct?
- Is the complete set of assumptions for linear regression (OLS) also necessary for constrained linear least squares?