I have created an example in R to illustrate the problem:
> set.seed(10)
> Ydata<-rnorm(200,15,5)*rep(1:200)^3
> Xdata<-rep(1:200)
> lm.test<-lm(log(Ydata)~Xdata)
> summary(lm.test)$r.squared
[1] 0.7665965
> Yfit<-fitted.values(lm.test)
> lm.test2<-lm(Yfit~log(Ydata))
> summary(lm.test2)$r.squared
[1] 0.7665965
> ExpYfit<-exp(fitted.values(lm.test))
> lm.test3<-lm(ExpYfit~Ydata)
> summary(lm.test3)$r.squared
[1] 0.6088178
When calculating the R-squared of an exponential model, regressing the fitted values for log(Y) on the observed log(Y) gives the same R-squared as the original regression, as expected:
log(Y) = fitted values = a + bX
But to estimate the level of Y, I take the exponential of both sides:
Y = exp(a + bX) = exp(fitted values)
When I then regress the exponentiated fitted values on the observed level of Y, the R-squared is different (0.609 rather than 0.767), so it appears to be calculated incorrectly.
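The same comparison can be sketched more directly. For a simple regression with an intercept, the R-squared equals the squared correlation between the two variables, so the two figures can be computed without the auxiliary regressions. This is only a minimal sketch of that idea; the names r2.log and r2.level are introduced here for illustration and are not part of the example above.

```r
# Rebuild the example data and the log-scale model
set.seed(10)
Xdata <- 1:200
Ydata <- rnorm(200, 15, 5) * Xdata^3
lm.test <- lm(log(Ydata) ~ Xdata)

# Log scale: squared correlation between fitted and observed log(Y)
r2.log <- cor(fitted(lm.test), log(Ydata))^2

# Level scale: squared correlation between exp(fitted) and observed Y
ExpYfit <- exp(fitted(lm.test))
r2.level <- cor(ExpYfit, Ydata)^2

# Print both values side by side
print(c(log.scale = r2.log, level.scale = r2.level))
```

The two numbers should match the R-squared values of lm.test and lm.test3 respectively, which makes it easier to see that they measure the fit on two different scales.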
Why is this? And does this mean my predictions of Y are wrong?