I recently used multiple linear regression to model monthly species abundance (y) as a function of environmental variables (x) for 2005-2016.
To satisfy the assumptions of multiple linear regression, I had to apply a transformation, abs(y-mean(y)),
to the response variable. Having completed model selection, I wanted to see how well the model could predict y from the 2017 values of x, so I used the predict()
function. The result was returned on the transformed scale, which is of no use to me, so I removed the transformation and used the following script:
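# Model fitted to the untransformed response
mod1 <- lm(y ~ x1 + x2, data = mydata)
# Predictor values for 2017 (values omitted here)
new.df <- data.frame(x1 = c(),
                     x2 = c())
predict(mod1, newdata = new.df)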
I compared the result to the actual monthly species abundance data for 2017, and the predictions were very accurate.
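One way to put a number on "very accurate" is to summarise the comparison with out-of-sample error measures. The following is a minimal sketch only: the vectors observed and predicted are hypothetical names holding made-up values, not the real 2017 data or the output of mod1.

```r
# Hypothetical values for illustration only; NOT the real 2017 data.
observed  <- c(12, 15, 9, 20, 18, 11)   # stand-in for actual 2017 abundance
predicted <- c(13, 14, 10, 18, 19, 12)  # stand-in for predict(mod1, new.df)

errors <- observed - predicted

rmse <- sqrt(mean(errors^2))  # root mean squared error
mae  <- mean(abs(errors))     # mean absolute error

# Proportion of variance in the held-out observations explained by the predictions
r2_pred <- 1 - sum(errors^2) / sum((observed - mean(observed))^2)

print(c(RMSE = rmse, MAE = mae, R2 = r2_pred))
```

Both RMSE and MAE are in the same units as y, so they can be read directly against typical monthly abundance.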
I have two questions:
1) Can I report the predictions when this MLR model does not satisfy the assumptions?
2) As the initial model selection was based on models fitted to transformed data, is it appropriate for me to report predictions from the model without the transformation?
I have seen many answers to similar-looking questions that seem to suggest the assumptions do not need to be satisfied for making predictions. However, I have been unable to find any reference for this in the published literature.