I am working on a project in which I impute missing data. I want to compare an imputed dataset with a baseline dataset using MSE and R-squared. Both metrics are obtained by fitting a linear regression and running 10-fold cross-validation.
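The evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: it assumes NumPy, uses ordinary least squares with an intercept, and runs on synthetic stand-in data with 16 purely numeric features (categorical features would need to be encoded first). The function name `kfold_linear_regression` and the demo variables are hypothetical.

```python
import numpy as np

def kfold_linear_regression(X, y, n_splits=10, seed=0):
    """Return (mean test MSE, mean test R^2) over k folds of OLS."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Shuffle the row indices once, then cut them into k roughly equal folds.
    indices = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(indices, n_splits)
    mses, r2s = [], []
    for k in range(n_splits):
        test_idx = folds[k]
        train_idx = np.concatenate(
            [folds[j] for j in range(n_splits) if j != k]
        )
        # Add an intercept column and fit by least squares on the training folds.
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        residuals = y[test_idx] - A_test @ coef
        mse = float(np.mean(residuals ** 2))
        # R^2 compares the fold's MSE with the variance of that fold's targets.
        r2 = 1.0 - mse / float(np.var(y[test_idx]))
        mses.append(mse)
        r2s.append(r2)
    return float(np.mean(mses)), float(np.mean(r2s))

# Synthetic stand-in data (hypothetical): 814 rows, 16 numeric features.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(814, 16))
y_demo = X_demo @ rng.normal(size=16) + rng.normal(scale=3.0, size=814)

mean_mse, mean_r2 = kfold_linear_regression(X_demo, y_demo)
print(f"mean MSE = {mean_mse:.2f}, mean R^2 = {mean_r2:.3f}")
```

Note that MSE is in the squared units of the target, while R-squared is measured relative to the variance of the target in each test fold, so the two are not on the same scale.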
The problem is that for the baseline dataset I get MSE = 85.0 and R-squared = 45.5, while for the imputed dataset I get MSE = 97.6 and R-squared = 47.2.
So we see that the MSE is lower (better) for the baseline but the R-squared is higher (better) for the imputed dataset.
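One thing that may be worth checking here is the identity R-squared = 1 - MSE / Var(y), which ties the two metrics together through the variance of the target. The sketch below rearranges it to get the target variance implied by each pair of figures. It rests on two assumptions: that the reported R-squared values are percentages (45.5% and 47.2%), and that the identity can be applied to fold-averaged numbers, which is only approximately true. The helper name `implied_target_variance` is hypothetical.

```python
def implied_target_variance(mse: float, r2: float) -> float:
    """Solve R^2 = 1 - MSE / Var(y) for Var(y)."""
    if r2 >= 1.0:
        raise ValueError("R-squared must be below 1")
    return mse / (1.0 - r2)

# Assumption: the reported R-squared values (45.5, 47.2) are percentages.
baseline_var = implied_target_variance(85.0, 0.455)
imputed_var = implied_target_variance(97.6, 0.472)

print(f"baseline implied Var(y): {baseline_var:.1f}")  # 156.0
print(f"imputed implied Var(y): {imputed_var:.1f}")    # 184.8
```

If these assumptions hold, the target would be more spread out in the imputed dataset than in the baseline, which would allow a higher MSE and a higher R-squared at the same time.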
I am trying to see if the imputed dataset is a better choice than the baseline, but I am now confused as to which metric to choose.
Both datasets have the same 16 features and the same target. The features are a mix of categorical and continuous variables; the target is continuous. The baseline has 814 observations and the imputed set has 879. Neither dataset has been scaled or normalized.
Please could you advise on why there is no clear "winner", and which metric I should choose?
Thank you very much.