You assume that the train and test sets come from the same distribution, so you use the mean computed on the training set. The training set is also usually larger, which makes its mean a better estimate of the distribution's mean. Training is mostly done offline on a lot of data, and new test examples can simply reuse the estimated mean.
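As a rough illustration of this idea, here is a minimal sketch using NumPy. The data is synthetic, and the `standardize` helper and the variable names are made up for the example; the only point is that the mean (and standard deviation) are estimated once on the training data and then reused for any test input, however small.

```python
import numpy as np

# Synthetic data, assumed for illustration: train and test share one distribution.
rng = np.random.default_rng(0)
train = rng.normal(loc=150.0, scale=20.0, size=10_000)
test = rng.normal(loc=150.0, scale=20.0, size=5)

# Statistics are estimated once, offline, on the training data only.
train_mean = train.mean()
train_std = train.std()

def standardize(x, mean, std):
    """Center and scale x with statistics supplied by the caller."""
    return (x - mean) / std

train_scaled = standardize(train, train_mean, train_std)

# Test data reuses the training statistics, whether it is a batch or one example.
test_scaled = standardize(test, train_mean, train_std)
single_scaled = standardize(np.array([160.0]), train_mean, train_std)
```

The same `train_mean` and `train_std` work for a batch of five examples or a single one, which would not be true of statistics computed from the test inputs themselves.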
If you use the mean of your test set instead, you tie the performance of your procedure to the mean of whatever new data you happen to be evaluating on. You could be testing on 1 example at a time, or 10, or hundreds, and the test mean will only be close to the training mean if the test set is large. In the extreme case of a single example, the mean is the example itself, so centering would map every input to zero. Also, suppose the incomes in your training set lie in the range 100-200 and the test set is built by an adversary who asks for predictions on incomes of 1000-1200, which make no sense for your current model. If you use the mean of the test set, the valid test points will be affected by these outliers. Because the training set is large, the effect of a few outliers on its mean tends to get averaged out.
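The income example can be sketched numerically. The specific values below are invented for illustration (an evenly spaced training set on 100-200, three valid test points, and two adversarial ones), so treat this as a toy demonstration rather than a general result.

```python
import numpy as np

# Hypothetical training incomes, evenly spread over 100-200 (mean is 150).
train = np.linspace(100.0, 200.0, 1001)

# A small test set: three valid points plus two adversarial outliers.
valid_test = np.array([120.0, 150.0, 180.0])
adversarial = np.array([1000.0, 1200.0])
test = np.concatenate([valid_test, adversarial])

train_mean = train.mean()
test_mean = test.mean()  # dragged far upward by the two outliers

# Centering the valid points with each mean.
centered_with_train = valid_test - train_mean  # stays near zero
centered_with_test = valid_test - test_mean    # shifted by hundreds

# The same two outliers barely move the mean of the large training set.
contaminated_train_mean = np.concatenate([train, adversarial]).mean()

print("train mean:", train_mean)
print("test mean:", test_mean)
print("valid points centered with train mean:", centered_with_train)
print("valid points centered with test mean:", centered_with_test)
print("train mean with outliers added:", contaminated_train_mean)
```

With the test mean, the valid points end up far from where the model saw similar inputs during training, even though nothing about those points changed. Adding the same two outliers to the 1001 training points shifts the training mean by less than 2.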