It's perfectly fine to have different means for different groups. What's not fine is to calculate those means with your test set included. As you say, you are trying to test a hypothesis on your data. That test should be done on a held-out test set, which should not be taken into account when you estimate your means.
The mean of a variable is knowledge about the data, and depending on the application it can be very important knowledge, e.g., is a newly observed data point above or below the average? This matters especially when there are outliers in your data, since they can move the average substantially. Therefore, independently of your hypothesis, you should set some part of your data aside as a test set, develop your hypothesis (including the calculation of the variable's mean) on the training set only, and then check whether it holds on the test set. If you calculate the mean on the whole dataset, you implicitly gain knowledge about the part of the data you shouldn't see while developing your hypothesis.
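To make this concrete, here is a minimal sketch of the split-before-estimating workflow. The data, split sizes, and outlier values are all made up for illustration, and only numpy is assumed:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: mostly well-behaved values plus a few large outliers.
x = np.concatenate([rng.normal(10.0, 2.0, size=95), [60.0, 70.0, 80.0]])
rng.shuffle(x)

# Set part of the data aside BEFORE estimating anything.
train, test = x[:70], x[70:]

train_mean = train.mean()          # estimated on the training set only
test_centered = test - train_mean  # test set is centered with the TRAINING mean

# For contrast: a mean computed on all the data uses information from the
# test set -- the leakage described above.
leaky_mean = x.mean()
print(f"training-set mean:      {train_mean:.2f}")
print(f"full-data (leaky) mean: {leaky_mean:.2f}")
```

Whichever outliers happen to land in the test split pull the full-data mean away from the training-set mean, which is exactly the information you were not supposed to have yet.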
Another point to make is that centering your variables does not necessarily remove the multicollinearity effect, as nicely worded here:
> Mean-center the predictor variables. Generating polynomial terms (i.e., for $x_{1}$, $x_{1}^{2}$, $x_{1}^{3}$, etc.) or interaction terms (i.e., $x_{1}\times x_{2}$, etc.) can cause some multicollinearity if the variable in question has a limited range (e.g., [2,4]). Mean-centering will eliminate this special kind of multicollinearity. However, in general, this has no effect. It can be useful in overcoming problems arising from rounding and other computational steps if a carefully designed computer program is not used.
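The quoted special case is easy to verify numerically. A minimal sketch, assuming a hypothetical predictor drawn uniformly from the range [2, 4] mentioned in the quote:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictor with a limited range, uniform on [2, 4].
x = rng.uniform(2.0, 4.0, size=1000)

# Raw polynomial term: x and x^2 are almost perfectly correlated here.
corr_raw = np.corrcoef(x, x**2)[0, 1]

# Mean-center first, then square: the correlation largely disappears.
xc = x - x.mean()
corr_centered = np.corrcoef(xc, xc**2)[0, 1]

print(f"corr(x, x^2)   before centering: {corr_raw:.3f}")      # close to 1
print(f"corr(xc, xc^2) after centering:  {corr_centered:.3f}") # close to 0
```

Because the centered variable is roughly symmetric around zero, the correlation between $x_c$ and $x_c^2$ collapses; centering does nothing comparable for multicollinearity between two genuinely related predictors.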