In neuroscience, we often classify data pooled from several acquisition sites. I usually balance the data across sites: if, for instance, I have to classify subjects with some illness versus healthy controls, each recording site contributes equal numbers of healthy and ill samples (subjects) to the final data set of the two-class classification problem.
Even so, can differences between acquisition sites still bias the classification performance although the classes are balanced within each site? If so, why?
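To make the setup concrete, here is a minimal sketch of the kind of balancing I mean. It assumes each subject is a `(subject_id, site, label)` tuple and that balancing is done by random subsampling; the function name `balance_within_sites`, the tuple layout, and the site and label names are all hypothetical and only for illustration.

```python
import random
from collections import Counter, defaultdict


def balance_within_sites(samples, seed=0):
    """Subsample so that every site contributes equally many subjects per class.

    samples: list of (subject_id, site, label) tuples (hypothetical layout).
    """
    rng = random.Random(seed)

    # Group subjects by (site, label).
    groups = defaultdict(list)
    for subject_id, site, label in samples:
        groups[(site, label)].append((subject_id, site, label))

    sites = sorted({site for site, _ in groups})
    labels = sorted({label for _, label in groups})

    balanced = []
    for site in sites:
        # The smallest class at this site sets the per-class count there.
        n = min(len(groups[(site, label)]) for label in labels)
        for label in labels:
            balanced.extend(rng.sample(groups[(site, label)], n))
    return balanced


# Toy data: site_A has 5 ill / 3 healthy, site_B has 2 ill / 6 healthy.
samples = (
    [(f"A{i}", "site_A", "ill") for i in range(5)]
    + [(f"A{i}", "site_A", "healthy") for i in range(5, 8)]
    + [(f"B{i}", "site_B", "ill") for i in range(2)]
    + [(f"B{i}", "site_B", "healthy") for i in range(2, 8)]
)

balanced = balance_within_sites(samples)
counts = Counter((site, label) for _, site, label in balanced)
print(counts)
```

In this sketch the classes end up balanced within each site (3 vs. 3 at `site_A`, 2 vs. 2 at `site_B`), but the sites themselves may still contribute different total numbers of subjects.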