The question (and, largely, the topic itself) seemed interesting enough to warrant some brief research. Please note that most of the information below is tied to bioinformatics, a domain in which I'm not fluent, although I'm sure it is generally applicable. It appears that there are various approaches and methods for detecting and eliminating batch effects and other unwanted effects or noise. Including some bioinformatics-specific ones, they are (Chen, Grennan, Badner, Zhang, Gershon et al., 2011):
- Distance-weighted discrimination (DWD), based on support vector machines (SVM);
- Mean-centering (PAMR), based on one-way ANOVA;
- Surrogate variable analysis (SVA), based on a combination of singular value decomposition (SVD) and a linear model analysis;
- Geometric ratio-based method (Ratio_G);
- Combating Batch Effects When Combining Batches of Gene Expression Microarray Data (ComBat), based on an empirical Bayes method;
- Singular value decomposition (SVD);
- Standardization (location/scale adjustment model);
- Ratio-based method with arithmetic mean (Ratio_A).
Note that the last three methods in the list above are excluded from the study by Chen et al. (2011); however, I include them here for the sake of completeness. Approaches to detecting and removing systematic variation, as well as several other bioinformatics-focused methods, are also discussed by Li, Łabaj, Zumbo, Sykacek, Shi, Shi et al. (2014).
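To make the two simplest ideas from the list more concrete (mean-centering and location/scale standardization), here is a minimal Python sketch. It is my own illustration under simplifying assumptions, not the implementation used by PAMR or by any of the cited papers: the function name `center_by_batch`, its parameters and the toy data are all hypothetical, and it handles a single feature (e.g. one gene) at a time.

```python
from collections import defaultdict
from statistics import mean, pstdev


def center_by_batch(values, batches, scale=False):
    """Adjust one feature's measurements batch by batch (illustrative helper).

    values  -- measurements of a single feature, one per sample
    batches -- batch label for each sample, same length as values
    scale   -- if True, also divide by the batch standard deviation
    """
    if len(values) != len(batches):
        raise ValueError("values and batches must have the same length")

    # Group sample indices by batch label.
    groups = defaultdict(list)
    for i, label in enumerate(batches):
        groups[label].append(i)

    adjusted = [0.0] * len(values)
    for idx in groups.values():
        batch_vals = [values[i] for i in idx]
        mu = mean(batch_vals)
        sd = pstdev(batch_vals) if scale else 1.0
        if sd == 0:
            sd = 1.0  # constant batch: leave the centered values unscaled
        for i in idx:
            adjusted[i] = (values[i] - mu) / sd
    return adjusted


# Toy data (made up): batch "B" is shifted upward by 10 units.
expr = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batch = ["A", "A", "A", "B", "B", "B"]

centered = center_by_batch(expr, batch)                  # location only
standardized = center_by_batch(expr, batch, scale=True)  # location and scale
```

After centering, both batches share a mean of zero, so the artificial shift between them disappears. The obvious caveat is that this also removes any real difference that happens to coincide with batch membership, which is one reason the model-based methods above exist.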
In regard to software that could be used for the task, various packages are available, many within the Bioconductor R project ecosystem. One of the most popular R packages seems to be Surrogate Variable Analysis (sva). Its vignette contains a detailed description of its functionality, with examples. It also briefly covers other complementary functions, such as the above-mentioned ComBat and svaseq. The latter is described in more detail in this paper (too specific, hence no citation).
References
Chen, C., Grennan, K., Badner, J., Zhang, D., Gershon, E., et al. (2011). Removing batch effects in analysis of expression microarray data: An evaluation of six batch adjustment methods. PLoS ONE, 6(2): e17238. doi:10.1371/journal.pone.0017238 Retrieved from http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0017238
Li, S., Łabaj, P. P., Zumbo, P., Sykacek, P., Shi, W., Shi, L., ..., & Mason, C. E. (2014). Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnology, 32, 888–895. doi:10.1038/nbt.3000 Retrieved from http://www.nature.com/nbt/journal/v32/n9/full/nbt.3000.html