Correcting for batch effects in case-control microbiome studies

Gibbons, Sean M.; Duvallet, Claire; Alm, Eric J.

Terms of use

Creative Commons Attribution 4.0 International License http://creativecommons.org/licenses/by/4.0/

Abstract

High-throughput data generation platforms, such as mass spectrometry, microarrays, and second-generation sequencing, are susceptible to batch effects due to run-to-run variation in reagents, equipment, protocols, or personnel. Currently, batch-correction methods are not commonly applied to microbiome sequencing datasets. In this paper, we compare different batch-correction methods applied to microbiome case-control studies. We introduce a model-free normalization procedure in which features (i.e., bacterial taxa) in case samples are converted to percentiles of the equivalent features in control samples within a study, prior to pooling data across studies. We examine how this percentile-normalization method compares to traditional meta-analysis methods for combining independent p-values and to limma and ComBat, widely used batch-correction models developed for RNA microarray data. Overall, we show that percentile-normalization is a simple, non-parametric approach for correcting batch effects and improving sensitivity in case-control meta-analyses.
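The percentile-normalization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the samples-by-features array layout, and the tie-handling rule (averaging the "strictly below" and "at or below" ranks) are assumptions made for illustration, and the toy abundances are invented.

```python
import numpy as np

def percentile_of_scores(reference, values):
    """Percentile (0-100) of each value within `reference`.

    Ties are handled by averaging the strict (<) and weak (<=) ranks.
    That tie rule is an assumption, not taken from the abstract.
    """
    reference = np.sort(np.asarray(reference, dtype=float))
    values = np.asarray(values, dtype=float)
    below = np.searchsorted(reference, values, side="left")         # count of reference < value
    at_or_below = np.searchsorted(reference, values, side="right")  # count of reference <= value
    return 100.0 * (below + at_or_below) / (2.0 * reference.size)

def percentile_normalize(cases, controls):
    """Convert each feature in `cases` to a percentile of the same
    feature in `controls`. Both arrays are samples x features and
    must come from the same study (hypothetical layout)."""
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    out = np.empty_like(cases)
    for j in range(cases.shape[1]):
        out[:, j] = percentile_of_scores(controls[:, j], cases[:, j])
    return out

# Toy data: study B measures the same feature on a 10x larger scale,
# standing in for a batch effect.
controls_a = np.array([[1.0], [2.0], [3.0], [4.0]])
cases_a = np.array([[5.0], [2.5]])
controls_b = controls_a * 10.0
cases_b = cases_a * 10.0

norm_a = percentile_normalize(cases_a, controls_a)
norm_b = percentile_normalize(cases_b, controls_b)

# Normalize within each study first, then pool across studies.
pooled = np.vstack([norm_a, norm_b])
```

Because each case value is expressed relative to its own study's controls, the two toy studies give identical percentiles despite the tenfold difference in scale, so the pooled values are directly comparable.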

Date issued

2018-04

URI

http://hdl.handle.net/1721.1/117510

Department

Massachusetts Institute of Technology. Department of Biological Engineering

Journal

PLOS Computational Biology

Publisher

Public Library of Science (PLoS)

Citation

Gibbons, Sean M., et al. “Correcting for Batch Effects in Case-Control Microbiome Studies.” PLOS Computational Biology, edited by Morgan Langille, vol. 14, no. 4, Apr. 2018, p. e1006102. © 2018 Gibbons et al.

Version: Final published version

ISSN

1553-7358

Collections

MIT Open Access Articles
