Microbiome (Oct 2019)
Microbiota analysis optimization for human bronchoalveolar lavage fluid
Abstract
Background It is now possible to comprehensively characterize the microbiota of the lungs using culture-independent, sequencing-based assays. Several sample types have been used to investigate the lung microbiota, each presenting specific challenges for preparation and analysis of microbial communities. Bronchoalveolar lavage fluid (BALF) enables the identification of microbiota specific to the lower lung but commonly has low bacterial density, increasing the risk of false-positive signal from contaminating DNA. The objectives of this study were to investigate the extent of contamination across a range of sample densities representative of BALF and to identify features of contaminants that facilitate their removal from sequence data and aid in the interpretation of BALF sample 16S sequencing data.
Results Using three mock communities at densities ranging from 8E+02 to 8E+09 16S copies/ml, we assessed taxonomic accuracy and precision by 16S rRNA gene sequencing, as well as the proportion of reads arising from contaminants. Sequencing accuracy, precision, and the relative abundance of mock community members all decreased as sample input density decreased, with a significant drop-off below 8E+05 16S copies/ml. Contaminant OTUs were commonly either inversely correlated with sample input density or not reproduced between technical replicates. Removing taxa with these features, or physically concentrating samples prior to sequencing, improved both sequencing accuracy and precision for samples between 8E+04 and 8E+06 16S copies/ml. For the lowest densities, below 8E+03 16S copies/ml BALF, accuracy and precision could not be significantly improved using these approaches. In clinical BALF samples spanning a large density range, OTUs with the contaminant features identified in mock communities were also evident in low-density samples.
Conclusion Relative abundance data and community composition generated by 16S sequencing of BALF samples, across the range of densities commonly observed in this sample type, should be interpreted in the context of input sample density and may be improved by simple pre- and post-sequencing steps for densities above 8E+04 16S copies/ml.
Keywords