AUGMENTING SEQUENCING TECHNOLOGY FOR BETTER INFERENCE IN SOIL MICROBIOME ANALYSIS
The advent of DNA sequencing revolutionized the field of microbiome research. Many organisms, by virtue of their codependence and/or growth rate, are either impossible or extremely challenging to get into pure culture. Sequencing allows important taxonomic and phylogenetic information to be obtained independent of culturing. Advances in the sequencing technology itself have enabled high-throughput workflows that permit low-cost, extensive sampling of microbiomes across environments. The co-development of reference datasets for taxonomic and functional assignment, along with open-source bioinformatics pipelines, has further empowered scientists to explore microbiomes in many environments. However, there are limitations to sequence data that have constrained ecological inference in microbiome research. One such limitation, the compositional nature of sequence data, has impeded our ability to make accurate inferences about the environmental drivers of taxon abundance and covariance across conditions. In this dissertation I explore the use of quantitative PCR in combination with sequencing techniques to generate “Quantitative Sequencing” (QSeq) data that mitigate the limitations of compositionality on inferences relating to taxon abundance and covariance across environmental gradients. In Chapter 1, I reviewed key characteristics of the soil environment and of sequencing as a mechanism for sampling it. In Chapter 2, I leveraged modeling, synthesis, and literature review methods to establish the questions and data characteristics that demand QSeq methodology. I show that even small amounts of variation in total abundance make determining the effects of environment (biotic and abiotic factors) on any given taxon unreliable without QSeq. In Chapter 3, I extend the logic of quantitative sequencing to improve metagenome prediction from PICRUSt2.
Using data synthesis methods, I found that accounting for 16S rRNA gene abundance consistently improved the accuracy of predicted functional gene abundances. This was confirmed by high correlations between predicted gene abundance and gene abundance measured by qPCR. There was, however, large variation in prediction accuracy, likely due in part to database biases and in part to decoupling of bacterial function from taxonomy. In Chapter 4, I applied QSeq in the context of a long-term experimental farming system that has large gradients in total abundance with depth, and I used QSeq to identify taxa that changed in abundance in response to farming system management and soil depth. Finally, in Chapter 5, I used QSeq to identify putative N-fixing taxa that responded to glyphosate in four experimental farming systems. I show that the abundance of these taxa was decoupled from other effects of glyphosate on N-fixation in soybean across farming systems.
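The compositionality problem and the QSeq idea described in this abstract can be illustrated with a minimal sketch. It assumes that a QSeq estimate is simply a taxon's relative abundance from sequencing multiplied by a qPCR estimate of total 16S rRNA gene copies in the same sample; the dissertation's actual procedure may differ. The function names, taxon labels, read counts, and qPCR totals below are all hypothetical.

```python
# Hypothetical illustration only: the counts and qPCR totals are made-up numbers,
# and the scaling rule is an assumed, simplified form of quantitative sequencing.

def relative_abundance(counts):
    """Convert raw read counts to proportions that sum to 1 (compositional data)."""
    total_reads = sum(counts.values())
    return {taxon: n / total_reads for taxon, n in counts.items()}

def qseq(counts, total_16s_copies):
    """Scale relative abundances by a qPCR estimate of total 16S rRNA gene copies."""
    proportions = relative_abundance(counts)
    return {taxon: p * total_16s_copies for taxon, p in proportions.items()}

# Two samples with equal sequencing depth but different total abundance.
sample_1_counts = {"taxon_A": 500, "taxon_B": 500}
sample_2_counts = {"taxon_A": 200, "taxon_B": 800}
sample_1_total = 1.0e9   # assumed qPCR result, gene copies per gram of soil
sample_2_total = 2.5e9   # assumed qPCR result, gene copies per gram of soil

rel_1 = relative_abundance(sample_1_counts)
rel_2 = relative_abundance(sample_2_counts)
abs_1 = qseq(sample_1_counts, sample_1_total)
abs_2 = qseq(sample_2_counts, sample_2_total)

for taxon in sample_1_counts:
    print(
        f"{taxon}: relative {rel_1[taxon]:.2f} -> {rel_2[taxon]:.2f}, "
        f"QSeq {abs_1[taxon]:.2e} -> {abs_2[taxon]:.2e}"
    )
```

In this toy case the relative abundance of taxon_A falls from 0.5 to 0.2 even though its scaled abundance is the same in both samples; the apparent decline arises only because taxon_B and the total community increased. This is the kind of misleading inference that variation in total abundance can produce when only compositional data are available.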