Improving and validating computational algorithms for the assembly, clustering, and taxonomic classification of microbial communities

Files

Publication or External Link

Date

2024

Authors

Advisor

Citation

Abstract

Recent high-throughput sequencing technologies have advanced the study of microbial communities; nonetheless, analyzing the resulting large datasets still poses challenges. This dissertation focuses on developing and validating computational algorithms that address these challenges in the assembly, clustering, and taxonomic classification of microbial communities.
We first introduce a novel reference-guided metagenomic assembly approach whose assemblies are generally of higher quality than those produced by de novo assembly, without a significant increase in runtime. Next, we propose SCRAPT, an iterative sampling-based algorithm designed to efficiently cluster 16S rRNA gene sequences from large datasets. In addition, we validate a comprehensive set of genome assembly pipelines using Oxford Nanopore sequencing, achieving near-perfect accuracy by combining long-read and short-read polishing tools.

Our research improves the accuracy and efficiency of analyzing complex microbial communities. This dissertation offers insights into the composition and structure of these communities, with potential implications for human, animal, and plant health.

Notes

Rights