UMD Theses and Dissertations

Permanent URI for this collectionhttp://hdl.handle.net/1903/3

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4 month delay in the appearance of a given thesis/dissertation in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Browse

Search Results

Now showing 1 - 2 of 2
  • Thumbnail Image
    Item
    Unraveling Metarhizium interactions with insects, plants and microbes
    (2019) Lovett, Brian; St. Leger, Raymond J; Entomology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Metarhizium fungi have dual lifestyles as insect pathogens and as rhizospheric plant symbionts. Since these fungi can vary widely in their virulence and host-specificity, they provide a powerful model for host-pathogen interactions. Today, it is clear the greatest potential of these fungi lies in their application as transgenic biotechnologies. Biotechnologies are rightly subjected to increased scrutiny, and this dissertation seeks to assess the risks and benefits of applying transgenic Metarhizium fungi using bioinformatics. After sequencing the early-diverged generalist Metarhizium frigidum, comparative genomics has upended our understanding of the trajectory of Metarhizium evolution. Using a functional gene microarray, I assessed the impacts these fungi have on the soil microbial community, establishing a protocol for evaluating possible risks of applying transgenic entomopathogenic fungi. To inform the evaluation and development of next-generation transgenic Metarhizium strains in the future, I evaluated the specific mosquito immune response to Metarhizium pingshaense during early infection with and without Plasmodium falciparum (the human malaria parasite) using transcriptomics. A strain of this fungus engineered to express a potent arthropod-derived, insect-specific neurotoxin in mosquito hemolymph, was also evaluated for mosquito control efficacy in semi-field trials in West Africa. Together, this body of work offers a comprehensive view of the evolution of this fungal genus and how transgenic fungi interact with insects, plants and microbes. The results herein comprise a framework for evaluating the risks and efficacy of transgenic fungi.
  • Thumbnail Image
    Item
    Computational Metagenomics: Network, Classification and Assembly
    (2012) Liu, Bo; Pop, Mihai; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Due to the rapid advance of DNA sequencing technologies in recent 10 years, large amounts of short DNA reads can be obtained quickly and cheaply. For example, a single Illumina HiSeq machine can produce several terabytes of data sets within a week. Metagenomics is a new scientific field that involves the analysis of genomic DNA sequences obtained directly from the environment, enabling studies of novel microbial systems. Metagenomics was made possible from high-throughput sequencing technologies. The analysis of the resulting data requires sophisticated computational analyses and data mining. In clinical settings, a fundamental goal of metagenomics is to help people diagnose and cure disease in clinical settings. One major bottleneck so far is how to analyze the huge noisy data sets quickly and precisely. My PhD research focuses on developing algorithms and tools to tackle these challenging and interesting computational problems. From the functional perspective, a metagenomic sample can be represented as a weighted metabolic network, in which the nodes are molecules, edges are enzymes encoded by genes, and the weights can be considered as the number of organisms providing the functions. One goal of functional comparison between metagenomic samples is to find differentially abundant metabolic subnetworks between two groups under comparison. We have developed a statistical network analysis tool - MetaPath, which uses a greedy search algorithm to find maximum weight subnetwork and a nonparametric permutation test to measure the statistical significance. Unlike previous approaches, MetaPath explicitly searches for significant subnetwork in the global network, enabling us to detect signatures at a finer level. In addition, we developed statistical methods that take into account the topology of the network when testing the significance of the subnetworks. Another computational problem involves classifying anonymous DNA sequences obtained from metagenomic samples. There are several challenges here: (1) The classification labels follow a hierarchical tree structure, in which the leaves are most specific, and the internal nodes are more general. How can we classify novel sequences that do not belong to leaf categories (species) but belong to internal groups (e.g., phylum)? (2) For each classification how can we compute a confidence score, such that the users have a tradeoff between sensitivity and specificity? (3) How can we analyze billions of data items quickly? We have developed a novel hierarchical classifier (MetaPhyler) for the classification of anonymous DNA reads. Through simulation, MetaPhyler models the distribution of pairwise similarities within different hierarchical groups with nonparametric density estimation. The confidence score is computed by the ratio of likelihood function. For a query DNA sequence with arbitrary length, its similarity can be calculated through linear approximation. Through benchmark comparison, we have shown that MetaPhyler is significantly faster and more accurate than previous tools. DNA sequencing machines can only produce very short strings (e.g., 100bp) relative to the size of a genome (e.g., a typical bacterial genome is 5Mbp). One of the most challenging computational tasks is the assembly of millions of short reads into longer contigs, which are used as the basis of subsequent computational analyses. In this project, we have developed a comparative metagenomic assembler (MetaCompass), which utilizes the genomes that have already been sequenced previously, and produces long contigs through read mapping (alignment) and assembly. Given the availability of thousands of existing bacteria genomes, for a particular sample, MetaCompass first chooses a best subset as reference based on the taxonomic composition. Then, the reads are aligned against these genomes using MUMmer-map or Bowtie2. Afterwards, we use a greedy algorithm of the minimum set-covering problem to build long contigs, and the consensus sequences are computed by the majority rule. We also propose an iterative approach to improve the performance. Finally, MetaCompass has been successfully evaluated and tested on over 20 terabytes of metagenomic data sets generated from the Human Microbiome Project. In addition, to facilitate the identification and characterization of antibiotic resistance genes, we have created Antibiotic Resistance Genes Database (ARDB), which provides a centralized compendium of information on antibiotic resistance. Furthermore, we have applied our tools to the analysis of a novel oral microbiome data set, and have discovered interesting functional mechanisms and ecological changes underlying the transition from health to periodontal disease of human mouth at a system level.