UMD Theses and Dissertations

Permanent URI for this collection: http://hdl.handle.net/1903/3

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. As a result, a given thesis/dissertation may take up to four months to appear in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Search Results

Now showing 1 - 5 of 5
  • Improving and validating computational algorithms for the assembly, clustering, and taxonomic classification of microbial communities
    (2024) Luan, Tu; Pop, Mihai; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Recent high-throughput sequencing technologies have advanced the study of microbial communities; nonetheless, analyzing the resulting large datasets still poses challenges. This dissertation focuses on developing and validating computational algorithms to address these challenges in the assembly, clustering, and taxonomic classification of microbial communities. We first introduce a novel reference-guided metagenomic assembly approach that produces assemblies that generally surpass de novo assemblies in quality without a significant increase in runtime. Next, we propose SCRAPT, an iterative sampling-based algorithm designed to cluster 16S rRNA gene sequences from large datasets efficiently. In addition, we validate a comprehensive set of genome assembly pipelines using Oxford Nanopore sequencing, achieving near-perfect accuracy through the combination of long- and short-read polishing tools. Our research improves the accuracy and efficiency of analyzing complex microbial communities. This dissertation offers insights into the composition and structures of these communities, with potential implications for human, animal, and plant health.
  • Computational approaches for improving the accuracy and efficiency of RNA-seq analysis
    (2020) Sarkar, Hirak; Patro, Robert; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The past decade has seen tremendous growth in the area of high-throughput sequencing technology, which simultaneously improved the biological resolution and subsequent processing of publicly available sequencing datasets. This enormous amount of data also calls for better algorithms to process, extract, and filter useful knowledge from the data. In this thesis, I concentrate on the challenges and solutions related to the processing of bulk RNA-seq data. An RNA-seq dataset consists of raw nucleotide sequences, drawn from the expressed mixture of transcripts in one or more samples. One of the most common uses of RNA-seq is obtaining transcript- or gene-level abundance information from the raw nucleotide read sequences and then using these abundances for downstream analyses such as differential expression. A typical computational pipeline for such processing broadly involves two steps: assigning reads to the reference sequence through alignment or mapping algorithms, and subsequently quantifying such assignments to obtain the expression of the reference transcripts or genes. In practice, this two-step process poses a multitude of challenges, ranging from the presence of noise and experimental artifacts in the raw sequences to the disambiguation of multi-mapped read sequences. In this thesis, I describe these problems and demonstrate efficient state-of-the-art solutions to a number of them. This thesis explores multiple uses for an alternate representation of an RNA-seq experiment encoded in equivalence classes and their associated counts. In this representation, instead of treating a read fragment individually, multiple fragments are simultaneously assigned to a set of transcripts depending on the underlying characteristics of the read-to-transcript mapping. I used the equivalence classes for a number of applications in both single-cell and bulk RNA-seq technologies.
    By employing equivalence classes at cellular resolution, I have developed a droplet-based single-cell RNA-seq sequence simulator capable of generating tagged-end short-read sequences resembling the properties of real datasets. In bulk RNA-seq, I have applied equivalence classes to tasks ranging from data-driven compression methodologies to clustering de novo transcriptome assemblies. Specifically, I introduce a new data-driven approach for grouping together transcripts in an experiment based on their inferential uncertainty. Transcripts that share large numbers of ambiguously mapping fragments with other transcripts, in complex patterns, often cannot have their abundances confidently estimated. Yet the total transcriptional output of that group of transcripts will have greatly reduced inferential uncertainty, thus allowing more robust and confident downstream analysis. This approach, implemented in the tool terminus, groups together transcripts in a data-driven manner. It leverages the equivalence class factorization to quickly identify transcripts that share reads, and posterior samples to measure the confidence of the point estimates. As a result, terminus allows transcript-level analysis where it can be confidently supported, and derives transcriptional groups where the inferential uncertainty is too high to support a transcript-level result.
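    As an illustration of the equivalence-class representation described in this abstract, here is a minimal Python sketch; it is not the thesis's implementation. It assumes a hypothetical dictionary `read_mappings` from read identifiers to the transcripts each read maps to. Fragments that map to the same set of transcripts are collapsed into one class with an associated count. All function and variable names are illustrative.

```python
from collections import Counter


def build_equivalence_classes(read_mappings):
    """Collapse reads into equivalence classes keyed by their transcript set.

    read_mappings: dict of read id -> iterable of transcript ids (hypothetical input).
    Returns a Counter mapping frozenset(transcript ids) -> number of reads.
    """
    classes = Counter()
    for transcripts in read_mappings.values():
        label = frozenset(transcripts)
        if label:  # skip unmapped reads
            classes[label] += 1
    return classes


# Toy input: r1, r2, and r4 map ambiguously to the same two transcripts.
mappings = {
    "r1": ["t1", "t2"],
    "r2": ["t2", "t1"],
    "r3": ["t3"],
    "r4": ["t1", "t2"],
    "r5": [],  # unmapped
}
classes = build_equivalence_classes(mappings)
for label, count in classes.items():
    print(sorted(label), count)
```

    In this toy input, five reads reduce to two classes, which is the kind of compaction that makes it cheap to find transcripts sharing ambiguous fragments.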
  • Novel methods for comparing and evaluating single and metagenomic assemblies
    (2015) Hill, Christopher Michael; Pop, Mihai; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The current revolution in genomics has been made possible by software tools called genome assemblers, which stitch together DNA fragments “read” by sequencing machines into complete or nearly complete genome sequences. Despite decades of research in this field and the development of dozens of genome assemblers, assessing and comparing the quality of assembled genome sequences still heavily relies on the availability of independently determined standards, such as manually curated genome sequences, or independently produced mapping data. The focus of this work is to develop reference-free computational methods to accurately compare and evaluate genome assemblies. We introduce a reference-free likelihood-based measure of assembly quality which allows for an objective comparison of multiple assemblies generated from the same set of reads. We define the quality of a sequence produced by an assembler as the conditional probability of observing the sequenced reads from the assembled sequence. A key property of our metric is that the true genome sequence maximizes the score, unlike other commonly used metrics. Despite the unresolved challenges of single genome assembly, the decreasing costs of sequencing technology have led to a sharp increase in metagenomics projects over the past decade. These projects allow us to better understand the diversity and function of microbial communities found in the environment, including the ocean, Arctic regions, other living organisms, and the human body. We extend our likelihood-based framework and show that we can accurately compare assemblies of these complex bacterial communities. After an assembly has been produced, it is not easy to determine what parts of the underlying genome are missing, what parts are mistakes, and what parts are due to experimental artifacts from the sequencing machine.
    Here we introduce VALET, the first reference-free pipeline that flags regions in metagenomic assemblies that are statistically inconsistent with the data generation process. VALET detects mis-assemblies in publicly available datasets and highlights the current shortcomings in available metagenomic assemblers. By providing the computational methods for researchers to accurately evaluate their assemblies, we decrease the chance of incorrect biological conclusions and misguided future studies.
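    The likelihood-based idea in this abstract, scoring an assembly by the probability of observing the reads given the assembled sequence, can be sketched with a deliberately simplified toy model. The abstract does not specify the thesis's model, so everything below is an assumption: reads are error-free, start positions are uniform along the assembly, reads are independent, and a read that matches nowhere receives a small floor probability `epsilon`. All names are hypothetical.

```python
import math


def count_occurrences(read, assembly):
    """Count (possibly overlapping) exact occurrences of read in assembly."""
    count = 0
    start = assembly.find(read)
    while start != -1:
        count += 1
        start = assembly.find(read, start + 1)
    return count


def log_likelihood(reads, assembly, epsilon=1e-6):
    """Toy log P(reads | assembly) under the simplifying assumptions above."""
    total = 0.0
    for read in reads:
        positions = len(assembly) - len(read) + 1  # possible start sites
        hits = count_occurrences(read, assembly) if positions > 0 else 0
        p = hits / positions if hits > 0 else epsilon
        total += math.log(p)
    return total


# Toy data: every 4-mer of a made-up "true" genome serves as a read.
genome = "ACGTTGCAAGCT"
reads = [genome[i:i + 4] for i in range(len(genome) - 3)]
misassembly = "ACGTAAGCTTGC"  # hypothetical rearrangement of the same bases

score_true = log_likelihood(reads, genome)
score_bad = log_likelihood(reads, misassembly)
print(score_true > score_bad)
```

    Under these assumptions the made-up true sequence explains every read, while the rearranged sequence leaves several reads unexplained and scores lower. No reference genome is consulted at any point, which is the sense in which such a score is reference-free.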
  • Design and Analysis of an Automated Assembly Process for Manufacturing Paint Brush Knots
    (2011) Gorbashev, Aleksandr Borisovich; Thamire, Chandrasekhar; Mechanical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Title of Document: DESIGN AND ANALYSIS OF AN AUTOMATED ASSEMBLY PROCESS FOR MANUFACTURING PAINT BRUSH KNOTS. Aleksandr Borisovich Gorbashev, M.S., 2011, Mechanical Engineering. Directed By: Chandrasekhar Thamire, Department of Mechanical Engineering.
    The manufacturing process for paint brushes requires handling and assembling flexible and delicate filaments, which can be cumbersome when done manually. Common issues resulting from manual assembly include variations in filament density, deviations in filament straightness, and issues due to right- and left-handed bias in assembly operations, resulting in poor quality of the end products. Coupled with operator fatigue and health problems, these issues provide strong motivation for refining the process. The primary objectives of this study were to develop an assembly system that will 1) increase product quality, and 2) improve the production rate. The secondary objective was to develop a set of design guidelines for handling flexible elements such as synthetic filaments within provided housings. In order to develop the automated assembly process, needs analysis and product design specification exercises were performed first, followed by functional decomposition of the process at the first level. Designs for individual subsystems were developed next using functional decomposition at lower levels, concept generation, concept evaluation, feasibility testing, testing for design parameters, design through solid modeling, strength analysis, concept testing using physical prototypes, and subsystem refinement. In order to assess the response of filament assemblies when subjected to external loading and moving relative to the housings, experiments were designed and conducted. For a range of factors, tests were conducted to establish limits of the pulling force required to displace filament bundles within the housings.
    Correlations relating filament motion to applied loading were developed for a variety of housing geometries and material types. Design guidelines related to the motion of filaments within housings were developed. In light of the testing performed, design guidelines for the development of gripper plates used for gripping bulk filament bundles were also established. It is expected that these guidelines will be useful in the manufacturing automation industry, including the manufacture of toothbrushes, hair brushes, and fiber-optic elements. Upon successful completion of the feasibility tests, full-scale prototypes using the final concepts of subsystems were fabricated. Tests were conducted to determine the reliability of the process and the quality of the brush knots. Results indicate that the quality of the brushes was much higher than that of traditional hand-made brushes and that productivity would nearly double. Upon delivery of the system to the company sponsoring this research, it is expected that the system would be able to produce up to 3 million brushes per year.
  • Gallium Nitride Nanowire Based Electronic and Optical Devices
    (2007-07-26) Motayed, Abhishek; Melngailis, John; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Gallium nitride nanowires have significant potential for developing nanoscale emitters, detectors, and biological/chemical sensors, as they possess unique material properties such as wide direct bandgap (3.4 eV), high critical breakdown field, radiation hardness, and mechanical/chemical stability. Although a few results for individual GaN nanowire devices have been reported so far, most of them utilize fabrication processes unsuitable for large-scale nanosystems development and do not involve fundamental transport property measurements. Understanding the transport mechanisms and correlating the device properties with the structural characteristics of the nanowires are of great importance for realizing high-performance devices. Focused ion beam induced metal deposition was used to make individual GaN nanowire devices, and their electrical properties were assessed. The nanowires were grown by direct reaction of Ga and NH₃, with diameters ranging from 80 nm to 250 nm and lengths up to 200 µm. Dielectrophoretic alignment was used to assemble these nanowires from a suspension onto a large-area pre-patterned substrate. A fabrication technique, utilizing only conventional microfabrication processes, has been developed for realizing robust nanowire devices including field effect transistors (FETs), light emitting diodes (LEDs), Schottky diodes, and four-terminal structures. Nanowire FETs with different gate geometries were studied, namely bottom gate, omega-backgate, and omega-plane gate structures. Utilizing omega-backgated FETs, transconductance as high as 0.34 × 10³ µS mm⁻¹ has been obtained. Room temperature field effect electron mobility in excess of 300 cm² V⁻¹ s⁻¹ has been exhibited by a nanowire FET, with a 200 nm diameter nanowire and Si substrate as the backgate. The observed reduction of mobility in the GaN nanowire FETs with decreasing nanowire diameter is attributed to surface scattering.
    Electron beam backscattered diffraction revealed that grain boundary scattering is present in some of the nanowires. Temperature-dependent mobility measurements indicated that ionized impurity scattering is the dominant transport mechanism in these nanowires. GaN nanoLEDs have been realized by assembling the n-type nanowires on a p-GaN epitaxial layer using dielectrophoresis. The resulting p-n homojunctions exhibited 365 nm electroluminescence with a full width at half maximum of 25 nm at 300 K.