Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a four-month delay before a given thesis/dissertation appears in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
6 results
Search Results
Item Phenotypic and Genetic Analysis of Reasons for Disposal in Dairy Cattle (2024)
Iqbal, Victoria Audrey; Ma, Li; Animal Sciences; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Reasons for disposal are defined as why a cow has left the herd during lactation and are documented as termination codes. Dairy cattle termination codes were collected by Dairy Records Processing Centers and stored in the National Cooperator Database maintained by the Council on Dairy Cattle Breeding for analysis. The list of possible termination codes is as follows: code 0 is cow lactation that ended typically without an abortion, code 1 is locomotion problems, code 2 is female transferred or sold, code 3 is low milk yield, code 4 is reproductive problems, code 5 is unspecified reasons, code 6 is death, code 7 is the presence of mastitis, code 8 is abortion, code 9 is udder problems, code A is an unfavorable phenotype, and lastly code B is undesirable temperament. Understanding termination codes is the key to understanding and improving farm management. Unfortunately, the secondary termination codes are not utilized, despite studies saying one reason is too limited. Heifer termination codes should be more utilized, and studies show that this could improve heifer management. The four processing centers' principal termination codes deviated a little from year to year, but processing center D had the most variation in principal termination codes. There were few records with termination codes 9, A, and B. There was low lameness found for Jersey cattle but more fluctuations for their termination codes 6, 7, and 8. Jersey's main reason for disposal was sold and low milk yield. As for Holstein, the main reasons for disposal were low milk production and death. Recommendations include removing termination code 5 (other reasons) and enforcing a secondary termination code for code 2 (sold).
Also, including the percentage of animal records used in traits developed at the CDCB was recommended to encourage farmers to add more records to improve breeding selections. Overall, the top main reasons for disposal were low milk yield, death, and reproduction across breeds from 2011 to 2022. To determine whether health traits correlate with termination codes and how health traits change the probability of survival, a multinomial logistic regression was developed, where twelve health traits, breeds, and other factors were used as independent variables for the termination code, the dependent variable. The output is a regression coefficient list that conveys the effect of each health trait for each termination code. The results show the apparent impacts of animal breeds on different termination codes, such as dairy crossbreeds negatively affecting termination due to reproductive advantages that follow the literature. Lastly, using termination codes as phenotype, this study focuses on developing a genome-wide association study (GWAS) using the Weighted single-step Genomic Best Linear Unbiased Prediction (WssGBLUP) model to find significant SNPs related to survival in Holstein cows. In summary, this study provided an understanding of reasons for disposal trends, modeled the reasons for disposal, determined the likelihood of termination post-incidence, and found the heritability and important SNPs of each termination code.

Item The Genetic Architecture of Complex Traits and Diseases in Dairy Cattle (2022)
Freebern, Ellen; Ma, Li; Animal Sciences; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Genetic architecture refers to the number and locations of genes that affect a trait, as well as the magnitude and the relative contributions of their effects.
A better understanding of the genetic architecture of complex traits and diseases will be beneficial for analyzing genetic contributions to disease risk and for estimating genetic values of agricultural importance. In particular, genetic and genomic selection in dairy cattle populations has been well established and exploited through genome-wide association studies, sequencing studies, and functional studies. The objective of this dissertation is to understand the genetic architecture of complex traits and apply the understanding to investigate the biological relationship between genetics and diseases in dairy cattle. First, we performed GWAS and fine-mapping analyses on livability and six health traits in Holstein-Friesian cattle. From our analyses, we reported significant associations and candidate genes relevant to cattle health. Second, we evaluated genome-wide diversity in cattle over a period of time by running GWAS and proposed a gene dropping simulation program. From this study, we identified candidate variants under selection that are associated with biological and economically important traits in cattle. Also, we demonstrated that gene dropping is an applicable method to investigate changes in the cattle genome over time. Third, we investigated the effect of maternal age and temperature on recombination rate in cattle. We provided novel results regarding the plasticity of meiotic recombination in cattle. Additionally, we found a positive correlation between environmental temperature at conception and recombination rate in Holstein-Friesian cows. 
Collectively, these studies advance our understanding of the genetic architecture and the biological relationship between complex traits and diseases in dairy cattle.

Item DATA DRIVEN APPROACHES TO IDENTIFY DETERMINANTS OF HEART DISEASES AND CANCER RESISTANCE (2016)
Sahu, Avinash Das; Hannenhalli, Sridhar; Ruppin, Eytan; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Cancer and cardio-vascular diseases are the leading causes of death world-wide. Caused by systemic genetic and molecular disruptions in cells, these disorders are the manifestation of profound disturbance of normal cellular homeostasis. People suffering or at high risk for these disorders need early diagnosis and personalized therapeutic intervention. Successful implementation of such clinical measures can significantly improve global health. However, development of effective therapies is hindered by the challenges in identifying genetic and molecular determinants of the onset of diseases; and in cases where therapies already exist, the main challenge is to identify molecular determinants that drive resistance to the therapies. Due to the progress in sequencing technologies, access to large genome-wide biological data is now extended far beyond a few experimental labs to the global research community. The unprecedented availability of the data has revolutionized the capabilities of computational researchers, enabling them to collaboratively address long-standing problems from many different perspectives. Likewise, this thesis tackles the two main public health related challenges using data driven approaches. Numerous association studies have been proposed to identify genomic variants that determine disease. However, their clinical utility remains limited due to their inability to distinguish causal variants from associated variants.
In the presented thesis, we first propose a simple scheme that improves association studies in supervised fashion and has shown its applicability in identifying genomic regulatory variants associated with hypertension. Next, we propose a coupled Bayesian regression approach -- eQTeL, which leverages epigenetic data to estimate regulatory and gene interaction potential, and identifies combinations of regulatory genomic variants that explain the gene expression variance. On human heart data, eQTeL not only explains a significantly greater proportion of expression variance in samples, but also predicts gene expression more accurately than other methods. We demonstrate that eQTeL accurately detects causal regulatory SNPs by simulation, particularly those with small effect sizes. Using various functional data, we show that SNPs detected by eQTeL are enriched for allele-specific protein binding and histone modifications, which potentially disrupt binding of core cardiac transcription factors and are spatially proximal to their target. eQTeL SNPs capture a substantial proportion of genetic determinants of expression variance and we estimate that 58% of these SNPs are putatively causal. The challenge of identifying molecular determinants of cancer resistance so far could only be dealt with labor intensive and costly experimental studies, and in case of experimental drugs such studies are infeasible. Here we take a fundamentally different data driven approach to understand the evolving landscape of emerging resistance. We introduce a novel class of genetic interactions termed synthetic rescues (SR) in cancer, which denotes a functional interaction between two genes where a change in the activity of one vulnerable gene (which may be a target of a cancer drug) is lethal, but subsequently altered activity of its partner rescuer gene restores cell viability. 
Next we describe a comprehensive computational framework, termed INCISOR, for identifying SR underlying cancer resistance. Applying INCISOR to mine The Cancer Genome Atlas (TCGA), a large collection of cancer patient data, we identified the first pan-cancer SR networks, composed of interactions common to many cancer types. We experimentally test and validate a subset of these interactions involving the master regulator gene mTOR. We find that rescuer genes become increasingly activated as breast cancer progresses, testifying to pervasive ongoing rescue processes. We show that SRs can be utilized to successfully predict patients' survival and response to the majority of current cancer drugs, and importantly, for predicting the emergence of drug resistance from the initial tumor biopsy. Our analysis suggests a potential new strategy for enhancing the effectiveness of existing cancer therapies by targeting their rescuer genes to counteract resistance. The thesis provides statistical frameworks that can harness ever increasing high throughput genomic data to address challenges in determining the molecular underpinnings of hypertension, cardiovascular disease and cancer resistance. We discover novel molecular mechanistic insights that will advance the progress in early disease prevention and personalized therapeutics. Our analyses shed light on the fundamental biological understanding of gene regulation and interaction, and open up exciting avenues of translational applications in risk prediction and therapeutics.

Item ANALYSIS OF CONSENSUS GENOME-WIDE EXPRESSION-QTLS AND THEIR RELATIONSHIPS TO HUMAN COMPLEX TRAIT DISEASES (2014)
YU, CHEN-HSIN; Moult, John; Molecular and Cell Biology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Genome-wide association studies of human complex disease have identified a large number of disease associated genetic loci.
However, most of these risk loci do not provide direct information on the biological basis of a disease or on the underlying mechanisms. Recent genome-wide expression quantitative trait loci (eQTLs) association studies have provided information on genetic factors, especially SNPs, associated with gene expression variation. These eQTLs might contribute to phenotype diversity and disease susceptibility, but interpretation is handicapped by low reproducibility of the expression results. Our first major goal was to establish a list of consensus eQTLs by integrating publicly available data for specific human populations and cell types. We used linkage disequilibrium data from Hapmap and the 1000 Genomes Project to integrate the results of eQTL studies. Overall, we find over 4000 genes that are involved in high confidence eQTL relationships. We also assessed the possible underlying mechanisms of tissue dependent eQTLs by mapping these to known genome sites of functional elements. Results of comparison of eQTLs across studies on the same cell type versus those on different cell types suggest that tissue specific eQTLs are less common than pan-tissue eQTLs. Our second major goal was to use these results to elucidate the role eQTLs play in human common diseases. For this purpose, we matched the high confidence eQTLs to a set of 335 disease risk loci identified from the Wellcome Trust Case Control Consortium (WTCCC1) genome-wide association study and follow-up studies for seven human common diseases. Our results show that the data are consistent with approximately 50% of these disease loci arising from an underlying expression change mechanism. In many cases, the results provide a proposed expression mechanism for genes previously suggested as disease relevant, in others, new disease relevant genes are identified. 
A web-based database, ExSNP, was designed to provide comprehensive access to the eQTL data and results from our analysis, including original eQTLs, high-confidence eQTLs, cell type dependent eQTLs, population dependent eQTLs, disease associated eQTLs, and functionally annotated eQTLs. The website also incorporates a genome browser that allows visualization of the relative positions of eQTL SNPs to their associated genes and other neighboring genes, as well as the relationship to functional elements and disease associations.

Item IDENTIFICATION OF GENES INVOLVED IN THE ANTIVIRAL RESPONSE THROUGH GENETIC SCREENS IN DROSOPHILA (2014)
Tang, Jessica (Juanjie); Wu, Louisa; Molecular and Cell Biology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Innate immunity is essential for the host to defend against invading pathogens, such as viruses and bacteria. To identify novel genes or molecules that are involved in innate immunity, we carried out two genetic screens in Drosophila. From a forward screen of flies mutagenized with Ethyl methane sulfonate (EMS), four mutants with increased susceptibility to Drosophila X virus (DXV) were found. In this study, we focused on the rogue mutant and identified a novel antiviral gene rogue. The rogue mutant is highly susceptible to DXV infection and is unable to control viral replication during infection. The expression of rogue in either the hemocytes or the fat body is required for flies to control viral accumulation and to survive a viral infection. At an early stage of infection, rogue is induced and the amount of Rogue protein that locates to the nucleus increases. In addition, we confirm that the Rogue protein interacts with the polyA binding protein (PABP), and we propose that rogue restricts viral replication via translation regulation in Drosophila. The rogue mutant also has a phagosome maturation defect, which may contribute to its susceptibility to Staphylococcus aureus infection.
RNAi knockdown of rogue in the fat body or the hemocytes in wild type flies results in high bacterial susceptibility. Introducing the rogue transgene in the hemocytes of the rogue mutant can rescue the mutant survival to both DXV and S. aureus. Together, our results demonstrate that rogue plays a critical role in defending against DXV and S. aureus infections. We performed another genetic screen on wild derived inbred flies from the Drosophila Genetic Reference Panel (DGRP). From a genome wide association study (GWAS) in these flies, we found four single nucleotide polymorphisms (SNPs) associated with susceptibility of flies to DXV. The allele that contributed most to the susceptibility is located in the intron of Socs36E, a negative regulator of the JAK-STAT pathway, implying that the JAK-STAT pathway plays a role in the immune responses against DXV. Our study also shows that natural genetic variation can be used as a tool for identifying novel genes or pathways involved in antiviral immunity.

Item COMPUTATIONAL METHODS IN PROTEIN STRUCTURE, EVOLUTION AND NETWORKS. (2013)
Cao, Chen; Moult, John; Molecular and Cell Biology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The advent of new sequencing technology has resulted in the accumulation of a large amount of information on human DNA variation. In order to make sense of these data in the context of biology and medicine, new methods are needed both for analysis and for integration with other resources. In this work: 1) I studied the distribution pattern of human DNA variants across populations using data from the 1000 genomes project and investigated several evolutionary biology questions from the perspective of population genomics. I found population level support for trends previously observed between species, including selection against deleterious variants, and lower frequency of variants in highly expressed genes and highly connected genes.
I was also able to show that the correlation between synonymous and non-synonymous variant levels is a consequence of both mutation prevalence variation across the genome and shared selection pressure. 2) I performed a systematic evaluation of the effectiveness of GWAS (Genome Wide Association Studies) for finding potential drug targets and discovered the method is very ineffective for this purpose. I proposed two reasons to explain this finding: selection against variants in drug targets and the relatively short length of drug target genes. I discovered that GWAS genes and drug targets are closely associated in the biological network, and on that basis, developed a machine learning algorithm to leverage the GWAS results for the identification of potential drug targets, making use of biological network information. As a result, I identified some potential drug repurposing opportunities. 3) I developed a method to increase the number of protein structure models available for interpreting the impact of human non-synonymous variants, important not only for understanding the mechanisms of genetic disease but also for the study of human protein evolution. The method enables the impact of approximately 40% more missense variants to be reliably modeled. In summary, these three projects demonstrate the value of computational methods in addressing a wide range of problems in protein structure, evolution, and networks.