COMPUTATIONAL METHODS IN PROTEIN STRUCTURE, EVOLUTION AND NETWORKS.

dc.contributor.advisor: Moult, John [en_US]
dc.contributor.author: Cao, Chen [en_US]
dc.contributor.department: Molecular and Cell Biology [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2014-02-07T06:30:34Z
dc.date.available: 2014-02-07T06:30:34Z
dc.date.issued: 2013 [en_US]
dc.description.abstract: The advent of new sequencing technology has resulted in the accumulation of a large amount of information on human DNA variation. To make sense of these data in the context of biology and medicine, new methods are needed both for analysis and for integration with other resources. In this work: 1) I studied the distribution pattern of human DNA variants across populations using data from the 1000 Genomes Project and investigated several evolutionary biology questions from the perspective of population genomics. I found population-level support for trends previously observed between species, including selection against deleterious variants and a lower frequency of variants in highly expressed genes and highly connected genes. I was also able to show that the correlation between synonymous and non-synonymous variant levels is a consequence of both variation in mutation prevalence across the genome and shared selection pressure. 2) I performed a systematic evaluation of the effectiveness of GWAS (Genome-Wide Association Studies) for finding potential drug targets and discovered that the method is very ineffective for this purpose. I proposed two reasons to explain this finding: selection against variants in drug targets and the relatively short length of drug target genes. I discovered that GWAS genes and drug targets are closely associated in the biological network and, on that basis, developed a machine learning algorithm that uses biological network information to leverage the GWAS results for the identification of potential drug targets. As a result, I identified some potential drug repurposing opportunities. 3) I developed a method to increase the number of protein structure models available for interpreting the impact of human non-synonymous variants, which is important not only for understanding the mechanisms of genetic disease but also for the study of human protein evolution. The method enables the impact of approximately 40% more missense variants to be reliably modeled. In summary, these three projects demonstrate the value of computational methods in addressing a wide range of problems in protein structure, evolution, and networks. [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/14865
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Bioinformatics [en_US]
dc.subject.pquncontrolled: drug targets [en_US]
dc.subject.pquncontrolled: evolution [en_US]
dc.subject.pquncontrolled: GWAS [en_US]
dc.subject.pquncontrolled: protein [en_US]
dc.subject.pquncontrolled: SNV [en_US]
dc.subject.pquncontrolled: structure [en_US]
dc.title: COMPUTATIONAL METHODS IN PROTEIN STRUCTURE, EVOLUTION AND NETWORKS. [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle
Name: Cao_umd_0117E_14650.pdf
Size: 4.11 MB
Format: Adobe Portable Document Format