Bayesian Estimation of the Inbreeding Coefficient for Single Nucleotide Polymorphism Using Complex Survey Data

dc.contributor.advisor: Lahiri, Partha [en_US]
dc.contributor.advisor: Li, Yan [en_US]
dc.contributor.author: Xue, Zhenyi [en_US]
dc.contributor.department: Mathematics [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2016-02-06T06:44:11Z
dc.date.available: 2016-02-06T06:44:11Z
dc.date.issued: 2015 [en_US]
dc.description.abstract: In genome-wide association studies (GWAS), single nucleotide polymorphisms (SNPs) are often used as genetic markers to study gene-disease association. Some large-scale health sample surveys have recently started collecting genetic data, and there is growing interest in developing statistical procedures for genetic survey data. This calls for innovative statistical methods that incorporate both genetic and statistical sampling. Under simple random sampling, the traditional estimator of the inbreeding coefficient is given by 1 - (number of observed heterozygotes) / (number of expected heterozygotes). Genetic data quality control reports published by the National Health and Nutrition Examination Survey (NHANES) and the Health and Retirement Study (HRS) use this simple estimator, which serves as a reasonable quality control tool for identifying problems such as genotyping error. There is, however, a need to improve on this estimator by accounting for the features of the complex survey design. The main goal of this dissertation is to fill this research gap. First, a design-based estimator and its associated jackknife standard error estimator are proposed. Second, a hierarchical Bayesian methodology is developed using the effective sample size and genotype counts. Third, a Bayesian pseudo-empirical likelihood estimator is proposed that uses the expected number of heterozygotes in the estimating equation as a constraint when maximizing the pseudo-empirical likelihood. One advantage of the proposed Bayesian methodology is that the prior distribution can be used to restrict the parameter space induced by the general inbreeding model. The proposed estimators are evaluated using Monte Carlo simulation studies. Moreover, the proposed estimates of the inbreeding coefficients of SNPs from the APOC1 and BDNF genes are compared using data from the 2006 Health and Retirement Study. [en_US]
dc.identifier: https://doi.org/10.13016/M21X55
dc.identifier.uri: http://hdl.handle.net/1903/17307
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Statistics [en_US]
dc.subject.pqcontrolled: Genetics [en_US]
dc.subject.pquncontrolled: Bayesian [en_US]
dc.subject.pquncontrolled: Complex Survey [en_US]
dc.subject.pquncontrolled: Empirical Likelihood [en_US]
dc.subject.pquncontrolled: Inbreeding Coefficient [en_US]
dc.subject.pquncontrolled: Monte Carlo [en_US]
dc.subject.pquncontrolled: Single Nucleotide Polymorphism [en_US]
dc.title: Bayesian Estimation of the Inbreeding Coefficient for Single Nucleotide Polymorphism Using Complex Survey Data [en_US]
dc.type: Dissertation [en_US]
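
The traditional estimator quoted in the abstract, 1 - (observed heterozygotes) / (expected heterozygotes), can be illustrated with a minimal Python sketch. This is not code from the dissertation. The function name and arguments are hypothetical, and the sketch assumes a biallelic SNP under simple random sampling, with the expected heterozygote count taken as 2p(1 - p)n under Hardy-Weinberg proportions, where p is the sample allele frequency. It does not implement the design-based or Bayesian estimators the dissertation proposes.

```python
def inbreeding_coefficient(n_AA, n_Aa, n_aa):
    """Traditional estimate: 1 - observed heterozygotes / expected heterozygotes.

    Hypothetical helper for illustration; assumes a biallelic SNP and
    simple random sampling (no survey weights).
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotypes observed")
    # Sample frequency of allele A: each AA carries two copies, each Aa one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    # Expected heterozygote count under Hardy-Weinberg proportions (assumption).
    expected_het = 2 * p * (1 - p) * n
    if expected_het == 0:
        raise ValueError("SNP is monomorphic; the estimate is undefined")
    return 1 - n_Aa / expected_het
```

Under these assumptions, genotype counts of 25, 50, and 25 match Hardy-Weinberg proportions exactly and give an estimate of 0, while a deficit of heterozygotes (for example 30, 40, 30) gives a positive estimate.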

Files

Original bundle
Name: Xue_umd_0117E_16746.pdf
Size: 1.07 MB
Format: Adobe Portable Document Format