Theses and Dissertations from UMD

Permanent URI for this community: http://hdl.handle.net/1903/2

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

  • Semiparametric Regression and Mortality Rate Prediction
    (2011) Voulgaraki, Anastasia; Kedem, Benjamin; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This dissertation is divided into two parts. In the first part we consider the general multivariate multiple sample semiparametric density ratio model. In this model one distribution serves as a reference or baseline, and all other distributions are weighted tilts of the reference. The weights are considered known up to a parameter. All the parameters in the model, as well as the reference distribution, are estimated from the combined data from all samples. A kernel-based density estimator can be constructed based on the semiparametric model. In this dissertation we discuss the asymptotic theory and convergence properties for the semiparametric kernel density estimator. The estimator is shown to be not only consistent, but also more efficient than the general kernel density estimator. Several ways of selecting the bandwidth are also discussed. This opens the door to regression analysis with random covariates from a semiparametric perspective where information is combined from multiple multivariate sources. Accordingly, each multivariate distribution and a corresponding conditional expectation (or regression) of interest are then estimated from the combined data from all sources. Graphical and quantitative diagnostic tools are suggested to assess model validity. The method is applied to real and simulated data. Comparisons are made with multiple regression, generalized additive models (GAM) and nonparametric kernel regression.
    In the second part we study mortality rate prediction. The National Center for Health Statistics (NCHS) uses observed mortality data to publish race-gender specific life tables for individual states decennially. At ages over 85 years, the reliability of death rates based on these data is compromised to some extent by age misreporting. The eight-parameter Heligman-Pollard parametric model is then used to smooth the data and obtain estimates and extrapolations of mortality rates for advanced ages. In states with small sub-populations the observed mortality rates are often zero, particularly at young ages. The presence of zero death rates makes the fitting of the Heligman-Pollard model difficult and at times outright impossible. In addition, since death rates are reported on a log scale, zero mortality rates are problematic. To overcome observed zero death rates, appropriate probability models are used. Using these models, observed zero mortality rates are replaced by the corresponding expected values. This enables using logarithmic transformations, and the fitting of the Heligman-Pollard model to produce mortality estimates for ages 0 to 130 years.
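For readers unfamiliar with the eight-parameter Heligman-Pollard model mentioned in the abstract, the following is a minimal sketch of the form commonly cited from Heligman and Pollard (1980), in which the odds of death q_x / (1 - q_x) are the sum of a childhood term, an accident-hump term, and a senescent term. The abstract does not state the exact parameterization or fitting procedure used in the dissertation, so the function name, the handling of age 0, and all parameter values below are assumptions chosen only for illustration.

```python
import math


def heligman_pollard_qx(age, A, B, C, D, E, F, G, H):
    """Death probability q_x under the eight-parameter Heligman-Pollard law.

    The odds q_x / (1 - q_x) are modeled as the sum of three terms:
    childhood mortality, the accident hump, and senescent mortality.
    """
    childhood = A ** ((age + B) ** C)
    # log(age) is undefined at age 0; the hump term tends to 0 as age -> 0,
    # so it is set to 0 there (an assumption of this sketch).
    if age > 0:
        hump = D * math.exp(-E * (math.log(age) - math.log(F)) ** 2)
    else:
        hump = 0.0
    senescent = G * H ** age
    odds = childhood + hump + senescent
    # Convert odds back to a probability.
    return odds / (1.0 + odds)


# Hypothetical parameter values, NOT fitted to any data.
PARAMS = dict(A=5e-4, B=0.02, C=0.10, D=1e-3, E=10.0, F=20.0, G=5e-5, H=1.10)

# Smoothed rates for ages 0 through 130, as in the range named in the abstract.
rates = [heligman_pollard_qx(x, **PARAMS) for x in range(0, 131)]

# The fitted curve is strictly positive, so the log transform is always defined.
log_rates = [math.log(q) for q in rates]
```

Because each of the three terms is positive for positive parameters, the modeled rates never reach zero, which is why a log scale is unproblematic for the fitted curve even though it fails for observed zero rates.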