Applications of parametric and semi-parametric models for longitudinal data analysis


Date

2014

Abstract

A wide range of scientific applications involve the analysis of longitudinal data. Whether the longitudinal dimension is time or location, the choice of statistical tools requires careful consideration. One such challenge is correctly estimating the variance components in observed data. In this dissertation, I apply statistical tools to solve problems involving longitudinal data in the fields of biology, healthcare, and networks.

In the second chapter, I apply smoothing spline ANOVA (SSANOVA) models to find regions in the genome that have a specific biological trait. We introduce a direct approach to modeling genomic longitudinal data from two different biological groups. Using SSANOVA, we then develop a novel method to estimate the difference between the two groups and to find regions (in location or time) where this difference is biologically significant.

In the third chapter, we analyze longitudinal network data using an overdispersed Poisson model. We build a yearly network of musical writers over a 42-year period. Using statistical models, we predict changes in network-level topology and find covariates that explain these changes. The network-level characteristics used in this chapter include average node degree, clustering coefficient, and network density. We also build a visualization tool using R-Shiny.
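
As an illustration only, the three network-level characteristics named above could be computed for each yearly network roughly as follows. This is a minimal sketch, not the dissertation's code: the function name `network_summary`, the writer labels, and the single year of edges are all hypothetical, the graph is assumed to be undirected and simple, and the clustering coefficient is taken to be the average of local clustering coefficients (one of several common definitions). Writers with no collaborations in a given year are not counted as nodes here.

```python
from itertools import combinations


def network_summary(edges):
    """Summarize an undirected simple graph given as (u, v) edge pairs.

    Returns average node degree, network density, and the average local
    clustering coefficient. Only nodes that appear in an edge are counted.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge counted twice

    avg_degree = 2 * m / n if n else 0.0
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0

    local = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            local.append(0.0)  # fewer than two neighbors: no triangles possible
            continue
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        local.append(2 * links / (k * (k - 1)))
    clustering = sum(local) / n if n else 0.0

    return {"avg_degree": avg_degree, "density": density, "clustering": clustering}


# Hypothetical co-writing edges for a single year (made-up labels, not real data).
edges_by_year = {1970: [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]}
summary = {year: network_summary(e) for year, e in edges_by_year.items()}
```

Applying the same function to each of the yearly networks would give one row of summary statistics per year, which is the kind of series a longitudinal model could then be fitted to.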

The fourth chapter uses data partitioning to study the difference in health outcomes between insured and uninsured patients. Health outcomes differ depending on an individual's type of insurance. The longitudinal aspect of this dataset is the level of risk for an injury. We partition the data into four pre-defined risk categories and evaluate the disparity between insured and uninsured patients using logistic regression models.
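
The partition-then-compare idea can be sketched in a few lines. The example below is an assumption-laden simplification rather than the chapter's analysis: it uses only a single binary covariate (insured or not), in which case the fitted slope of a logistic regression equals the log of the sample odds ratio, so no model-fitting library is needed. The function name `log_odds_ratio`, the category labels, the notion of an "adverse" outcome, and all counts are hypothetical.

```python
import math


def log_odds_ratio(records):
    """Log odds ratio of an adverse outcome, uninsured versus insured.

    `records` is an iterable of (insured, adverse) boolean pairs. With one
    binary covariate, this equals the slope of a logistic regression of
    `adverse` on an "uninsured" indicator.
    """
    ins_adv = ins_ok = unins_adv = unins_ok = 0
    for insured, adverse in records:
        if insured and adverse:
            ins_adv += 1
        elif insured:
            ins_ok += 1
        elif adverse:
            unins_adv += 1
        else:
            unins_ok += 1
    if 0 in (ins_adv, ins_ok, unins_adv, unins_ok):
        raise ValueError("every cell of the 2x2 table must be non-empty")
    return math.log((unins_adv / unins_ok) / (ins_adv / ins_ok))


def make_records(ins_adv, ins_ok, unins_adv, unins_ok):
    """Expand 2x2 counts into individual (insured, adverse) records."""
    return (
        [(True, True)] * ins_adv
        + [(True, False)] * ins_ok
        + [(False, True)] * unins_adv
        + [(False, False)] * unins_ok
    )


# Made-up counts for four hypothetical risk categories (not real data).
partitions = {
    "risk_1": make_records(10, 90, 20, 80),
    "risk_2": make_records(15, 85, 25, 75),
    "risk_3": make_records(20, 80, 30, 70),
    "risk_4": make_records(30, 70, 40, 60),
}
disparity = {name: log_odds_ratio(recs) for name, recs in partitions.items()}
```

A positive value in a given category would indicate higher odds of the adverse outcome for uninsured patients in that category. A fuller analysis would add further covariates, which requires fitting the logistic regression itself.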
