STATISTICAL LEARNING WITH APPLICATIONS IN HIGH DIMENSIONAL DATA AND HEALTH CARE ANALYTICS

dc.contributor.advisor | Ryzhov, Ilya | en_US
dc.contributor.author | Fan, Yimei | en_US
dc.contributor.department | Mathematics | en_US
dc.contributor.publisher | Digital Repository at the University of Maryland | en_US
dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US
dc.date.accessioned | 2017-09-14T05:45:59Z
dc.date.available | 2017-09-14T05:45:59Z
dc.date.issued | 2017 | en_US
dc.description.abstract | Statistical learning has been applied in business and health care analytics. Predictive models are fit using hierarchically structured data: common characteristics of products and customers are represented as categorical variables, and each category can be split into multiple subcategories at a lower level of the hierarchy. Hundreds of thousands of binary variables may be required to model the hierarchy, necessitating the use of variable selection to screen out large numbers of irrelevant or insignificant features. We propose a new dynamic screening method, based on the distance correlation criterion, designed for hierarchical binary data. Our method can screen out large parts of the hierarchy at the higher levels, avoiding the need to explore many lower-level features and greatly reducing the computational cost of screening. The practical potential of the method is demonstrated in a case application involving a large volume of B2B transaction data. While statistical inference has been widely used for decision and policy making in health care, we focus on how providers are paid for some common procedures. We explore several rich datasets and find large variation among providers in the amount that payers or insurers have paid, known as the allowed payment. We then propose incorporating the available provider attributes into a regression model to explain possible reasons for this payment variation. | en_US
dc.identifier | https://doi.org/10.13016/M2N29P702
dc.identifier.uri | http://hdl.handle.net/1903/19996
dc.language.iso | en | en_US
dc.subject.pqcontrolled | Statistics | en_US
dc.subject.pqcontrolled | Applied mathematics | en_US
dc.subject.pqcontrolled | Health care management | en_US
dc.subject.pquncontrolled | Dynamic Distance Correlation | en_US
dc.subject.pquncontrolled | Health Care | en_US
dc.subject.pquncontrolled | Hierarchical Data | en_US
dc.subject.pquncontrolled | High Dimension Data | en_US
dc.title | STATISTICAL LEARNING WITH APPLICATIONS IN HIGH DIMENSIONAL DATA AND HEALTH CARE ANALYTICS | en_US
dc.type | Dissertation | en_US
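
The abstract describes screening hierarchical binary features with the distance correlation criterion and discarding whole branches at the higher levels. The Python sketch below illustrates that general idea only; it is not the dissertation's algorithm. It assumes the standard sample distance correlation of Székely et al., a fixed cutoff, and a rule that drops a node's entire subtree when the node itself falls below the cutoff. The function names, the (name, children) tuple layout, the threshold value, and the toy data are all hypothetical.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays (O(n^2) memory)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute-difference matrices.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center each matrix: subtract row and column means, add grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom <= 0:
        return 0.0  # a constant variable carries no dependence information
    return float(np.sqrt(max(dcov2, 0.0) / denom))

def screen_hierarchy(node, features, y, threshold, kept=None):
    """Top-down screening (assumed rule): if a node scores below `threshold`,
    skip it and never evaluate any of its descendants."""
    if kept is None:
        kept = []
    name, children = node
    if distance_correlation(features[name], y) < threshold:
        return kept
    kept.append(name)
    for child in children:
        screen_hierarchy(child, features, y, threshold, kept)
    return kept

# Toy data (hypothetical): a parent indicator is the OR of its children.
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])
features = {
    "A1": np.array([1, 1, 1, 0, 0, 0, 0, 0]),
    "A2": np.array([0, 0, 0, 1, 1, 0, 0, 0]),
    "B1": np.array([1, 0, 0, 0, 1, 0, 0, 0]),
    "B2": np.array([0, 1, 0, 0, 0, 1, 0, 0]),
}
features["A"] = features["A1"] | features["A2"]
features["B"] = features["B1"] | features["B2"]
forest = [("A", [("A1", []), ("A2", [])]), ("B", [("B1", []), ("B2", [])])]

kept = []
for top in forest:
    screen_hierarchy(top, features, y, threshold=0.5, kept=kept)
print(kept)
```

In this toy example category B is uncorrelated with the response, so B1 and B2 are never scored; that skipped work is the kind of saving the abstract refers to. Pruning on the parent alone can miss a relevant child hidden under a weak parent, which is one reason to treat this as an illustration and not as the proposed method.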

Files

Original bundle

Name: Fan_umd_0117E_18388.pdf
Size: 614.39 KB
Format: Adobe Portable Document Format