Optimal Learning with Non-Gaussian Rewards
dc.contributor.advisor | Ryzhov, Ilya O. | en_US |
dc.contributor.author | Ding, Zi | en_US |
dc.contributor.department | Applied Mathematics and Scientific Computation | en_US |
dc.contributor.publisher | Digital Repository at the University of Maryland | en_US |
dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US |
dc.date.accessioned | 2014-10-11T05:49:57Z | |
dc.date.available | 2014-10-11T05:49:57Z | |
dc.date.issued | 2014 | en_US |
dc.description.abstract | In this dissertation, we study sequential Bayesian learning problems modeled under non-Gaussian distributions. We focus on a class of problems called the multi-armed bandit problem and study its optimal learning strategy, the Gittins index policy. The Gittins index is computationally intractable, and approximation methods have been developed for Gaussian reward problems. We construct a novel theoretical and computational framework for the Gittins index under non-Gaussian rewards. By interpolating the rewards using continuous-time conditional Lévy processes, we recast the optimal stopping problems that characterize Gittins indices as free-boundary partial integro-differential equations (PIDEs). We also provide additional structural properties and numerical illustrations showing how our approach can be used to approximate the Gittins index. | en_US |
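The abstract describes the Gittins index, which assigns each bandit arm a "retirement reward" at which pulling the arm and retiring are equally attractive. A minimal sketch of that calibration idea for a discrete Beta-Bernoulli arm is below: it bisects on the retirement reward, comparing the value of retiring against pulling via finite-horizon dynamic programming. This is a standard discrete-time approximation for illustration only, not the dissertation's continuous-time Lévy/PIDE method; the discount factor, horizon, and Beta-Bernoulli model are assumptions chosen for the example.

```python
def gittins_index(a0, b0, gamma=0.9, horizon=30, tol=1e-4):
    """Approximate Gittins index of a Bernoulli arm with Beta(a0, b0)
    posterior, via bisection on the retirement reward lam."""

    def value(lam):
        # Finite-horizon DP: V(a, b) = max(retire forever, pull once
        # and continue optimally). States are posterior parameters.
        memo = {}

        def V(a, b, depth):
            retire = lam / (1.0 - gamma)  # value of retiring at reward lam
            if depth == 0:
                return retire
            key = (a, b, depth)
            if key in memo:
                return memo[key]
            p = a / (a + b)  # posterior mean success probability
            cont = (p * (1.0 + gamma * V(a + 1, b, depth - 1))
                    + (1.0 - p) * gamma * V(a, b + 1, depth - 1))
            memo[key] = max(retire, cont)
            return memo[key]

        return V(a0, b0, horizon)

    # The index is the lam at which continuing and retiring break even.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        if value(lam) > lam / (1.0 - gamma) + 1e-12:
            lo = lam  # continuing still beats retiring: index exceeds lam
        else:
            hi = lam
    return 0.5 * (lo + hi)
```

For a uniform Beta(1, 1) prior the index exceeds the posterior mean 0.5, reflecting the exploration bonus; the Gaussian and non-Gaussian settings in the dissertation replace this discrete recursion with a continuous-time free-boundary problem.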
dc.identifier | https://doi.org/10.13016/M2PS3X | |
dc.identifier.uri | http://hdl.handle.net/1903/15774 | |
dc.language.iso | en | en_US |
dc.subject.pqcontrolled | Mathematics | en_US |
dc.subject.pqcontrolled | Operations research | en_US |
dc.subject.pquncontrolled | Bayesian learning | en_US |
dc.subject.pquncontrolled | Gittins Index | en_US |
dc.subject.pquncontrolled | non-Gaussian rewards | en_US |
dc.subject.pquncontrolled | Optimal learning | en_US |
dc.title | Optimal Learning with Non-Gaussian Rewards | en_US |
dc.type | Dissertation | en_US |