Simulation Optimization: Efficient Selection and Dynamic Programming
Abstract
Many real-world problems can be modeled as stochastic systems whose performance can be estimated by simulation. Important topics include statistical ranking and selection (R&S) problems, response surface problems, and stochastic estimation problems.
The first part of the dissertation focuses on R&S problems, in which there is a finite set of feasible solutions ("systems" or "alternatives") whose unknown performance values must be estimated from noisy observations, e.g., from expensive stochastic simulation experiments. The goal in R&S problems is to identify the highest-valued alternative as quickly as possible. In many practical settings, alternatives with similar attributes tend to have similar performance; we therefore focus on settings where prior similarity information about the performance outputs of different alternatives can be learned from data. Incorporating similarity information enables efficient budget allocation and hence faster identification of the best alternative in sequential selection. Using a new selection criterion, the similarity selection index, we develop two new allocation methods: one based on a mathematical programming characterization of the asymptotically optimal budget allocation, and the other based on a myopic expected improvement measure. For the former, we present a novel sequential implementation that provably learns the optimal allocation without tuning. For the latter, we derive its asymptotic sampling ratios. We also propose a practical way to update the prior similarity information as new samples are collected.
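To make the sequential-selection setup concrete, the following is a minimal Python sketch in which a generic, UCB-style score stands in for the similarity selection index (the dissertation's actual index and allocation rules are not reproduced here); the number of alternatives, their true means, the noise level, and the budget are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    true_means = np.array([0.0, 0.3, 0.5, 0.55])   # unknown in practice
    noise_sd = 1.0                                 # assumed known, for simplicity
    n0, budget = 10, 2000                          # initial samples / total budget
    k = len(true_means)

    def simulate(i):
        # One noisy observation of alternative i (stands in for a simulation run).
        return true_means[i] + noise_sd * rng.standard_normal()

    # Stage 1: initial sampling of every alternative.
    counts = np.full(k, n0)
    sums = np.array([sum(simulate(i) for _ in range(n0)) for i in range(k)])

    # Stage 2: sequential allocation of the remaining budget.
    for _ in range(budget - n0 * k):
        means = sums / counts
        stderr = noise_sd / np.sqrt(counts)
        index = means + 2.0 * stderr      # placeholder selection index (UCB-style)
        i = int(np.argmax(index))         # sample the alternative with the top index
        sums[i] += simulate(i)
        counts[i] += 1

    print("selected best alternative:", int(np.argmax(sums / counts)))

A similarity-based index would additionally borrow strength across alternatives with similar attributes when scoring them, which is what enables the faster identification described above.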
The second part of the dissertation considers the stochastic estimation problem of estimating a conditional expectation. We first formulate the conditional expectation as a ratio of two stochastic derivatives. Applying stochastic gradient estimation techniques, we express the two derivatives as ordinary expectations, which can then be estimated by Monte Carlo methods. Based on an empirical distribution estimated from simulation, we provide guidance on selecting the appropriate formulation of the derivatives to reduce the variance of the ratio estimator.
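For a scalar conditioning variable X with a density, one standard way to write the conditional expectation as such a ratio is the following (a sketch of the usual construction; the dissertation's formulation may be more general):

\[
\mathbb{E}[Y \mid X = x] \;=\; \frac{\frac{\mathrm{d}}{\mathrm{d}x}\,\mathbb{E}\bigl[Y\,\mathbf{1}\{X \le x\}\bigr]}{\frac{\mathrm{d}}{\mathrm{d}x}\,\mathbb{E}\bigl[\mathbf{1}\{X \le x\}\bigr]},
\]

where the denominator is the density of X at x. Each term is a derivative of an ordinary expectation, so stochastic gradient estimation applies, and the resulting estimator is a ratio of two sample means. A minimal Monte Carlo illustration using central finite differences for both derivatives (the toy model Y = X + Z and the step size h are assumptions for illustration only):

    import numpy as np

    rng = np.random.default_rng(1)
    n, x, h = 200_000, 0.5, 0.05

    X = rng.standard_normal(n)
    Z = rng.standard_normal(n)
    Y = X + Z                              # here E[Y | X = x] = x exactly

    # Central finite differences for the numerator and denominator derivatives.
    num = (np.mean(Y * (X <= x + h)) - np.mean(Y * (X <= x - h))) / (2 * h)
    den = (np.mean(X <= x + h) - np.mean(X <= x - h)) / (2 * h)
    print(num / den)                       # close to 0.5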
The third part of the dissertation further explores the application of conditional expectation estimators to policy evaluation, an important topic in stochastic dynamic programming. In particular, we focus on a finite-horizon setting with continuous state variables, in which the Bellman equation represents the value function as a conditional expectation. Using the finite difference method and the generalized likelihood ratio method, we propose new estimators for policy evaluation and show how the value of any given state can be estimated from sample paths starting at various other states.
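For concreteness, in the finite-horizon setting the value function of a fixed policy \(\pi\) satisfies the backward recursion (standard notation, used here only as a sketch):

\[
V_t^{\pi}(s) \;=\; \mathbb{E}\Bigl[\, r_t\bigl(S_t, \pi_t(S_t)\bigr) + V_{t+1}^{\pi}(S_{t+1}) \,\Bigm|\, S_t = s \,\Bigr],
\]

with a terminal condition at the horizon. Each backward step is therefore a conditional expectation of the type treated in the second part, and applying the ratio formulation with finite-difference or generalized likelihood ratio derivative estimators yields policy evaluation estimators that can reuse sample paths launched from states other than s.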