Decision Making Under Uncertainty: New Models and Applications
Fu, Michael C
In decision-making-under-uncertainty problems, an agent takes an action on the environment and obtains a non-deterministic outcome. Such problems arise in applied research fields such as financial engineering, business analytics, and speech recognition. The goal of the research is to design an automated algorithm that the agent can follow to find an optimal action according to the agent's preferences.
Typically, the criterion for selecting an optimal action or policy is a performance measure determined jointly by the agent's preferences and the random mechanism of the agent's surrounding environment. The random mechanism is reflected through a random variable representing the outcomes attained by a given action, and the agent's preferences are captured by a transformation on the potential outcomes from the set of possible actions. Many decision-making-under-uncertainty problems formulate the performance measure as an objective function and develop optimization schemes for that objective function. Although the idea seems straightforward at a high level, many conceptual and computational challenges arise in the process of finding the optimal action.
The thesis studies a special class of performance measures based on Cumulative Prospect Theory (CPT), which has been used as an alternative to expected-utility-based performance measures for evaluating human-centric systems. The first part of the thesis designs a simulation-based optimization framework for the CPT-based performance measure. The framework includes a sample-based estimator for the CPT-value and stochastic approximation algorithms for searching for the optimal action or policy. We prove that, under reasonable assumptions, the CPT-value estimator is asymptotically consistent and that our optimization algorithms converge asymptotically to the optimal point.
The second part of the thesis introduces an abstract dynamic programming framework whose transitional measure is defined through the CPT-value. We also provide sufficient conditions under which the CPT-driven dynamic program attains a unique optimal solution. Empirical experiments presented in the last part of the thesis illustrate that the CPT-value estimator is consistent and that the CPT-based performance measure may lead to an optimal policy very different from those obtained using traditional expected utility.
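To make the idea of a sample-based CPT-value estimator concrete, the following is a minimal sketch of one common order-statistics form found in the CPT literature. It is an illustration only: the abstract does not state the estimator's formula, so whether this matches the thesis's exact construction is an assumption. The function names (`tk_weight`, `cpt_estimate`), the weighting parameter `gamma=0.61`, and the utility functions in the usage note are all hypothetical choices made for this example.

```python
def tk_weight(p, gamma=0.61):
    """Inverse-S-shaped probability weighting function.

    Assumption: a Tversky-Kahneman style form, chosen for illustration;
    the abstract does not specify which weighting function is used.
    """
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def cpt_estimate(samples, u_plus, u_minus, w_plus, w_minus):
    """Estimate a CPT-value from i.i.d. outcome samples.

    samples : iterable of real-valued outcomes of one action/policy
    u_plus  : utility applied to gains (should return 0 for losses)
    u_minus : utility applied to losses (should return 0 for gains)
    w_plus, w_minus : probability weighting functions on [0, 1]
    """
    xs = sorted(samples)  # order statistics X_[1] <= ... <= X_[n]
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")

    gains = 0.0
    losses = 0.0
    for i, x in enumerate(xs, start=1):
        # Gains: weight the empirical upper-tail probabilities.
        gains += u_plus(x) * (w_plus((n + 1 - i) / n) - w_plus((n - i) / n))
        # Losses: weight the empirical lower-tail probabilities.
        losses += u_minus(x) * (w_minus(i / n) - w_minus((i - 1) / n))
    return gains - losses
```

As a sanity check on the sketch, choosing identity weights `w(p) = p` with `u_plus(x) = max(x, 0)` and `u_minus(x) = max(-x, 0)` reduces the estimate to the ordinary sample mean, which is the expected-value baseline the abstract contrasts against. Passing `tk_weight` for both weighting functions instead distorts the tail probabilities, which is where a CPT-based measure can start to rank actions differently from expected utility.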