Computer Science Theses and Dissertations
Permanent URI for this collection: http://hdl.handle.net/1903/2756
2 results
Search Results
Item STATISTICAL AND OPTIMAL LEARNING WITH APPLICATIONS IN BUSINESS ANALYTICS (2015)
Han, Bin; Ryzhov, Ilya O; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Statistical learning is widely used in business analytics to discover structure or exploit patterns in historical data, and to build models that capture relationships between an outcome of interest and a set of variables. Optimal learning, on the other hand, addresses the operational side of the problem by iterating between decision making and data acquisition/learning. Very often the two problems go hand in hand, exhibiting a feedback loop between statistics and optimization. We apply this statistical/optimal learning concept to a fundraising marketing campaign problem that arises in many non-profit organizations. Many such organizations use direct-mail marketing to cultivate one-time donors and convert them into recurring contributors. Cultivated donors generate much more revenue than new donors, but also lapse with time, making it important to steadily draw in new cultivations. The direct-mail budget is limited, but better-designed mailings can improve success rates without increasing costs. We first apply statistical learning to analyze the effectiveness of several design approaches used in practice, based on a massive dataset covering 8.6 million direct-mail communications with donors to the American Red Cross during 2009-2011. We find evidence that mailed appeals are more effective when they emphasize disaster preparedness and training efforts over post-disaster cleanup. Including small cards that affirm donors' identity as Red Cross supporters is an effective strategy, while including gift items such as address labels is not.
Finally, very recent acquisitions are more likely to respond to appeals that ask them to contribute an amount similar to their most recent donation, but this approach has an adverse effect on donors with a longer history. We show via simulation that a simple design strategy based on these insights has the potential to improve success rates from 5.4% to 8.1%. When a new scenario arises, however, new data must be acquired to update our model and decisions, a problem we study under the optimal learning framework. The goal becomes discovering a sequential information collection strategy that learns the best campaign design alternative as quickly as possible. Regression structure is used to learn about a set of unknown parameters, which alternates with optimization to design new data points. Such problems have been extensively studied in the ranking and selection (R&S) community, but traditional R&S procedures incur high computational costs when the decision space grows combinatorially. We present a value of information procedure for simultaneously learning unknown regression parameters and unknown sampling noise. We then develop an approximate version of the procedure, based on a semidefinite programming relaxation, that retains good performance and scales better to large problems. We also prove the asymptotic consistency of the algorithm in the parametric model, a result that has not previously been available even for the known-variance case.

Item Simulation Optimization: New Methods and An Application (2014)
Qu, Huashuai; Fu, Michael C; Ryzhov, Ilya O; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Simulation models are commonly used to provide analysis and prediction of the behavior of complex stochastic systems.
Simulation optimization integrates optimization techniques into simulation analysis to capture response surfaces, choose optimal decision variables, and perform sensitivity analysis. Objective functions usually cannot be computed in closed form and are computationally expensive to evaluate. Researchers have proposed many methods for problems with continuous variables and for problems with discrete variables. The dissertation comprises both optimization methods and a real-world application. In particular, our goal is to develop new methods based on direct gradient estimates and variational Bayesian techniques. The first part of the thesis considers the setting where additional direct gradient information is available and introduces approaches for enhancing regression models and stochastic kriging, respectively, with this additional gradient information. For regression models, we propose Direct Gradient Augmented Regression (DiGAR) models to incorporate direct gradient estimators. We characterize the variance of the estimated parameters in DiGAR and compare it analytically with that of the standard regression model in some special settings. For stochastic kriging, we propose Gradient Extrapolated Stochastic Kriging (GESK) to incorporate direct gradient estimates by extrapolating additional responses. We show that GESK reduces mean squared error (MSE) compared to stochastic kriging under certain conditions on step sizes. We also propose maximizing penalized likelihood and minimizing integrated mean squared error to determine the step sizes. The second part of the thesis focuses on the problem of learning unknown correlation structures in ranking and selection (R&S) problems. We propose a computationally tractable Bayesian statistical model for learning unknown correlation structures in fully sequential simulation selection.
We derive a Bayesian procedure that allocates simulations based on the value of information, thus anticipating future changes to our beliefs about the correlations. The proposed approach is able to simultaneously learn unknown mean performance values and unknown correlations, whereas existing approaches in the literature assume independence or known correlations and learn only the unknown mean performance values. Finally, we consider an application in business-to-business (B2B) pricing. We propose an approximate Bayesian statistical model for predicting the win/loss probability for a given price, and an approach for recommending target prices based on this model.
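
As a rough illustration of the gradient-augmented regression idea mentioned in the second abstract, the sketch below fits a one-dimensional line using both noisy responses and direct gradient estimates. The weighted least-squares objective, the weight `alpha`, and the function name `fit_gradient_augmented_line` are assumptions made for illustration; the abstract does not state the exact form of the DiGAR estimator, so this should not be read as the thesis's method.

```python
def fit_gradient_augmented_line(xs, ys, gs, alpha=0.5):
    """Fit y = b0 + b1 * x from responses ys and gradient estimates gs.

    Assumed objective (illustrative, not taken from the thesis):
        alpha * sum((y_i - b0 - b1 * x_i)^2) + (1 - alpha) * sum((g_i - b1)^2)
    With alpha = 1 the gradient term vanishes and this is ordinary least squares.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(xs)
    if n == 0 or len(ys) != n or len(gs) != n:
        raise ValueError("xs, ys and gs must be non-empty and of equal length")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    # Setting the derivative with respect to b1 to zero gives a closed form.
    denom = alpha * sxx + (1.0 - alpha) * n
    if denom == 0.0:
        raise ValueError("slope is not identifiable from these inputs")
    b1 = (alpha * sxy + (1.0 - alpha) * sum(gs)) / denom
    # Setting the derivative with respect to b0 to zero gives the intercept.
    b0 = y_bar - b1 * x_bar
    return b0, b1


if __name__ == "__main__":
    # Noise-free data generated from y = 2 + 3x, with gradient 3 everywhere.
    print(fit_gradient_augmented_line([0, 1, 2], [2, 5, 8], [3, 3, 3]))
```

In this sketch, a smaller `alpha` places more trust in the gradient estimates; how such a weight should be chosen in practice is outside what the abstract describes.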