Theses and Dissertations from UMD

Permanent URI for this community: http://hdl.handle.net/1903/2

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Search Results

  • EVALUATING METHODS FOR MODELING AND AGGREGATING CONTINUOUS DISTRIBUTIONS OF FORECASTER BELIEF
    (2017) Tidwell, Joe; Wallsten, Thomas; Dougherty, Michael; Psychology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The "wisdom of the crowds" is the concept that the average estimate of a group of judges is often more accurate than any single judge’s estimate. This dissertation explores a variety of elicitation, modeling, and aggregation methods for time-based forecasting questions at both the individual and consensus levels, and shows that accurate continuous forecast distributions can be modeled from relatively few judgments from individual forecasters. For individual forecasters, eliciting judgments with fixed rather than random cut points, and modeling those judgments with least-squares methods, led to the most accurate forecasts. While gamma distributions fit the empirical judgments more closely than exponential distributions, exponential fits yielded more accurate model forecasts, suggesting that the greater flexibility of the gamma distribution tended to over-fit the empirical forecasts. For consensus forecasts, random cut points across individual forecasters yielded more accurate forecasts than fixed cut points, suggesting that across a group of forecasters, random bins may help average over individual-level forecast errors introduced through partition dependence bias and an arbitrary set of fixed cut points. With respect to modeling methods, a mixture of forecaster distributions fit with a Bayesian Dirichlet-multinomial model performed best across a variety of metrics and yielded forecast accuracies on par with advanced discrete aggregation techniques. This model also provides a natural way to weight individual forecasters according to expertise and other factors. Differences in forecast accuracy between modeling methods varied greatly depending on when an event occurred relative to the range over which forecaster judgments were elicited, particularly when events occurred long after the last date for which forecasters provided judgments. In these cases, the modeled forecasts depend heavily on the assumptions of the model rather than on the elicited judgments, and forecasts should be cautiously interpreted as representing crowd belief. The results of this research show that with a limited number of discrete elicited judgments, it is possible to obtain continuous aggregate models of forecaster belief that are as accurate as discrete forecast aggregation methods, but that can also provide decision makers with forecasts for arbitrary partitions of the event space and can be easily integrated into a broad range of decision analyses.
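The abstract above mentions fitting exponential distributions to a forecaster's judgments by least squares. As a rough illustration only, the sketch below assumes that each judgment is a cumulative probability that the event occurs by a given cut point (in days), and it finds the rate by a simple grid search. The function names, the grid range, and the example judgments are all hypothetical and are not taken from the dissertation.

```python
import math


def exponential_cdf(t, rate):
    """Probability that the event has occurred by time t under an exponential model."""
    return 1.0 - math.exp(-rate * t)


def fit_exponential_least_squares(cut_points, cum_probs, n_grid=2000):
    """Return (rate, sse) minimizing the squared error between the exponential
    CDF and the judged cumulative probabilities, via a log-spaced grid search.

    The grid range (1e-4 to 10 events per day) is an arbitrary assumption.
    """
    best_rate, best_sse = None, float("inf")
    for i in range(n_grid + 1):
        rate = 10 ** (-4 + 5 * i / n_grid)
        sse = sum(
            (exponential_cdf(t, rate) - p) ** 2
            for t, p in zip(cut_points, cum_probs)
        )
        if sse < best_sse:
            best_rate, best_sse = rate, sse
    return best_rate, best_sse


# Hypothetical judgments: probability the event occurs within 30, 60, and 90 days.
cut_points = [30, 60, 90]
cum_probs = [0.25, 0.45, 0.60]

rate, sse = fit_exponential_least_squares(cut_points, cum_probs)

# The fitted continuous model can then be queried for an arbitrary partition,
# for example the probability that the event falls between day 45 and day 75.
p_45_to_75 = exponential_cdf(75, rate) - exponential_cdf(45, rate)
print(f"rate={rate:.5f}, sse={sse:.6f}, P(45 < T <= 75)={p_45_to_75:.3f}")
```

A gamma fit would add a shape parameter to the same objective; the abstract reports that this extra flexibility tended to over-fit.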
  • Anomaly Detection in Time Series: Theoretical and Practical Improvements for Disease Outbreak Detection
    (2009) Lotze, Thomas Harvey; Shmueli, Galit; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The automatic collection and increasing availability of health data provide a new opportunity for techniques to monitor this information. By monitoring pre-diagnostic data sources, such as over-the-counter cough medicine sales or emergency room chief complaints of cough, there exists the potential to detect disease outbreaks earlier than traditional laboratory disease confirmation allows. This research is particularly important for a modern, highly connected society, where the onset of a disease outbreak can be swift and deadly, whether caused by a naturally occurring global pandemic such as swine flu or a targeted act of bioterrorism. In this dissertation, we first describe the problem and current state of research in disease outbreak detection, then provide four main additions to the field. First, we formalize a framework for analyzing health series data and detecting anomalies: using forecasting methods to predict the next day's value, subtracting the forecast to create residuals, and finally using detection algorithms on the residuals. The formalized framework indicates the link between the forecast accuracy of the forecast method and the performance of the detector, and can be used to quantify and analyze the performance of a variety of heuristic methods. Second, we describe improvements for the forecasting of health data series. The application of weather as a predictor, cross-series covariates, and ensemble forecasting each provide improvements to forecasting health data. Third, we describe improvements for detection. These include the use of multivariate statistics for anomaly detection and additional day-of-week preprocessing to aid detection. Most significantly, we also provide a new method, based on the CuScore, for optimizing detection when the impact of the disease outbreak is known. This method can provide an optimal detector for rapid detection, or for probability of detection within a certain timeframe. Finally, we describe a method for improved comparison of detection methods. We provide tools to evaluate how well a simulated data set captures the characteristics of the authentic series, and time-lag heatmaps, a new way of visualizing daily detection rates or displaying the comparison between two methods in a more informative way.
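The forecast, residual, and detection steps named in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dissertation's method: the 7-day moving-average forecaster and the three-standard-deviation threshold are simple stand-ins (the CuScore-based detector is not implemented here), and the function names and daily counts are hypothetical.

```python
from statistics import mean, stdev


def moving_average_forecast(series, window=7):
    """One-step-ahead forecasts: the forecast for day t is the mean of the
    previous `window` observations (assumed stand-in forecaster)."""
    return [mean(series[t - window:t]) for t in range(window, len(series))]


def detect_anomalies(series, window=7, threshold=3.0):
    """Return the day indices whose residual (actual minus forecast) exceeds
    `threshold` standard deviations of all residuals (one-sided, upward)."""
    forecasts = moving_average_forecast(series, window)
    residuals = [actual - f for actual, f in zip(series[window:], forecasts)]
    sigma = stdev(residuals)
    return [window + i for i, r in enumerate(residuals) if r > threshold * sigma]


# Hypothetical daily counts: a repeating weekly pattern for four weeks,
# with 40 extra cases injected on day index 20 to mimic an outbreak signal.
weekly_pattern = [20, 22, 19, 21, 20, 23, 18]
daily_counts = weekly_pattern * 4
daily_counts[20] += 40

print(detect_anomalies(daily_counts))
```

Because the detector works only on residuals, a more accurate forecaster leaves less unexplained variation and makes the same outbreak signal easier to separate from noise, which is the link between forecast accuracy and detection performance that the abstract describes.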