Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
4 results
Search Results
Item Enabling Graph Analysis Over Relational Databases (2019)
Xirogiannopoulos, Konstantinos; Deshpande, Amol; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Complex interactions and systems can be modeled by analyzing the connections between underlying entities or objects described by a dataset. These relationships form networks (graphs), the analysis of which has been shown to provide tremendous value in areas ranging from retail to many scientific domains. This value is obtained by using various methodologies from network science, a field which focuses on studying network representations in the real world. In particular, "graph algorithms", which iteratively traverse a graph's connections, are often leveraged to gain insights. To take advantage of the opportunity presented by graph algorithms, a variety of specialized graph data management systems and analysis frameworks have been proposed in recent years, which have made significant advances in efficiently storing and analyzing graph-structured data. Most datasets, however, currently do not reside in these specialized systems but rather in general-purpose relational database management systems (RDBMS). A relational or similarly structured system is typically governed by a schema of varying strictness that implements constraints and is meticulously designed for the specific enterprise. Such structured datasets contain many relationships between the entities therein, which can be seen as latent or "hidden" graphs that exist inherently inside the datasets. However, these relationships can typically be traversed only by conducting expensive JOINs using SQL or similar languages. Thus, in order for users to efficiently traverse these latent graphs to conduct analysis, data needs to be transformed and migrated to specialized systems. This creates barriers that hinder and discourage graph analysis; our vision is to break these barriers.
In this dissertation, we investigate the opportunities and challenges involved in efficiently leveraging relationships within data stored in structured databases. First, we present GraphGen, a lightweight software layer that is independent from the underlying database, and provides interfaces for graph analysis of data in RDBMSs. GraphGen is the first such system that introduces an intuitive high-level language for specifying graphs of interest, and utilizes in-memory graph representations to tackle the problems associated with analyzing graphs that are hidden inside structured datasets. We show that GraphGen can analyze such graphs using orders of magnitude less memory, and often less computation time, while eliminating manual Extract-Transform-Load (ETL) effort. Second, we examine how in-memory graph representations of RDBMS data can be used to enhance relational query processing. We present a novel, general framework for executing GROUP BY aggregation over conjunctive queries which avoids materialization of intermediate JOIN results, and wrap this framework inside a multi-way relational operator called Join-Agg. We show that Join-Agg can compute aggregates over a class of relational and graph queries using orders of magnitude less memory and computation time.
Item EVALUATING METHODS FOR MODELING AND AGGREGATING CONTINUOUS DISTRIBUTIONS OF FORECASTER BELIEF (2017)
Tidwell, Joe; Wallsten, Thomas; Dougherty, Michael; Psychology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The "wisdom of the crowds" is the concept that the average estimate of a group of judges is often more accurate than any single judge’s estimate. This dissertation explores a variety of elicitation, modeling, and aggregation methods for time-based forecasting questions at both the individual and consensus levels, and shows that accurate continuous forecast distributions can be modeled from relatively few judgments from individual forecasters.
For individual forecasters, eliciting judgments with fixed rather than random cut points and modeling those judgments with least-squares methods led to the most accurate forecasts. While gamma distributions fit the empirical judgments more closely than exponential distributions, exponential fits yielded more accurate model forecasts, suggesting that the greater flexibility of the gamma distribution tended to over-fit the empirical forecasts. For consensus forecasts, random cut points across individual forecasters yielded more accurate forecasts than fixed cut points, suggesting that across a group of forecasters, random bins may help average over individual-level forecast errors introduced through partition dependence bias and an arbitrary set of fixed cut points. With respect to modeling methods, a mixture of forecaster distributions fit with a Bayesian Dirichlet-multinomial model performed best across a variety of metrics and yielded forecast accuracies on par with advanced discrete aggregation techniques. This model also provides a natural way to weight individual forecasters according to expertise and other factors. Differences in forecast accuracy between modeling methods varied greatly depending on when an event occurred relative to the range over which forecaster judgments were elicited, particularly when events occurred long after the last date for which forecasters provided judgments. In these cases, the modeled forecasts depend heavily on the assumptions of the model rather than on the elicited judgments, and forecasts should be cautiously interpreted as representing crowd belief.
The results of this research show that with a limited number of discrete elicited judgments, it is possible to obtain continuous aggregate models of forecaster belief that are as accurate as discrete forecast aggregation methods, but that can also provide decision makers with forecasts for arbitrary partitions of the event space and can be easily integrated into a broad range of decision analyses.
Item A META-DATA INFORMED EXPERT JUDGMENT AGGREGATION AND CALIBRATION TECHNIQUE (2016)
Feldman, Ellis; Mosleh, Ali; Reliability Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Policy makers use expert judgments, elicited as probability distributions, quantiles, or point estimates, as inputs to decisions that may have economic or life-and-death impacts. While challenges in estimating probabilities in general have been studied, research that distinguished between non-probabilistic, i.e., physical, variables and probabilistic variables specifically in the context of meta-data based expert judgment aggregation techniques, and the errors associated with the predictions developed from such variables, was not identified. This research demonstrated that for two combined expert judgment meta-data bases, the distinction between physical and probabilistic variables was significant in terms of the extent of multiplicative error between elicited medians and realized values both before and after aggregation. The distinction also impacts the widths of bounds around aggregated point estimates. The research compared nine methods of aggregating estimates and obtaining calibrated bounds, including ones based on alpha stable distributions, quantile regression, and a Bayesian model. Simple parametric distributions were also fit to the meta-data. Methods were compared against criteria including accuracy, bounds coverage and width, sensitivity to outliers, and complexity.
No single method dominated all criteria for either variable type. The research investigated sensitivity of results to level of realized value for a variable, such as infrequent events for probabilistic variables, as well as sensitivity of results to number of experts.
Item An Analysis of the Stability, Aggregation Propensity, and Negative Cooperativity of the Escherichia coli Chaperonin GroEL (2013)
Wehri, Sarah; Lorimer, George H; Biochemistry; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Since the discovery of chaperonin GroEL and co-chaperonin GroES, there has been a deluge of literature investigating many aspects of the system. Substrate proteins are protected from aggregation through a cycle of capture, encapsulation, and release made possible through rigid body motions of the GroE system driven by a combination of allosteric controls influenced by nucleotide, potassium and denatured protein termed substrate protein (SP). This dissertation first explores the sequential transition of GroEL that maintains the rings operating in an alternating fashion. To do this, an intra-subunit, inter-domain mutant, GroELD83AT state. Steady state ATPase assays, stopped-flow fluorescence, and gel filtration chromatography were all used to demonstrate that the trans ring must access the T state before ligands can be discharged from the cis ring. The dual-heptameric ring structure of GroEL and the post-translational assembly of the protein make creating mutants with a mutation within a single subunit of a ring almost impossible; however, the ability to do so opens the opportunity for a myriad of experiments that explore the allosteric transitions of GroEL. Two potential recombination methods, acetone treatment and heat treatment, were investigated. Förster resonance energy transfer (FRET) and electrospray ionization mass spectrometry (ESI-MS) were used to study recombination facilitated by such treatments.
Recombination using the acetone method resulted in a one-in-one-out subunit exchange; however, aggregation complicated the exchange. Heat treatment resulted in exchange of rings. Finally, dynamic light scattering (DLS) was used to investigate the stability and aggregation of the chaperonin. It was observed that the chaperonin is stable for over 30 days while incubated continuously at 37°C in sterile buffered solution; however, interesting aggregation kinetics are observed upon addition of acetone, the solvent used to strip SP from GroEL during the standard lab purification procedure. GroEL partitions into 10 nm and 100 nm species that are extremely stable before the appearance of macromolecular aggregates and precipitation is observed.