Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
6 results
Search Results
Item: Ubiquitous Accessibility Digital-Maps for Smart Cities: Principles and Realization (2019)
Ismail, Heba; Agrawala, Ashok; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
To support disabled individuals' active participation in society, the Americans with Disabilities Act (ADA) requires installing various accessibility measures in roads and public accommodation spaces such as malls and airports. For example, curb ramps are installed on sidewalks to help wheelchair users move onto and off sidewalks smoothly. However, to comply with the ADA requirements, it is sufficient to have one accessible route in a place, and usually there are no clear directions on how to reach that route. Hence, even ADA-compliant facilities can still be challenging for a disabled individual to access. To improve the accessibility of such spaces, systems have recently been proposed to rate the accessibility of outdoor walkways and intersections through active crowdsourcing, where individuals mark and/or validate a map’s accessibility assessments. Yet depending on humans limits the ubiquity, accuracy, and update rate of the generated maps. In this dissertation, we propose the AccessMap (Accessibility Digital Maps) system to build ubiquitous accessibility digital-maps automatically, where indoor and outdoor spaces are updated with various accessibility semantics and marked with an assessment of their accessibility levels for the vision- and mobility-impairment disability types. To build the maps automatically, we propose a passive crowdsourcing approach in which signals from the spatiotemporal sensors of users’ smartphones (e.g., barometer and accelerometer) are analyzed to detect and map the accessibility semantics. We present algorithms to passively detect various semantics such as accessible pedestrian signals and missing curb ramps.
We also present a probabilistic framework to construct the map while taking the uncertainty in the detected semantics and the sensors into account. AccessMap was evaluated in two different countries; the evaluation results show high detection accuracy for the different accessibility semantics. Moreover, the crowdsourcing framework helps further improve the map integrity over time. Additionally, to tag the crowdsourced data with location stamps, GPS is the de facto standard localization method, but it fails in indoor environments. Thus, we present the Hapi WiFi-based localization system to estimate the crowdsourcers’ location indoors. WiFi represents a promising technology for indoor localization due to its worldwide deployment. Nevertheless, current systems rely on a tedious, expensive offline calibration phase and/or focus on a single-floor area of interest. To address these limitations, Hapi combines signal processing, deep learning, and probabilistic models to estimate a user’s 2.5D location (i.e., the user’s floor level and 2D location within that floor) in a calibration-free manner. Our evaluation results show that, in high-rise buildings, we could achieve significant improvements over state-of-the-art indoor localization systems.

Item: Security and Trust in Distributed Computation (2015)
Liu, Xiangyang; Baras, John S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
We propose three research problems to explore the relations between trust and security in the setting of distributed computation. In the first problem, we study trust-based adversary detection in distributed consensus computation. The adversaries we consider behave arbitrarily, disobeying the consensus protocol. We propose a trust-based consensus algorithm with local and global trust evaluations.
The algorithm can be abstracted using a two-layer structure, with the top layer running a trust-based consensus algorithm and the bottom layer, as a subroutine, executing a global trust update scheme. We utilize a set of pre-trusted nodes, headers, to propagate local trust opinions throughout the network. This two-layer framework is flexible in that it can easily be extended to include more complicated decision rules and global trust schemes. The first problem assumes that normal nodes are homogeneous, i.e., it is guaranteed that a normal node always behaves as it is programmed. In the second and third problems, however, we assume that nodes are heterogeneous, i.e., given a task, the probability that a node generates a correct answer varies from node to node. The adversaries considered in these two problems are workers from the open crowd who either invest little effort in the tasks assigned to them or intentionally give wrong answers to questions. In the second part of the thesis, we consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with the issue of noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, using scalar-valued trust to model a worker's performance is not sufficient to reflect the worker's trustworthiness in each of the domains. To address this issue, we propose a probabilistic model to jointly infer multi-dimensional trust of workers, multi-domain properties of questions, and true labels of questions. Our model is very flexible and can be extended to incorporate metadata associated with questions. To show this, we further propose two extended models, one of which handles input tasks with real-valued features and the other of which handles tasks with text features by incorporating topic models.
Our models can effectively recover trust vectors of workers, which can be very useful for future task assignment that adapts to workers' trust. These results can be applied to the fusion of information from multiple data sources such as sensors, human input, machine learning results, or a hybrid of them. In the second subproblem, we address crowdsourcing with adversaries under logical constraints. We observe that questions are often not independent in real-life applications. Instead, there are logical relations between them. Similarly, workers that provide answers are not independent of each other either. Answers given by workers with similar attributes tend to be correlated. Therefore, we propose a novel unified graphical model consisting of two layers. The top layer encodes domain knowledge, which allows users to express logical relations using first-order logic rules, and the bottom layer encodes a traditional crowdsourcing graphical model. Our model can be seen as a generalized probabilistic soft logic framework that encodes both logical relations and probabilistic dependencies. To solve the collective inference problem efficiently, we have devised a scalable joint inference algorithm based on the alternating direction method of multipliers. The third part of the thesis considers the problem of optimal assignment under budget constraints when workers are unreliable and sometimes malicious. In a real crowdsourcing market, each answer obtained from a worker incurs a cost. The cost is associated with both the level of trustworthiness of workers and the difficulty of tasks. Typically, access to expert-level (more trustworthy) workers is more expensive than access to the average crowd, and completion of a challenging task is more costly than a click-away question. Here we address the optimal assignment of heterogeneous tasks to workers of varying trust levels under budget constraints.
Specifically, we design a trust-aware task allocation algorithm that takes as inputs the estimated trust of workers and a pre-set budget, and outputs the optimal assignment of tasks to workers. We derive a bound on the total error probability that relates naturally to the budget, the trustworthiness of crowds, and the cost of obtaining labels from crowds. A higher budget, more trustworthy crowds, and less costly jobs result in a lower theoretical bound. Our allocation scheme does not depend on the specific design of the trust evaluation component. Therefore, it can be combined with generic trust evaluation algorithms.

Item: CROWDSOURCING: A NOVEL GROUP-LEVEL MECHANISM STRUCTURES CHROMATIN AND FOSTERS GENE-COMPLEX ACTIVATION (2015)
Malin, Justin; Hannenhalli, Sridhar; Biology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Transcriptional regulation of a co-expressed gene network often relies on adoption of a three-dimensional conformation, dubbed a ‘chromatin hub’ or ‘regulatory archipelago’, which radically reduces spatial distances between genomically remote enhancers and gene targets, as well as among enhancers. While the advantage of spatial proximity for fostering pairwise interactions is self-evident, there has been limited exploration within archipelagos of higher-order interactions. Here we probe the evidence for a novel, group-level mechanism which, we hypothesize, is emergent when numerous coordinately acting regulatory enhancers, mediated by chromatin, converge in space. Based on functional human genomic data and biophysical modeling, and using a set of 40 enhancer archipelagos we identified through shared activity across 37 tissues, we show that three-dimensional juxtaposition of dozens of genomically dispersed binding sites for a given transcription factor (TF) can briefly ‘trap’ diffusing TF proteins, eliciting a spike in local TF concentration and a two-fold boost in its DNA occupancy at member enhancers.
We find substantial evidence for the role of this ‘crowdsourcing’ effect in tissue-specific gene-complex activation and, in the process, offer the first evidence for a predictable group-level modulator of TF occupancy that operates independently of genomic distance. In turn, crowdsourcing proves a surprising answer to the paradoxical source of binding specificity for degenerate TFs in general and for various master regulator TFs in particular. Additionally, we show that crowdsourcing likely contributes to super-enhancer functionality and speculate on crowdsourcing’s role in coordinating collectives of super-enhancers in cell lineage determination. Finally, we ask whether the biophysical impact of crowdsourcing also flows in the opposite direction. Here we find that coordinately acting enhancers adopt a more compact conformation, stereotypical of activated gene complexes, likely mediated by elevated TF concentrations. Together, we find compelling evidence for a novel and pervasive regulatory mechanism that is emergent at the level of the co-expressed gene module and which both mediates and is mediated by higher-order chromatin structure.

Item: Crowdsourcing decision support: frugal human computation for efficient decision input acquisition (2014)
Quinn, Alexander James; Bederson, Benjamin B; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
When faced with data-intensive decision problems, individuals, businesses, and governmental decision-makers must balance trade-offs between optimality and the high cost of conducting a thorough decision process. The unprecedented availability of information online has created opportunities to make well-informed, near-optimal decisions more efficiently. A key challenge that remains is the difficulty of efficiently gathering the requisite details in a form suitable for making the decision.
Human computation and social media have opened new avenues for gathering relevant information or opinions in support of a decision-making process. It is now possible to coordinate paid web workers from online labor markets such as Amazon Mechanical Turk and others in a distributed search party for the needed information. However, the strategies that individuals employ when confronted with too much information (satisficing, information foraging, etc.) are more difficult to apply with a large, distributed group. Consequently, current distributed approaches are inherently wasteful of human time and effort. This dissertation offers a method for coordinating workers to efficiently enter the inputs for spreadsheet decision models. As a basis for developing and understanding the idea, I developed AskSheet, a system that uses decision models represented as spreadsheets. The user provides a spreadsheet model of a decision, the formulas of which are analyzed to calculate the value of information for each of the decision inputs. With that, it is able to prioritize the inputs and make the decision input acquisition process more frugal. In doing so, it trades machine capacity spent analyzing the model for a reduction in the cost and burden to the humans providing the needed information.

Item: Crowdsourced Monolingual Translation (2012)
Hu, Chang; Bederson, Benjamin B; Resnik, Philip; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
An enormous potential exists for solving certain classes of computational problems through rich collaboration among crowds of humans supported by computers. Solutions to these problems used to involve human professionals who are expensive to hire or difficult to find. Despite significant advances, fully automatic systems still have much room for improvement.
Recent research has involved recruiting large crowds of skilled humans ("crowdsourcing"), but crowdsourcing solutions are still restricted by the availability of those skilled human participants. With translation, for example, professional translators incur high cost and are not always available; machine translation systems have been greatly improved recently, but still can only provide passable translation, and for only limited language pairs at that; crowdsourced translation is limited by the availability of bilingual humans. This dissertation describes crowdsourced monolingual translation, where monolingual translation is translation performed by monolingual people. Crowdsourced monolingual translation is a collaborative form of translation performed by two crowds of people who speak the source or the target language respectively, with machine translation as the mediating device. A general protocol to handle crowdsourced monolingual translation is introduced along with three systems that implement the protocol. The MonoTrans system initially established the feasibility of the protocol. Then, MonoTrans2 enabled lab experiments with a second implementation of the protocol. MonoTrans2 was also applied to an emergency-response scenario in a developing country (Haiti). The MonoTrans Widgets system was deployed to a large crowd of casual web users with a third implementation of the protocol. These systems were studied in various settings and were found to improve quality over both machine translation and monolingual post-editing.

Item: A distributional and syntactic approach to fine-grained opinion mining (2011)
Sayeed, Asad Basheer; Weinberg, Amy S; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This thesis contributes to a larger social science research program of analyzing the diffusion of IT innovations.
We show how to automatically discriminate portions of text dealing with opinions about innovations by finding {source, target, opinion} triples in text. In this context, we can discern a list of innovations as targets from the domain itself. We can then use this list as an anchor for finding the other two members of the triple at a "fine-grained" level: paragraph contexts or less. We first demonstrate a vector space model for finding opinionated contexts in which the innovation targets are mentioned. We can find paragraph-level contexts by searching for an "expresses-an-opinion-about" relation between sources and targets using a supervised model with an SVM that uses features derived from a general-purpose subjectivity lexicon and a corpus indexing tool. We show that our algorithm correctly filters the domain-relevant subset of subjectivity terms so that they are more highly valued. We then turn to identifying the opinion. Typically, opinions in opinion mining are taken to be positive or negative. We discuss a crowdsourcing technique developed to create the seed data describing human perception of opinion-bearing language needed for our supervised learning algorithm. Our user interface successfully limited the meta-subjectivity inherent in the task ("What is an opinion?") while reliably retrieving relevant opinionated words using labour not expert in the domain. Finally, we developed a new data structure and modeling technique for connecting targets with the correct within-sentence opinionated language. Syntactic relatedness tries (SRTs) contain all paths from a dependency graph of a sentence that connect a target expression to a candidate opinionated word. We use factor graphs to model how far a path through the SRT must be followed in order to connect the right targets to the right words.
It turns out that we can correctly label significant portions of these tries with very rudimentary features such as part-of-speech tags and dependency labels with minimal processing. This technique uses the data from the crowdsourcing technique we developed as training data. We conclude by placing our work in the context of a larger sentiment classification pipeline and by describing a model for learning from the data structures produced by our work. This work contributes to computational linguistics by proposing and verifying new data gathering techniques and applying recent developments in machine learning to inference over grammatical structures for highly subjective purposes. It applies a suffix tree-based data structure to model opinion in a specific domain by imposing a restriction on the order in which the data is stored in the structure.