Theses and Dissertations from UMD

Permanent URI for this community: http://hdl.handle.net/1903/2

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a four-month delay before a given thesis/dissertation appears in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Search Results

Now showing 1 - 10 of 22
  • Item
    The Language of Central Banking: Probing Global Monetary Policy Communications Spillovers and Central Bank Shocks with Natural Language Processing Tools and a Novel Text Database
    (2024) Baird, Cory; Swagel, Phillip; Public Policy; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The discipline of macroeconomics relies mainly on structured data for empirical research, despite unstructured text data being vastly more abundant. This text data, particularly central bank communications, holds untapped potential for monetary economics research due to its influence on market expectations and policy outcomes like inflation. To help guide monetary policy researchers in exploring the growing universe of text data, this research lays out a foundational framework, both in terms of coding infrastructure and Natural Language Processing (NLP) methods. The first step in building out this infrastructure is the creation of a new open-source central bank text database comprising 2,418 monetary policy statements from 14 countries. I leverage this novel database to explore the literature on "information effects," which has relied mainly on structured data for empirical analysis despite the possibility that the phenomenon itself is attributable to the linguistic elements or sentiment expressed in central bank communications.

Chapter 1 (The Anatomy of a Central Bank Statement and Information Shocks) details the steps necessary to create a reproducible and scalable database of monetary policy statements from a diverse group of countries using the latest open-source technologies and modern data science practices. I find that positive co-movement between policy rates and equities (what the literature defines as an "information shock") is a common event, with almost half of all policy rate increases (decreases) occurring alongside higher (lower) equity prices. With linguistic regressions and part-of-speech annotations, I provide novel linguistic evidence that information shocks are likely related to both the future state of the economy (Nakamura and Steinsson, 2018) and inflation expectations (Boehm, 2021).

Chapter 2 (Sentiment Analysis: From Past to Present) develops a novel approach for extracting sentiment at the sentence level using cutting-edge transformer models, the architecture behind many large language models (LLMs). My research demonstrates that transformer models, as well as the traditional lexical methods employed in the economics literature, can produce starkly divergent results when applied to the same monetary policy statement. This highlights the critical need to use multiple sentiment measures to ensure the robustness of any findings derived from textual analysis. Reinforcing the linguistic evidence from Chapter 1, I show that positive (negative) sentiment is associated with positive (negative) information shocks, providing further evidence that the shocks are driven by the language of the statement itself. I also show that positive sentiment is associated with higher GDP growth in the quarters following a monetary policy statement.

Chapter 3 (Central Bank Shocks and Global Spillovers) aggregates the sentiment measures from the previous chapter to produce what I call the Global Policy Stance (GPS). I find that the GPS, led by the U.S., Japan, and Switzerland, tends to co-move with the global financial cycle (the Global Asset Prices Factor of Miranda-Agrippino and Rey, 2020). I also find that domestic sentiment, rather than U.S. or global sentiment, is predictive of future policy rate changes, suggesting that markets may be more sensitive to the communications of the home country's central bank.
This thesis sets a rigorous standard for database transparency and code reproducibility, above and beyond what is standard practice in the economics literature today. I will publicly release the codebase encompassing data retrieval, cleaning, figure generation, and model development, all of which was produced using the open-source Python programming language. Through this public release, I will provide researchers with valuable coding infrastructure that supports the operationalization of best practices in data management, enabling (1) the creation of open-source databases that foster collaboration and automation, and (2) the development of reproducible, scalable algorithms for text classification and text cleaning. In the future, I intend to further build out the central bank database to include other types of monetary policy communications (e.g., minutes and speeches), while separately maintaining a repository of text classification algorithms (e.g., positive and negative sentiment), including lexical dictionaries from the literature as well as fine-tuned transformer models.
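As a concrete illustration of the divergence the abstract describes between lexical and transformer-based sentiment, here is a minimal Python sketch. It is not drawn from the dissertation's released codebase: the word lists and the generic sentiment model are illustrative stand-ins for the lexical dictionaries and fine-tuned models used in the literature.

```python
# Minimal sketch: comparing lexical vs. transformer sentiment on a
# monetary policy sentence. The word lists below are illustrative
# stand-ins, not the dictionaries or models used in the dissertation.
from transformers import pipeline

POSITIVE = {"growth", "strong", "robust", "expansion", "improved"}
NEGATIVE = {"weak", "slowdown", "uncertainty", "decline", "risks"}

def lexical_score(sentence: str) -> float:
    """Net count of positive minus negative lexicon hits, per token."""
    tokens = sentence.lower().split()
    pos = sum(t.strip(".,") in POSITIVE for t in tokens)
    neg = sum(t.strip(".,") in NEGATIVE for t in tokens)
    return (pos - neg) / max(len(tokens), 1)

sentence = ("Economic activity has shown robust growth, although "
            "risks to the outlook remain.")

# Generic sentiment model; a finance-tuned model (e.g., FinBERT) would
# be a closer match to the methods the abstract describes.
transformer = pipeline("sentiment-analysis")

print("lexical:", lexical_score(sentence))
print("transformer:", transformer(sentence)[0])
```

On a hedged sentence like this one, the two measures can easily disagree in sign, which is the robustness concern the abstract raises.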
  • Item
    A Multifaceted Quantification of Bias in Large Language Models
    (2023) Sotnikova, Anna; Daumé III, Hal; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Language models are rapidly developing, demonstrating impressive capabilities in comprehending, generating, and manipulating text. As they advance, they unlock diverse applications across various domains and become increasingly integrated into our daily lives. Nevertheless, these models, trained on vast and unfiltered datasets, come with a range of potential drawbacks and ethical issues. One significant concern is the potential amplification of biases present in the training data, generating stereotypes and reinforcing societal injustices when language models are deployed. In this work, we propose methods to quantify biases in large language models. We examine stereotypical associations for a wide variety of social groups characterized by both single and intersectional identities. Additionally, we propose a framework for measuring stereotype leakage across different languages within multilingual large language models. Finally, we introduce an algorithm that allows us to optimize human data collection under conditions of high human disagreement.
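One common way to probe the kind of stereotypical associations this abstract describes is to compare the probability a masked language model assigns to an attribute word across social groups. The sketch below illustrates that general idea only; the template, groups, and attribute are invented examples, and the dissertation's actual metrics differ.

```python
# Minimal sketch of probing stereotypical associations in a masked
# language model: compare the probability mass the model puts on an
# attribute word for different social groups. Illustrative only; not
# the dissertation's actual measurement method.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def attribute_score(group: str, attribute: str) -> float:
    """Probability the model assigns to `attribute` in the template."""
    template = f"The {group} person is [MASK]."
    for candidate in fill(template, targets=[attribute]):
        return candidate["score"]
    return 0.0

for group in ["young", "old"]:
    print(group, attribute_score(group, "slow"))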
  • Item
    DATA-DRIVEN RISK MODELING FOR INFRASTRUCTURE PROJECTS USING ARTIFICIAL INTELLIGENCE TECHNIQUES
    (2023) Erfani, Abdolmajid; Cui, Qingbin; Civil Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Managing project risk is a key part of the successful implementation of any large project and is widely recognized as a best practice for public agencies to deliver infrastructure. The conventional method of identifying and evaluating project risks involves getting input from subject matter experts at risk workshops in the early phases of a project. As a project moves through its life cycle, these identified risks and their assessments evolve. Some risks are realized and become issues, some are mitigated, and some are retired as no longer important. Despite the value provided by conventional expert-based approaches, several challenges remain because the processes involved are time-consuming and expensive. Moreover, little is known about how risks evolve from ex-ante to ex-post over time. How well does the project team identify and evaluate risks in the initial phase compared to what happens during project execution? Using historical data and artificial intelligence techniques, this study addresses these limitations by introducing a data-driven framework to identify risks automatically and to examine the quality of early risk registers and risk assessments. Risk registers from more than 70 U.S. major transportation projects form the input dataset. First, the study reports a high degree of similarity between the risk registers of different projects, both across the register documents as a whole and in the probability and consequence ratings of individual risk items, suggesting that it is feasible to develop a common risk register. Second, the developed data-driven model for identifying common risks has a recall of over 66% and an F1 score of 0.59 on new projects; that is, knowledge and experience from similar previous projects can help identify more than 66% of risks at the start. Third, approximately 65% of ex-ante identified risks actually occur in projects and are mitigated, while more than 35% do not occur and are retired. The categorization of risk management styles illustrates that identifying risks early on is important but not sufficient for successful project delivery. During project execution, project teams demonstrating positive doer behavior (actively monitoring and identifying risks) performed better. Finally, this study proposes using a data-driven approach to unify and summarize existing risk documents to create a comprehensive risk breakdown structure (RBS). Study results suggest that acquired knowledge from previous projects helps project teams and public agencies identify risks more effectively than starting from scratch using solely expert judgments.
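The register-similarity analysis described here can be pictured with a small TF-IDF sketch: match a new project's risk items against historical registers by cosine similarity, and count a risk as identifiable from history when a sufficiently similar match exists. The risk items and the threshold below are invented for illustration, not taken from the study's dataset.

```python
# Minimal sketch of text-similarity matching between risk registers:
# TF-IDF vectors for risk items, cosine similarity against historical
# registers. All risk items here are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

historical = [
    "Right-of-way acquisition delays increase schedule risk",
    "Utility relocation conflicts with third parties",
    "Permitting delays due to environmental review",
]
new_project = [
    "Delays acquiring right-of-way parcels",
    "Unforeseen geotechnical conditions",
]

vectorizer = TfidfVectorizer().fit(historical + new_project)
sims = cosine_similarity(vectorizer.transform(new_project),
                         vectorizer.transform(historical))

# Treat a new risk as "identified from history" if some historical risk
# is sufficiently similar; recall over such matches is one way to
# arrive at figures like the >66% recall reported in the abstract.
THRESHOLD = 0.3
for risk, row in zip(new_project, sims):
    print(f"{risk!r}: best match {row.max():.2f}, "
          f"matched={bool(row.max() >= THRESHOLD)}")
```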
  • Item
    Learning and Composing Primitives for the Visual World
    (2023) Gupta, Kamal; Shrivastava, Abhinav; Davis, Larry; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Compositionality is at the core of how humans understand and create visual data. For computational approaches to assist humans in creative tasks, it is crucial that they understand and perform composition. Recent advances in deep generative models have enabled us to convert noise into highly realistic scenes. However, in order to harness these models for building real-world applications, I argue that we need to be able to represent and control the generation process with the composition of interpretable primitives. In the first half of this talk, I’ll discuss how deep models can discover such primitives from visual data. By playing a cooperative referential game between two neural network agents, we can represent images with discrete, meaningful concepts without supervision. I further extend this work to applications in image and video editing by learning a dense correspondence of primitives across images. In the second half, I’ll focus on learning how to compose primitives for both 2D and 3D visual data. By expressing scenes as an assembly of smaller parts, we can easily perform generation from scratch or from partial scenes as input. I’ll conclude the talk with a discussion of possible future directions and applications of generative models, and of how we can better enable users to guide the creative process.
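As a rough illustration of what "discrete meaningful concepts" can look like computationally, the PyTorch sketch below quantizes continuous features against a codebook, replacing each vector with its nearest discrete entry. This is a generic vector-quantization device for intuition, not the referential-game method the abstract describes.

```python
# Illustrative sketch (not the dissertation's method): one standard way
# to turn continuous image features into discrete "concepts" is a
# codebook lookup, as in vector quantization. Each feature vector is
# replaced by the index of its nearest codebook entry.
import torch

torch.manual_seed(0)
num_concepts, dim = 16, 8
codebook = torch.randn(num_concepts, dim)   # learned in practice
features = torch.randn(4, dim)              # e.g., image patch features

dists = torch.cdist(features, codebook)     # shape (4, 16)
concept_ids = dists.argmin(dim=1)           # discrete concept indices
quantized = codebook[concept_ids]           # discrete representation

print("concept ids:", concept_ids.tolist())
```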
  • Item
    Stronger Inductive Biases for Sample-Efficient and Controllable Neural Machine Translation
    (2023) Xu, Weijia; Carpuat, Marine; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
As one of the oldest applications of natural language processing, machine translation (MT) has a growing impact on human lives, both as an end application and as a key component of cross-lingual information processing such as cross-lingual information retrieval and dialogue generation. Although neural machine translation (NMT) models achieve impressive performance on some language pairs, they require training on large amounts of human translations. In addition, they are notorious for generating fluent outputs that do not faithfully reflect the meaning of the source sentence, and they make it difficult for users to control the outputs. To address these issues, this thesis contributes techniques to build more sample-efficient and controllable NMT models by incorporating stronger inductive biases that help correct undesirable biases, integrate prior knowledge, and introduce flexible ways to control the outputs in NMT. In our first line of research, we show that current NMT models are susceptible to undesirable biases that hinder sample-efficient training and lead to unfaithful translations. We further provide evidence that we can mitigate these undesirable biases by integrating stronger inductive biases through training algorithms. We start by introducing a new training objective to address the exposure bias problem — a common problem in sequence generation models that typically causes accumulated errors along the generated sequence at inference time, especially when the training data is limited. Next, we turn to a well-known but less studied problem in MT — the hallucination problem: translation outputs that are unrelated to the source text. To find spurious biases that cause hallucination errors, we first identify model symptoms that are indicative of hallucinations at inference time. Then, we show how these symptoms connect to spurious biases at training time, where the model learns to predict the ground-truth translation while ignoring a large part of the source sentence. These findings provide a path toward mitigating hallucinations by addressing these spurious biases. In our second line of research, we study how to integrate stronger inductive biases in NMT for effective integration of language priors estimated from unsupervised data. We introduce a novel semi-supervised learning objective with a theoretical guarantee on its global optimum and show that it can be effectively approximated and leads to improved performance in practice. Finally, we study inductive biases in the form of NMT model architectures that allow end users to control the model outputs more easily. Controlling the outputs of standard NMT models is difficult, with high computational cost at training or inference time. We develop an edit-based NMT model with novel edit operations that can incorporate users' lexical constraints with low computational cost at both training and inference time. To allow users to provide lexical constraints in more flexible morphological forms, we further introduce a modular framework for inflecting and integrating lexical constraints in NMT.
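For intuition about the exposure bias problem this abstract targets, the sketch below contrasts teacher forcing with scheduled sampling (Bengio et al., 2015), one classic mitigation in which the decoder sometimes conditions on its own predictions during training. The thesis proposes a different training objective; the toy decoder here is purely illustrative.

```python
# Sketch of the exposure-bias setup. Under pure teacher forcing, the
# decoder always conditions on gold prefixes at training time but on
# its own (possibly wrong) outputs at inference time. Scheduled
# sampling mixes the two so train and test conditions match better.
import random

def decode_step(prev_token: str) -> str:
    """Stand-in for one decoder step; a real model predicts a token."""
    return "model_" + prev_token

def training_prefix(gold: list, sample_prob: float) -> list:
    inputs, prev = [], "<s>"
    for gold_tok in gold:
        inputs.append(prev)
        pred = decode_step(prev)
        # With probability sample_prob, feed the model's own prediction
        # instead of the gold token at the next step.
        prev = pred if random.random() < sample_prob else gold_tok
    return inputs

print(training_prefix(["the", "cat", "sat"], sample_prob=0.25))
```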
  • Item
    EXPERT-IN-THE-LOOP FOR SEQUENTIAL DECISIONS AND PREDICTIONS
    (2021) Brantley, Kiante; Daumé III, Hal; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Sequential decisions and predictions are common problems in natural language processing, robotics, and video games. Essentially, an agent interacts with an environment to learn how to solve a particular problem. Research in sequential decisions and predictions has increased due in part to the success of reinforcement learning. However, this success has come at the cost of algorithms being very data-inefficient, making learning in the real world difficult. Our primary goal is to make these algorithms more data-efficient using an expert in the loop (e.g., imitation learning). Imitation learning is a technique for using an expert in sequential decision and prediction problems. Naive imitation learning has a covariate shift problem (i.e., the training distribution differs from the test distribution). We propose methods and ideas to address this issue, as well as other issues that arise in different styles of imitation learning. In particular, we study three broad areas of using an expert in the loop for sequential decisions and predictions. First, we study the most popular category of imitation learning, interactive imitation learning. Although interactive imitation learning addresses issues around the covariate shift problem in naive imitation learning, it does so with a trade-off: it assumes access to an online interactive expert, which is unrealistic. Instead, we propose a setting where this assumption is realistic and attempt to reduce the number of queries needed for interactive imitation learning. We further introduce a new category of imitation learning algorithms called Reward-Learning Imitation Learning. Unlike interactive imitation learning, these algorithms address the covariate shift using only demonstration data instead of querying an online interactive expert. This category of imitation learning algorithms assumes access to an underlying reinforcement learning algorithm that can optimize a reward function learned from demonstration data. We benchmark all algorithms in this category and relate them to modern structured prediction NLP problems. Beyond reward-learning imitation learning and interactive imitation learning, some problems cannot be naturally expressed and solved using these two categories of algorithms; for example, learning an algorithm that solves a particular problem while also satisfying safety constraints. We introduce expert-in-the-loop techniques that extend beyond traditional imitation learning paradigms, where an expert provides demonstration features or constraints instead of state-action pairs.
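The interactive imitation learning setting described above is exemplified by the DAgger-style loop (Ross et al., 2011): roll out the learner, query the expert on the states the learner actually visits, aggregate the labels, and retrain. The toy environment and policies in this sketch are invented for illustration.

```python
# Minimal sketch of a DAgger-style interactive imitation loop. The
# expert, dynamics, and "retraining" are toy stand-ins; the point is
# that expert labels are collected on the learner's own visited states,
# which is what addresses covariate shift.
import random

def expert_policy(state: int) -> int:
    return state % 2                        # stand-in expert

def rollout(policy, horizon: int = 5) -> list:
    state, states = 0, []
    for _ in range(horizon):
        states.append(state)
        state = state + 1 + policy(state)   # toy dynamics
    return states

dataset = []
learner = lambda s: random.choice([0, 1])   # untrained learner

for _ in range(3):                          # DAgger iterations
    for state in rollout(learner):          # visit learner's own states
        dataset.append((state, expert_policy(state)))
    # "Retrain" by memorizing expert labels (a real system fits a model).
    table = dict(dataset)
    learner = lambda s, t=table: t.get(s, 0)

print(sorted(set(dataset)))
```

The thesis's concern is precisely the `expert_policy` call inside the loop: querying a live expert on every visited state is expensive, which motivates reducing the number of such queries.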
  • Item
    Human-in-the-Loop Question Answering with Natural Language Interaction
    (2021) Ghoneim, Ahmed Elgohary; Boyd-Graber, Jordan; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Generalizing beyond the training examples is the primary goal of machine learning. In natural language processing (NLP), impressive models struggle to generalize when faced with test examples that differ from the training examples, e.g., in genre, domain, or language. I study interactive methods that overcome such limitations by seeking feedback from human users to successfully complete the task at hand and improve over time while on the job. Unlike previous work that adopts simple forms of feedback (e.g., labeling predictions as correct/wrong or answering yes/no clarification questions), I focus on using free-form natural language as the communication interface for providing feedback, which can convey richer information and offer a more flexible interaction. An essential skill that language-based interactive systems should have is understanding user utterances in conversational contexts. I study conversational question answering (CQA), in which humans interact with a question answering (QA) system by asking a sequence of related questions. CQA requires models to link questions together to resolve the conversational dependencies between them, such as coreference and ellipsis. I introduce question-in-context rewriting to reduce context-dependent conversational questions to independent stand-alone questions that can be answered with existing QA models. I collect a large dataset of human rewrites and use it to evaluate a set of models for the question rewriting task. Next, I study semantic parsing in interactive settings in which users correct parsing errors using natural language feedback. Most existing work frames semantic parsing as a one-shot mapping task. I establish that the majority of parsing mistakes that recent neural text-to-SQL parsers make are minor; hence, it is often feasible for humans to detect and suggest corrections for such mistakes if they have the opportunity to provide precise feedback. I describe an interactive text-to-SQL parsing system that enables users to inspect the inferred parses and correct any errors they find by providing feedback in free-form natural language. I construct SPLASH, a large dataset of SQL correction instances paired with a diverse set of human-authored natural language feedback utterances. Using SPLASH, I pose a new task: given a question paired with an initial erroneous SQL parse, to what extent can we correct the parse based on the provided natural language feedback? Then, I present NL-EDIT, a neural model for the correction task. NL-EDIT combines two key ideas: (1) interpreting the feedback in the context of the other elements of the interaction, and (2) explicitly generating edit operations to correct the initial query instead of re-generating the full query from scratch. I create a simple SQL editing language whose basic units are add/delete operations applied to different SQL clauses. I discuss evaluation methods that help understand the usefulness and limitations of semantic parse correction models. I conclude this thesis by identifying three broad research directions for further advancing collaborative human-computer NLP: (1) developing user-centered explanations, (2) designing and evaluating interaction mechanisms, and (3) learning from interactions.
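The add/delete clause edits the abstract mentions can be pictured with a small sketch like the one below, in which a query is a map from clauses to items and feedback is interpreted as edit operations. The clause model and operation names are invented for illustration; NL-EDIT's actual edit language differs in its details.

```python
# Minimal sketch of clause-level SQL edit operations in the spirit of
# the editing language the abstract describes. The query representation
# and operations here are illustrative inventions.
INITIAL = {
    "SELECT": ["name"],
    "FROM": ["employees"],
    "WHERE": ["salary > 50000"],
}

def apply_edit(query: dict, op: str, clause: str, item: str) -> dict:
    edited = {k: list(v) for k, v in query.items()}
    if op == "add":
        edited.setdefault(clause, []).append(item)
    elif op == "delete":
        edited[clause] = [x for x in edited[clause] if x != item]
    return edited

# Feedback like "also show the department, and drop the salary filter"
# might be interpreted as two edit operations:
query = apply_edit(INITIAL, "add", "SELECT", "department")
query = apply_edit(query, "delete", "WHERE", "salary > 50000")

print(query)
```

Generating edits rather than whole queries keeps most of the (already mostly correct) initial parse intact, which is the design motivation the abstract gives for NL-EDIT.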
  • Item
    A Human-centric Approach to NLP in Healthcare Applications
    (2021) Shing, Han-Chin; Resnik, Philip; Oard, Douglas W; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The abundance of personal health information available to healthcare professionals can be a facilitator of better care. However, it can also be a barrier, as the relevant information is often buried in the sheer amount of personal data, and healthcare professionals already lack time to take care of both patients and their data. This dissertation focuses on the role of natural language processing (NLP) in healthcare and how it can surface information relevant to healthcare professionals by modeling the extensive collections of documents that describe those whom they serve. In this dissertation, the extensive natural language data about a person is modeled as a set of documents, where the model inference is at the level of the individual, but the evidence supporting that inference is found in a subset of their documents. The effectiveness of this modeling approach is demonstrated in the context of three healthcare applications. In the first application, clinical coding, document-level attention is used to model the hierarchy between a clinical encounter and its documents, jointly learning the encounter labels and the assignment of credit to specific documents. The second application, suicidality assessment using social media, further investigates how document-level attention can surface "high-signal" posts from the document set representing a potentially at-risk individual. Finally, the third application aims to help healthcare professionals write discharge summaries, using an extract-then-abstract multi-document summarization pipeline to surface relevant information. As in many healthcare applications, these three applications seek to assist, not replace, clinicians. Evaluation and model design thus center on healthcare professionals' needs. In clinical coding, document-level attention is shown to align well with professional clinical coders' expectations of evidence. In suicidality assessment, document-level attention leads to better and more time-efficient assessment by surfacing document-level evidence, shown empirically using a theoretically grounded time-aware evaluation measure and a dataset annotated by suicidality experts. Finally, extract-then-abstract summarization pipelines that assist healthcare professionals in writing discharge summaries are evaluated by their ability to surface faithful and relevant evidence.
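The document-level attention idea running through these applications can be sketched compactly: score each document in a person's set, softmax the scores, and pool into a person-level representation, with the weights doubling as the "which documents are evidence" signal. The dimensions and scoring layer below are illustrative choices, not the dissertation's architecture.

```python
# Minimal sketch of document-level attention over a person's document
# set: a learned scorer assigns a weight to each document embedding,
# and the weighted sum is the person-level representation.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_docs, dim = 5, 16
doc_embeddings = torch.randn(num_docs, dim)   # one vector per document

scorer = torch.nn.Linear(dim, 1)              # learned in practice
weights = F.softmax(scorer(doc_embeddings).squeeze(-1), dim=0)
person_repr = weights @ doc_embeddings        # attention-weighted pooling

# The weights themselves indicate which documents serve as evidence
# for the person-level inference.
print("attention over documents:", [round(w, 3) for w in weights.tolist()])
```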
  • Item
    AN ANALYSIS OF BOTTOM-UP ATTENTION MODELS AND MULTIMODAL REPRESENTATION LEARNING FOR VISUAL QUESTION ANSWERING
    (2019) Narayanan, Venkatraman; Shrivastava, Abhinav; Systems Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Visual Question Answering (VQA) is the task of taking an image and an open-ended, natural language question about the image and producing a natural language text answer as the output. VQA is a relatively nascent field, with only a few strategies explored so far. The performance of VQA systems, in terms of the accuracy of answers to image-question pairs, requires a considerable overhaul before such systems can be used in practice. The general system for performing the VQA task consists of an image encoder network, a question encoder network, a multi-modal attention network that combines the information obtained from the image and the question, and an answering network that generates natural language answers for the image-question pair. In this thesis, we follow two strategies to improve the performance (accuracy) of VQA. The first is a representation learning approach, utilizing state-of-the-art Generative Adversarial Networks (GANs) (Goodfellow et al., 2014), to improve the image encoding system of VQA. This thesis evaluates four variants of GANs to identify a GAN architecture that best captures the data distribution of the images, and it was determined that the GAN variants become unstable and fail to provide a viable image encoding system for VQA. The second strategy is to evaluate an alternative approach to the attention network, using multi-modal compact bilinear pooling, in the existing VQA system. The second strategy led to an increase in the accuracy of VQA by 2% compared to the current state-of-the-art technique.
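The multi-modal compact bilinear pooling used in the second strategy (Fukui et al., 2016) can be sketched briefly: count-sketch each modality's feature vector and combine the sketches by circular convolution, computed as an elementwise product in FFT space. The dimensions below are illustrative.

```python
# Minimal sketch of multi-modal compact bilinear pooling: count-sketch
# projections of the image and question features are combined via
# circular convolution, done as an elementwise product of FFTs. This
# approximates the (huge) outer-product bilinear interaction compactly.
import torch

torch.manual_seed(0)
dim, out_dim = 128, 512

def count_sketch(x, h, s, out_dim):
    """Project x to out_dim using hash indices h and random signs s."""
    y = torch.zeros(out_dim)
    y.index_add_(0, h, s * x)
    return y

image_feat = torch.randn(dim)
question_feat = torch.randn(dim)

# Random (but fixed) hash indices and signs, one pair per modality.
h1, h2 = (torch.randint(0, out_dim, (dim,)) for _ in range(2))
s1, s2 = (torch.randint(0, 2, (dim,)).float() * 2 - 1 for _ in range(2))

sketch1 = count_sketch(image_feat, h1, s1, out_dim)
sketch2 = count_sketch(question_feat, h2, s2, out_dim)

# Elementwise product in FFT space == circular convolution of sketches.
fused = torch.fft.irfft(torch.fft.rfft(sketch1) * torch.fft.rfft(sketch2),
                        n=out_dim)
print(fused.shape)  # a 512-d fused multimodal representation
```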
  • Item
    Teaching Machines to Ask Useful Clarification Questions
    (2018) Rao, Sudha; Daumé III, Hal; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Inquiry is fundamental to communication, and machines cannot effectively collaborate with humans unless they can ask questions. Asking questions is also a natural way for machines to express uncertainty, a task of increasing importance in an automated society. In the field of natural language processing, despite decades of work on question answering, there is relatively little work on question asking. Moreover, most previous work has focused on generating reading-comprehension-style questions that are answerable from the provided text. The goal of my dissertation work, on the other hand, is to understand how we can teach machines to ask clarification questions that point at the missing information in a text. Primarily, we focus on two scenarios where we find such question asking to be useful: (1) clarification questions on posts found in community-driven technical support forums such as StackExchange, and (2) clarification questions on descriptions of products in e-retail platforms such as Amazon. In this dissertation we claim that, given large amounts of previously asked questions in various contexts (within a particular scenario), we can build machine learning models that can ask useful questions in a new, unseen context (within the same scenario). To validate this hypothesis, we first create two large datasets of contexts paired with clarification questions (and answers) for the two scenarios of technical support and e-retail by automatically extracting this information from available data dumps of StackExchange and Amazon.

Given these datasets, in our first line of research we build a machine learning model that first extracts a set of candidate clarification questions and then ranks them such that a more useful question appears higher in the ranking. Our model is inspired by the idea of the expected value of perfect information: a good question is one whose expected answer will be useful. We hypothesize that by explicitly modeling the value added by an answer to a given context, our model can learn to identify more useful questions. We evaluate our model against expert human judgments on the StackExchange dataset and demonstrate significant improvements over controlled baselines.

In our second line of research, we build a machine learning model that learns to generate a new clarification question from scratch, instead of ranking previously seen questions. We hypothesize that we can train our model to generate good clarification questions by incorporating the usefulness of an answer to the clarification question into recent sequence-to-sequence neural network approaches. We develop a Generative Adversarial Network (GAN) in which the generator is a sequence-to-sequence model and the discriminator is a utility function that models the value of updating the context with the answer to the clarification question. We evaluate our model on our two datasets of StackExchange and Amazon, using both automatic metrics and human judgments of usefulness, specificity, and relevance, showing that our approach outperforms both a retrieval-based model and ablations that exclude the utility model and the adversarial training. We observe that our question generation model generates questions that span a wide spectrum of specificity to the given context. We argue that generating questions at a desired level of specificity (to a given context) can be useful in many scenarios.
In our last line of research, we therefore build a question generation model which, given a context and a level of specificity (generic or specific), generates a question at that level of specificity. We hypothesize that by providing the level of specificity of the question to our model during training, it can learn patterns in the question that indicate the level of specificity and use those to generate questions at a desired level of specificity. To automatically label the large number of questions in our training data with their level of specificity, we train a binary classifier which, given a context and a question, predicts whether the question is specific (to the context) or generic. We demonstrate the effectiveness of our specificity-controlled question generation model by evaluating it on the Amazon dataset using human judgments.
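The binary specificity classifier described in this last line of research can be pictured with a small sketch: featurize (context, question) pairs and fit a linear classifier. The toy examples and TF-IDF features below are stand-ins; the dissertation's classifier and training data are far larger.

```python
# Minimal sketch of a binary specificity classifier: given a context
# and a question, predict specific-to-context (1) vs. generic (0).
# Examples and features are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pairs = [
    ("ultrabook laptop 8gb ram", "does it have a backlit keyboard?"),
    ("ultrabook laptop 8gb ram", "what are the dimensions?"),
    ("hiking tent two person", "is the rainfly included?"),
    ("hiking tent two person", "what is the warranty?"),
]
labels = [1, 0, 1, 0]   # 1 = specific to the context, 0 = generic

texts = [c + " [SEP] " + q for c, q in pairs]
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["hiking tent two person [SEP] does it fit a queen pad?"]))
```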