Theses and Dissertations from UMD

Permanent URI for this community: http://hdl.handle.net/1903/2

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Search Results

Now showing 1 - 5 of 5
  • AI Empowered Music Education
    (2024) Shrestha, Snehesh; Aloimonos, Yiannis; Fermüller, Cornelia; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Learning a musical instrument is a complex process involving years of practice and feedback. However, dropout rates in music programs, particularly among violin students, remain high due to socio-economic barriers and the challenge of mastering the instrument. This work explores the feasibility of accelerating learning and leveraging technology in music education, with a focus on bowed string instruments, specifically the violin. My research identifies workflow gaps and challenges for the stakeholders, aiming to address not only the improvement of learning outcomes but also the provision of opportunities for socioeconomically challenged students. Three key areas are emphasized: designing user studies and creating a comprehensive violin dataset, developing tools and deep learning algorithms for accurate performance assessment, and crafting a practice platform for student feedback. Three fundamental perspectives were essential: a) understanding the stakeholders and their specific challenges, b) understanding how the instrument operates and what actions the player must master to control its functions, and c) addressing the technical challenges associated with constructing and implementing detection and feedback systems. The existing datasets were inadequate for analyzing violin playing, primarily due to their lack of diversity of body types and skill levels, as well as the absence of well-synchronized and calibrated video data, along with corresponding ground truth 3D poses and musical events. Our experiment design ensured that the collected data would be suitable for subsequent downstream tasks. These considerations played a significant role in determining the metrics used to evaluate the accuracy of the data and the success metrics for the subsequent tasks. At the foundation of movement analysis lies 3D human pose estimation.
Unfortunately, the current state-of-the-art algorithms face challenges in accurately estimating monocular 3D poses during instrument playing. These challenges arise from factors such as occlusions, partial views, human-object interactions, limited viewing angles, pixel density, and camera sampling rates. To address these issues, we developed a novel 3D pose estimation algorithm based on the insight that the music produced by the violin is a direct result of the corresponding motions. Our algorithm integrates visual observations with audio inputs to generate precise, high-resolution 3D pose estimates that are temporally consistent and conducive to downstream tasks. Providing effective feedback to learners is a nuanced process that requires balancing encouragement with challenge. Without a user-friendly interface and a motivational strategy, feedback runs the risk of being counterproductive. While current systems excel at detecting pitch and temporal misalignments and visually displaying them for analysis, they often overwhelm players. In this dissertation, we introduce two novel feedback systems. The first is a visual-haptic feedback system that overlays simple augmented cues on the user's body, gently guiding them back to the correct posture. The second is a haptic band synchronized with the music, enhancing students' perception of rhythmic timing and bowing intensities. Additionally, we developed an intuitive user interface for real-time feedback during practice sessions and performance reviews. This data can be shared with teachers to provide deeper insight into students' struggles and to track progress. This research aims to empower both students and teachers. By providing students with feedback during individual practice sessions and equipping teachers with tools to monitor and tailor AI interventions according to their preferences, this work serves as a valuable teaching assistant.
By addressing tasks that teachers may not prefer or be physically able to perform, such as personalized feedback and progress tracking, this research endeavors to democratize access to high-quality music education and mitigate dropout rates in music programs.
  • Exploring The Role Of Generative Artificial Intelligence In Cultural Relevant Storytelling For Native Language Learning Among Children
    (2024) Nanduri, Dinesh Kumar; Marsh, Diana E; Information Studies; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In an era marked by the rapid disappearance of languages, UNESCO warns that nearly half of the world's linguistic heritage might soon become dormant. Despite its current health, Telugu has seen a decline in usage, reduced focus in India's educational systems, and overshadowing by dominant global languages. This thesis explores Generative Artificial Intelligence (GenAI) to counter this trend, focusing on its application in native language learning for children, key carriers of their ancestral tongues. Through scoping reviews and participatory design sessions with young Telugu-speaking learners and their guardians, the study investigates GenAI's role in enhancing language learning tailored to individual and cultural contexts. It highlights storytelling as a potent mechanism for language acquisition, facilitated by GenAI's ability to personalize learning experiences and bridge generational gaps. The research also addresses ethical considerations vital for designing GenAI tools, promoting inclusivity, bias mitigation, and cultural integrity protection. It showcases a future where technology helps prevent linguistic dormancy and empowers children to celebrate human language and cultural diversity.
  • APPLICANT REACTIONS TO ARTIFICIAL INTELLIGENCE SELECTION SYSTEMS
    (2022) Bedemariam, Rewina Sahle; Wessel, Jennifer; Psychology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Practitioners have embraced the use of AI and Machine Learning systems for employee recruitment and selection. However, studies examining applicant reactions to such systems are lacking in the literature. Specifically, little is known about how job applicants react to AI-based selection systems. This study assessed fairness perceptions of hiring decisions made by AI-driven systems and whether significant differences existed between different groups of people. To do so, a two-by-two experimental study where participants in a selection scenario are randomly assigned to a decision-maker condition (human vs AI) and outcome variability condition (hired vs rejected) was utilized. The results showed that the condition had a significant effect on the interactional justice dimension. The interaction effect of outcome and condition had an impact on job-relatedness, chance to perform, reconsideration opportunity, feedback perceptions, and interactional justice. The three-way interaction of outcome, race and condition influenced general fairness reactions and emotional reactions. Given these findings, HR personnel should weigh the pros and cons of AI, especially towards applicants that are rejected.
  • Transfer Learning in Natural Language Processing through Interactive Feedback
    (2022) Yuan, Michelle; Boyd-Graber, Jordan; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Machine learning models cannot easily adapt to new domains and applications. This drawback becomes detrimental for natural language processing (NLP) because language is perpetually changing. Across disciplines and languages, there are noticeable differences in content, grammar, and vocabulary. To overcome these shifts, recent NLP breakthroughs focus on transfer learning. Through clever optimization and engineering, a model can successfully adapt to a new domain or task. However, these modifications are still computationally inefficient or resource-intensive. Compared to machines, humans are more capable of generalizing knowledge across different situations, especially in low-resource ones. Therefore, the research on transfer learning should carefully consider how the user interacts with the model. The goal of this dissertation is to investigate “human-in-the-loop” approaches for transfer learning in NLP. First, we design annotation frameworks for inductive transfer learning, which is the transfer of models across tasks. We create an interactive topic modeling system for users to find topics useful for classifying documents in multiple languages. The user-constructed topic model improves classification accuracy and bridges cross-lingual gaps in knowledge. Next, we look at popular language models, like BERT, that can be applied to various tasks. While these models are useful, they still require a large amount of labeled data to learn a new task. To reduce labeling, we develop an active learning strategy which samples documents that surprise the language model. Users only need to annotate a small subset of these unexpected documents to adapt the language model for text classification. Then, we transition to user interaction in transductive transfer learning, which is the transfer of models across domains. We focus our efforts on low-resource languages to develop an interactive system for word embeddings.
In this approach, the feedback from bilingual speakers refines the cross-lingual embedding space for classification tasks. Subsequently, we look at domain shift for tasks beyond text classification. Coreference resolution is fundamental for NLP applications, like question-answering and dialogue, but the models are typically trained and evaluated on one dataset. We use active learning to find spans of text in the new domain for users to label. Furthermore, we provide important insights on annotating spans for domain adaptation. Finally, we summarize the contributions of each chapter. We focus on aspects like the scope of applications and model complexity. We conclude with a discussion of future directions. Researchers may extend the ideas in our thesis to topics like user-centric active learning and proactive learning.
  • Active Vision Based Embodied-AI Design For Nano-UAV Autonomy
    (2021) Jagannatha Sanket, Nitin; Aloimonos, Yiannis; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The human fascination to mimic ultra-efficient flying beings like birds and bees has led to a rapid rise in aerial robots in the recent decade. These aerial robots now possess a market share of over 10 billion US dollars. The future for aerial robots or Unmanned Aerial Vehicles (UAVs), which are commonly called drones, is very bright because of their utility in a myriad of applications. I envision drones delivering packages to our homes, finding survivors in collapsed buildings, pollinating flowers, inspecting bridges, performing surveillance of cities, in sports and even as pets. In particular, quadrotors have become the go-to platform for aerial robotics due to simplicity in their mechanical design, their vertical takeoff and landing capabilities and agility characteristics. Our eternal pursuit to improve drone safety and improve power efficiency has given rise to the research and development of smaller yet smarter drones. Furthermore, smaller drones are more agile and task-distributable as swarms. Embodied Artificial Intelligence (AI) has been a big fuel to push this area further. Classically, the approach to designing such nano-drones possesses a strict distinction between perception, planning and control and relies on a 3D map of the scene that is used to plan paths that are executed by a control algorithm. On the contrary, nature’s never-ending quest to improve the efficiency of flying agents through genetic evolution led to birds developing amazing eyes and brains tailored for agile flight in complex environments as a software and hardware co-design solution. In contrast, smaller flying agents such as insects that are at the other end of the size and computation spectrum adopted an ingenious approach – to utilize movement to gather more information.
Early pioneers of robotics remarked on this observation and coined the concept of “Active Perception” which proposed that one can move in an exploratory way to gather more information to compensate for lack of computation and sensing. Such a controlled movement imposes additional constraints on the data being perceived to make the perception problem simpler. Inspired by this concept, in this thesis, I present a novel approach for algorithmic design on nano aerial robots (flying robots the size of a hummingbird) based on active perception by tightly coupling and combining perception, planning and control into sensorimotor loops using only on-board sensing and computation. This is done by re-imagining each aerial robot as a series of hierarchical sensorimotor loops where the higher ones require the inner ones such that resources and computation can be efficiently re-used. Activeness is presented and utilized in four different forms to enable large-scale autonomy at tight Size, Weight, Area and Power (SWAP) constraints not heard of before. The four forms of activeness are: 1. By moving the agent itself, 2. By employing an active sensor, 3. By moving a part of the agent’s body, 4. By hallucinating active movements. Next, to make this work practically applicable I show how hardware and software co-design can be performed to optimize the form of active perception to be used. Finally, I present the world’s first prototype of a RoboBeeHive that shows how to integrate multiple competences centered around active vision in all its glory. Following is a list of contributions of this thesis: • The world’s first functional prototype of a RoboBeeHive that can artificially pollinate flowers. • The first method that allows a quadrotor to fly through gaps of unknown shape, location and size using a single monocular camera with only on-board sensing and computation.
• The first method to dodge dynamic obstacles of unknown shape, size and location on a quadrotor using a monocular event camera. Our series of shallow neural networks is trained in simulation and transfers to the real world without any fine-tuning or re-training. • The first method to detect unmarked drones by detecting propellers. Our neural network is trained in simulation and transfers to the real world without any fine-tuning or re-training. • A method to adaptively change the baseline of a stereo camera system for quadrotor navigation. • The first method to introduce the usage of saliency to select features in a direct visual odometry pipeline. • A comprehensive benchmark of software and hardware for embodied AI which would serve as a blueprint for researchers and practitioners alike.