Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a four-month delay before a given thesis/dissertation appears in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
Search Results
4 results
Item: FOUNDATIONS OF TRUSTWORTHY DEEP LEARNING: FAIRNESS, ROBUSTNESS, AND EXPLAINABILITY (2024)
Nanda, Vedant; Dickerson, John; Gummadi, Krishna; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Deep Learning (DL) models, especially with the rise of so-called foundation models, are increasingly used in real-world applications as autonomous systems (e.g., facial recognition), as decision aids (e.g., medical imaging, writing assistants), and even to generate novel content (e.g., chatbots, image generators). This naturally raises concerns about the trustworthiness of these systems: do the models systematically perform worse for certain subgroups? Are their outputs reliable under perturbations to the inputs? This thesis aims to strengthen the foundations of DL models so they can be trusted in deployment. I will cover three important aspects of trust: fairness, robustness, and explainability. I will argue that we need to expand the scope of each of these aspects when applying them to DL models and carefully consider possible tradeoffs between these desirable but sometimes conflicting notions of trust.

Traditionally, the fairness community has worked on mitigating biases in classical models such as Support Vector Machines (SVMs) and logistic regression. However, many real-world applications in which bias appears involve much more complicated DL models. In the first part, I will present two works that show how thinking about fairness for DL introduces new challenges, especially due to the overparametrized nature of DL models and their susceptibility to adversarial attacks.

The robustness literature has focused largely on measuring the invariance of models to carefully constructed (adversarial attacks) or natural (distribution shifts) noise. In the second part, I will argue that to get truly robust models, we must focus on a more general notion of robustness: measuring the alignment of the invariances of DL models with those of other models of perception, such as humans. I will present two works that measure shared invariances (1) between DL models and humans and (2) between DL models. Such measurements provide a measure of relative robustness, through which we can better understand the failure modes of DL models and work towards building truly robust systems.

Finally, in the third part, I will show how even a small subset of randomly chosen neurons from a pre-trained representation can transfer very well to downstream tasks (a minimal sketch of such a measurement appears after the last item in this listing). We call this phenomenon diffused redundancy, which we observe in a variety of pre-trained representations. This finding challenges existing beliefs in the explainability literature that individual neurons learn disjoint, semantically meaningful concepts.

Item: TOWARD A DATA LITERACY ASSESSMENT THAT IS FAIR FOR LANGUAGE MINORITY STUDENTS (2023)
Yeom, Semi; O'Flahavan, John; Education Policy and Leadership; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Data literacy is crucial for adolescents to access and navigate data in today's technology-driven world. Researchers emphasize the need for K-12 students to attain data literacy. However, few available instructional programs have incorporated validated assessments.
Therefore, I developed and implemented the Data Literacy Assessment for Middle graders (DLA-M), which can diagnose students' current stages fairly and support future practice regardless of their language backgrounds. I initiated the study with two research questions: a) How valid is a newly developed assessment for measuring middle-grade students' data literacy? b) How fairly does the new assessment measure data literacy regardless of students' language backgrounds? The new assessment purported to measure two competencies of data literacy for 6th to 9th graders: a) interpret data representations and b) evaluate data and data-based claims. I used Evidence-Centered Design (ECD) as a methodological framework to increase the validity of the assessment, following the five layers of the ECD framework to develop and implement the DLA-M. I then analyzed the data from implementing the assessment and gathered five types of validity evidence for validation. Based on the collected validity evidence, I concluded that the assessment represented the content domain it purported to measure. The assessment had internal consistency in measuring data literacy except for nine eliminated items, and the data literacy scores from the overall assessment were also reliable. Regarding item quality, item discrimination parameters met the quality criteria, but difficulty estimates of some items did not match the intended design. Empirical cluster analyses revealed two performance levels among the participants. Differential item functioning analyses showed that item discrimination and difficulty did not differ between language minority students (LMSs) and their counterparts at the same data literacy level. These results revealed no evidence of unfair interpretation or use of the assessment for LMSs. Lastly, I found significant interaction effects between DLA-M scores and two variables concerning students' English reading proficiency and use of technology. This study delineated how to develop and validate a data literacy assessment that can support students from different linguistic backgrounds. The research also facilitated the application of data literacy assessments in school settings by scrutinizing and defining target competencies that can benefit adolescents' data literacy. The findings can inform future research implementing data literacy assessments in broader contexts, and this study can serve as a springboard for providing inclusive data literacy assessments for diverse student populations.

Item: Security Enhancement and Bias Mitigation for Emerging Sensing and Learning Systems (2021)
Chen, Mingliang; Wu, Min; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Artificial intelligence (AI) has been applied to a wide range of practical tasks in recent years, facilitating many aspects of our daily life. With AI-based sensing and learning systems, we can enjoy automated decision making, computer-assisted medical diagnosis, and health monitoring. Because these algorithms have entered human society and influence our daily life, important issues such as intellectual property protection, access control, privacy protection, and fairness/equity should be considered when developing them, in addition to their performance. In this dissertation, we improve the design of emerging AI-based sensing and learning systems from security and fairness perspectives.
The first part addresses the security protection of deep neural networks (DNNs). DNNs are becoming an emerging form of intellectual property for model owners and should be protected from unauthorized access and piracy to encourage healthy business investment and competition. Taking advantage of the intrinsic mechanisms of DNNs, we propose a novel framework that provides access control for trained DNNs so that only authorized users can utilize them, preventing piracy and illicit usage.

The second part addresses privacy protection in facial videos. Remote photoplethysmography (rPPG) can be used to collect a person's physiological signal when his/her face is captured by a video camera, which raises privacy issues from two aspects. First, an individual's health conditions may be revealed unintentionally from a facial recording without his/her explicit consent. To avoid this physiological privacy issue, we develop PulseEdit, a novel and efficient algorithm that edits the physiological signals in facial videos without affecting their visual appearance, protecting the person's physiological signal from disclosure. On the other hand, R&D of rPPG technology also risks leaking identity privacy: developing rPPG algorithms usually requires public benchmark facial datasets, but facial videos are very sensitive and carry a high risk of identity leakage. We therefore develop an anonymization transform that removes sensitive visual information identifying an individual while preserving the physiological information needed for rPPG analysis.

In the last part, we investigate fairness in machine learning inference. Various fairness definitions were proposed in prior art to ensure that decisions guided by machine learning models are equitable. Unfortunately, a "fair" model trained with these fairness definitions is sensitive to the decision threshold, i.e., the fairness condition no longer holds when the threshold is tuned (a minimal illustration appears after the last item in this listing). To this end, we introduce the notion of threshold-invariant fairness, which enforces equitable performance across different groups independent of the decision threshold.

Item: On Fairness in Secure Computation (2010)
Gordon, Samuel Dov; Katz, Jonathan; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Secure computation is a fundamental problem in modern cryptography in which multiple parties jointly compute a function of their private inputs without revealing anything beyond the output of the function. A series of very strong results in the 1980s demonstrated that any polynomial-time function can be computed while guaranteeing essentially every desired security property. The only exception is the fairness property, which states that no player should receive their output from the computation unless all players receive their output. While it was shown that fairness can be achieved whenever a majority of the players are honest, it was also shown that fairness is impossible to achieve in general when half or more of the players are dishonest. Indeed, it was proven that even Boolean XOR cannot be computed fairly by two parties. The fairness property is both natural and important, and as such it was one of the first questions addressed in modern cryptography (in the context of signature exchange). One contribution of this thesis is to survey the many approaches that have been used to guarantee different notions of partial fairness.
We then revisit the topic of fairness within a modern security framework for secure computation. We demonstrate that, despite the strong impossibility result mentioned above, certain interesting functions can be computed fairly, even when half (or more) of the parties are malicious. We also provide a new notion of partial fairness, demonstrate the feasibility of achieving this notion for a large class of functions, and show impossibility for certain functions outside this class. We consider fairness in the presence of rational adversaries and, finally, further study the difficulty of achieving fairness by exploring how much external help is necessary to enable fair secure computation.
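The diffused-redundancy phenomenon described in the first item above (FOUNDATIONS OF TRUSTWORTHY DEEP LEARNING) is commonly probed with linear-probe transfer experiments. The sketch below is not the dissertation's code: it substitutes synthetic features for real pre-trained activations, picks an arbitrary 10% subset size, and uses scikit-learn's LogisticRegression as the probe, all of which are illustrative assumptions.

# Illustrative sketch: compare a linear probe trained on a full (synthetic)
# representation with one trained on a random 10% of its dimensions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for pre-trained features; in practice these would be
# penultimate-layer activations of a pre-trained network on a downstream task.
d = 512
w = rng.normal(size=d)                    # dense "ground-truth" direction
train_x = rng.normal(size=(2000, d))
test_x = rng.normal(size=(1000, d))
train_y = (train_x @ w > 0).astype(int)
test_y = (test_x @ w > 0).astype(int)

def probe_accuracy(neuron_idx=None):
    """Fit a linear probe on (a subset of) the representation; return test accuracy."""
    cols = slice(None) if neuron_idx is None else neuron_idx
    clf = LogisticRegression(max_iter=2000).fit(train_x[:, cols], train_y)
    return clf.score(test_x[:, cols], test_y)

subset = rng.choice(d, size=d // 10, replace=False)   # random 10% of "neurons"
print(f"full representation:    {probe_accuracy():.3f}")
print(f"random 10% of neurons:  {probe_accuracy(subset):.3f}")

With real pre-trained representations, the reported finding is that the subset probe comes surprisingly close to the full-representation probe; the synthetic data here only demonstrates the measurement procedure, not the result.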
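The third item above (Security Enhancement and Bias Mitigation for Emerging Sensing and Learning Systems) motivates threshold-invariant fairness by noting that fairness enforced at one decision threshold can fail at another. The following sketch is purely illustrative and not the dissertation's method: it uses synthetic score distributions for two hypothetical groups and measures a demographic-parity gap at several thresholds.

# Illustrative sketch: the demographic-parity gap between two (synthetic)
# groups changes as the decision threshold moves.
import numpy as np

rng = np.random.default_rng(0)
scores_a = rng.beta(2, 2, size=10_000)   # model scores for hypothetical group A
scores_b = rng.beta(4, 3, size=10_000)   # model scores for hypothetical group B

def positive_rate(scores, threshold):
    """Fraction of a group receiving a positive decision at this threshold."""
    return float(np.mean(scores >= threshold))

for t in (0.3, 0.5, 0.7):
    gap = abs(positive_rate(scores_a, t) - positive_rate(scores_b, t))
    print(f"threshold={t:.1f}  demographic-parity gap={gap:.3f}")

# A model tuned to appear fair at one operating point need not remain fair at
# another, which is the motivation for a threshold-invariant notion of fairness.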