Towards Learning Non-Spurious Function: From Representation to Decision Through Data and Model Design

dc.contributor.advisor: Huang, Furong (en_US)
dc.contributor.author: Liu, Xiaoyu (en_US)
dc.contributor.department: Computer Science (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.date.accessioned: 2025-08-08T12:02:46Z
dc.date.issued: 2025 (en_US)
dc.description.abstract: Modern machine learning models, particularly deep neural networks, have demonstrated remarkable capabilities across a variety of tasks. However, their success often hinges not only on model capacity but also on the structure of the data and the presence of spurious correlations. This dissertation investigates the foundations of learning non-spurious decision functions—functions that capture stable, generalizable patterns rather than superficial artifacts in the data. We begin by analyzing the expressive power of neural networks, defined by the function classes that different architectures can represent. In contrast to traditional approaches that fix model structure a priori, we propose a data-informed perspective, where structural choices are guided by the information content and complexity of the data. This view enables more efficient and targeted model designs that align with the intrinsic learning task. Building on this foundation, we explore how models trained on observational datasets are prone to learning spurious shortcuts, particularly when superficial features correlate with labels. We study the conditions under which such behavior emerges and propose methods to encourage the learning of non-spurious functions through structural and algorithmic interventions. Together, these contributions offer a unified framework that connects model expressivity, feature representation, and structural design. Our goal is to inform the development of learning systems that not only perform well empirically, but also base their decisions on meaningful, generalizable signals in the data. (en_US)
dc.identifier: https://doi.org/10.13016/qb87-btby
dc.identifier.uri: http://hdl.handle.net/1903/34196
dc.language.iso: en (en_US)
dc.subject.pqcontrolled: Computer science (en_US)
dc.subject.pqcontrolled: Artificial intelligence (en_US)
dc.title: Towards Learning Non-Spurious Function: From Representation to Decision Through Data and Model Design (en_US)
dc.type: Dissertation (en_US)

Files

Original bundle

Name: Liu_umd_0117E_25056.pdf
Size: 6.24 MB
Format: Adobe Portable Document Format