Towards Generalized and Scalable Machine Learning on Structured Data

dc.contributor.advisor: Goldstein, Tom (en_US)
dc.contributor.author: Kong, Kezhi (en_US)
dc.contributor.department: Computer Science (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.date.accessioned: 2024-06-28T05:43:56Z
dc.date.available: 2024-06-28T05:43:56Z
dc.date.issued: 2024 (en_US)
dc.description.abstract (en_US): Deep learning and neural networks have ushered in a transformative era for machine learning, significantly influencing how we approach and use structured data. This dissertation explores machine learning methods designed specifically for structured graphs and tables, aiming to improve the performance of neural networks on these important data modalities.
Graph Neural Networks (GNNs) have emerged as powerful architectures for learning and analyzing graph representations. However, training GNNs on large-scale datasets often suffers from overfitting, which poses significant generalization challenges for prediction problems. Conventional GNNs are also hindered by scalability problems when deployed on industrial-level graph datasets. Moreover, for the table reasoning task, Large Language Models (LLMs) have shown competitive ability, but they cannot fully process large tables because of context limits and may fail to comprehend the complex relationships within tabular data. In this dissertation, we investigate algorithms and techniques to address the generalization and scalability issues of GNNs, as well as effective and efficient approaches to the table reasoning task. In the first work, we leverage data augmentation to improve the generalization of GNNs. We propose FLAG (Free Large-scale Adversarial Augmentation on Graphs), which iteratively augments node features with gradient-based adversarial perturbations during training. In the second and third works, we address the scalability problem of GNNs. We propose VQ-GNN, a universal framework that scales up any convolution-based GNN using Vector Quantization (VQ) without compromising performance. We further propose GOAT, a global graph transformer that scales to large graphs with millions of nodes and is competitive on both homophilous and heterophilous graphs.
Lastly, we propose OpenTab, an effective method for the open-domain table reasoning task, built on advanced Large Language Models.
dc.identifier: https://doi.org/10.13016/ml6u-6zx2
dc.identifier.uri: http://hdl.handle.net/1903/32795
dc.language.iso: en (en_US)
dc.subject.pqcontrolled: Artificial intelligence (en_US)
dc.subject.pqcontrolled: Computer science (en_US)
dc.title: Towards Generalized and Scalable Machine Learning on Structured Data (en_US)
dc.type: Dissertation (en_US)

Files

Original bundle

Name: Kong_umd_0117E_24052.pdf
Size: 2.81 MB
Format: Adobe Portable Document Format