FROM INSTANCE-SPECIFIC TRAINING TO GENERALIZABLE SOLVERS: ADVANCING MACHINE LEARNING FOR SCIENTIFIC COMPUTING
Abstract
Neural networks have played a crucial role in scientific computing by providing data-driven solutions to complex mathematical and physical problems. However, traditional neural network solvers remain fundamentally limited in accuracy and lack interpretability, as they operate as black-box models with no explicit mathematical structure. To address these challenges, this thesis explores the Finite Expression Method (FEX), a symbolic regression approach designed to improve both interpretability and accuracy in scientific computing. FEX uses deep reinforcement learning to discover interpretable mathematical expressions, offering a principled way to solve high-dimensional partial differential equations (PDEs) and to uncover governing equations from experimental data. Despite these advantages, both neural network solvers and FEX must be retrained for each new equation or change in initial and boundary conditions, which limits their scalability and adaptability.
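To make the finite-expression idea concrete, the following minimal Python sketch illustrates one ingredient of such a search: a candidate solution is a short symbolic expression assembled from a fixed operator set, and its quality is scored by how well it satisfies a toy PDE on sample points. The operator dictionaries, the toy Poisson-type problem, and all function names here are illustrative assumptions, not the thesis implementation; in FEX proper the operator sequence is chosen by a reinforcement-learning controller and the constants are optimized rather than fixed by hand.

import numpy as np

# Illustrative operator sets from which a "finite expression" is composed.
UNARY_OPS = {"id": lambda z: z, "sin": np.sin, "exp": np.exp, "square": np.square}
BINARY_OPS = {"add": np.add, "mul": np.multiply}

def candidate(x, ops=("square", "add"), a=1.0, b=0.0):
    """Evaluate a tiny expression tree u(x) = a * op2(op1(x1), x2) + b (toy form)."""
    u1, u2 = ops
    return a * BINARY_OPS[u2](UNARY_OPS[u1](x[:, 0]), x[:, 1]) + b

def residual_score(x, ops):
    """Score a candidate by the residual of an assumed toy PDE, Laplace(u) = f,
    with derivatives approximated by central finite differences."""
    h = 1e-3
    lap = np.zeros(len(x))
    for d in range(x.shape[1]):
        e = np.zeros(x.shape[1]); e[d] = h
        lap += (candidate(x + e, ops) - 2 * candidate(x, ops) + candidate(x - e, ops)) / h**2
    f = 2.0 * np.ones(len(x))          # assumed right-hand side for this toy problem
    return -np.mean((lap - f) ** 2)    # higher is better (negative squared residual)

x = np.random.rand(128, 2)
print(residual_score(x, ("square", "add")))  # near zero: x1**2 + x2 solves the toy PDE

A reward of this kind is what a controller can use to rank operator choices; the interpretability comes from the fact that the final answer is a closed-form expression rather than network weights.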
To overcome these fundamental constraints, scientific computing is now transitioning to a second stage characterized by foundation models inspired by large language models. These models are pretrained on diverse scientific data and use in-context learning to generalize across a wide range of problems without instance-specific retraining. In this thesis, we introduce FMint, a foundation model designed for fast and accurate simulation of dynamical systems. FMint builds on a decoder-only transformer architecture and acts as an error corrector for coarse simulations, significantly improving accuracy while maintaining computational efficiency. By learning from a broad set of dynamical-system trajectories, FMint generalizes well to out-of-distribution dynamics and outperforms traditional neural network solvers.
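The error-correction pattern described above can be sketched in a few lines: a cheap coarse integrator advances the state with a large step, and a pretrained model supplies a per-step correction. In the hypothetical sketch below, correction_model is a do-nothing stub standing in for the learned corrector; in FMint this role is played by a decoder-only transformer conditioned on in-context demonstration trajectories. All names and the toy system are assumptions made for illustration.

import numpy as np

def coarse_euler_step(f, y, dt):
    """One forward-Euler step with a deliberately large step size dt."""
    return y + dt * f(y)

def correction_model(y_coarse, demos):
    """Placeholder for the learned corrector. `demos` would hold example
    (coarse step, true error) pairs from related systems, used in context."""
    return np.zeros_like(y_coarse)  # stub: no correction applied

def simulate(f, y0, dt, n_steps, demos=None):
    traj = [y0]
    for _ in range(n_steps):
        y_next = coarse_euler_step(f, traj[-1], dt)
        y_next = y_next + correction_model(y_next, demos)  # learned error correction
        traj.append(y_next)
    return np.stack(traj)

# Toy dynamical system: harmonic oscillator dy/dt = [y2, -y1].
f = lambda y: np.array([y[1], -y[0]])
print(simulate(f, np.array([1.0, 0.0]), dt=0.5, n_steps=4))

The appeal of this design is that the expensive fine solver is replaced by a single pretrained corrector: accuracy comparable to a small step size is recovered at the cost of a coarse simulation plus one model evaluation per step.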
This shift from single-instance solvers to foundation models marks a major transformation in scientific computing. FMint exemplifies how pretrained models can serve as a foundation for more advanced PDE solvers, demonstrating the value of large-scale pretraining and in-context learning for scientific applications. Its success points to a broader direction for future research, in which foundation models trained on diverse physical systems enable more generalizable, efficient, and interpretable solutions across a wide range of problems in computational science and engineering.