Efficient Optimization Algorithms for Nonconvex Machine Learning Problems

dc.contributor.advisor: Huang, Heng H
dc.contributor.author: Xian, Wenhan
dc.contributor.department: Computer Science
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2024-09-23T06:14:21Z
dc.date.available: 2024-09-23T06:14:21Z
dc.date.issued: 2024
dc.description.abstract: In recent years, the success of the AI revolution has led to the training of larger neural networks on vast amounts of data to achieve superior performance. These powerful machine learning models have enabled the creation of remarkable AI products. Optimization, as the core of machine learning, becomes especially crucial because most machine learning problems can ultimately be formulated as optimization problems, which require minimizing a loss function with respect to model parameters based on training samples. To enhance the efficiency of optimization algorithms, distributed learning has emerged as a popular solution for addressing large-scale machine learning tasks. In distributed learning, multiple worker nodes collaborate to train a global model. However, a key challenge in distributed learning is the communication cost. This thesis introduces a novel adaptive gradient algorithm with gradient sparsification to address this issue. Another significant challenge in distributed learning is the communication overhead on the central parameter server. To mitigate this bottleneck, decentralized (serverless) distributed learning has been proposed, in which each worker node communicates only with its neighbors. This thesis investigates core nonconvex optimization problems in decentralized settings, including constrained optimization, minimax optimization, and second-order optimality, and proposes efficient optimization algorithms to solve these problems. Additionally, the convergence analysis of minimax optimization under the generalized smooth condition is explored, and a generalized algorithm is proposed that applies to a broader range of applications.
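To make the communication-reduction idea in the abstract concrete, below is a minimal, illustrative sketch of top-k gradient sparsification in a synchronous parameter-server setup. This is a generic textbook-style scheme, not the adaptive gradient algorithm proposed in the dissertation; all function names and parameters here are hypothetical.

```python
def top_k_sparsify(grad, k):
    """Keep only the k largest-magnitude entries of a gradient vector;
    zero out the rest, so only k values need to be communicated."""
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    keep = set(idx)
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]

def distributed_step(params, worker_grads, k, lr=0.1):
    """One synchronous step: each worker sends a sparsified gradient,
    and the server averages them and updates the shared parameters."""
    sparsified = [top_k_sparsify(g, k) for g in worker_grads]
    n = len(worker_grads)
    avg = [sum(col) / n for col in zip(*sparsified)]
    return [p - lr * a for p, a in zip(params, avg)]

# Example: 2 workers, 2-dimensional parameters, each transmitting only 1 entry.
params = [0.0, 0.0]
grads = [[1.0, 0.5], [1.0, -0.5]]
new_params = distributed_step(params, grads, k=1)
```

In practice, methods in this family also accumulate the discarded (zeroed) entries in a local error-feedback buffer so that no gradient information is permanently lost; that refinement is omitted here for brevity.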
dc.identifier: https://doi.org/10.13016/r6ko-uppv
dc.identifier.uri: http://hdl.handle.net/1903/33413
dc.language.iso: en
dc.subject.pqcontrolled: Computer science
dc.subject.pquncontrolled: Decentralized Learning
dc.subject.pquncontrolled: Distributed Learning
dc.subject.pquncontrolled: Machine Learning
dc.subject.pquncontrolled: Optimization
dc.title: Efficient Optimization Algorithms for Nonconvex Machine Learning Problems
dc.type: Dissertation

Files

Original bundle

Name: Xian_umd_0117E_24601.pdf
Size: 2.97 MB
Format: Adobe Portable Document Format