Alternating Optimization: Constrained Problems, Adversarial Networks, and Robust Models

dc.contributor.advisor: Goldstein, Tom (en_US)
dc.contributor.author: Xu, Zheng (en_US)
dc.contributor.department: Computer Science (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.date.accessioned: 2019-09-27T05:41:11Z
dc.date.available: 2019-09-27T05:41:11Z
dc.date.issued: 2019 (en_US)
dc.description.abstract: Data-driven machine learning methods have achieved impressive performance in many industrial applications and academic tasks. Machine learning methods usually have two stages: training a model from large-scale samples, and inference on new samples after the model is deployed. Training modern models requires solving difficult optimization problems that involve nonconvex, nondifferentiable objective functions and constraints, which is sometimes slow and often requires expertise to tune hyperparameters. While inference is much faster than training, it is often not fast enough for real-time applications. We focus on machine learning problems whose training can be formulated as a minimax problem, and study alternating optimization methods that serve as fast, scalable, stable, and automated solvers. First, we focus on the alternating direction method of multipliers (ADMM) for constrained problems in classical convex and nonconvex optimization. Popular machine learning applications include sparse and low-rank models, regularized linear models, total variation image processing, semidefinite programming, and consensus distributed computing. We propose adaptive ADMM (AADMM), a fully automated solver that achieves fast practical convergence by adapting the only free parameter in ADMM. We further automate several variants of ADMM (relaxed ADMM, multi-block ADMM, and consensus ADMM), and prove convergence rate guarantees that apply broadly to ADMM variants with changing parameters. We release fast implementations for more than ten applications and validate their efficiency on several benchmark datasets for each application. Second, we focus on the minimax problem of generative adversarial networks (GANs). We apply prediction steps to stabilize stochastic alternating methods for training GANs, and demonstrate the advantages of GAN-based losses for image processing tasks. We also propose GAN-based knowledge distillation methods to train small neural networks for inference acceleration, and empirically study the trade-off between acceleration and accuracy. Third, we present preliminary results on adversarial training for robust models. We study fast algorithms for attacks and defenses against universal perturbations, and then explore network architectures that boost robustness. (en_US)
dc.identifier: https://doi.org/10.13016/xgwp-edlc
dc.identifier.uri: http://hdl.handle.net/1903/25045
dc.language.iso: en (en_US)
dc.subject.pqcontrolled: Computer science (en_US)
dc.subject.pquncontrolled: ADMM (en_US)
dc.subject.pquncontrolled: adversarial training (en_US)
dc.subject.pquncontrolled: Computer vision (en_US)
dc.subject.pquncontrolled: GAN (en_US)
dc.subject.pquncontrolled: Machine learning (en_US)
dc.subject.pquncontrolled: Optimization (en_US)
dc.title: Alternating Optimization: Constrained Problems, Adversarial Networks, and Robust Models (en_US)
dc.type: Dissertation (en_US)
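
The abstract above refers to two alternating-optimization ideas that a small sketch can make concrete. The first is adapting the only free parameter in ADMM (the penalty parameter). The sketch below is a minimal, generic illustration: scaled ADMM for the lasso problem with the well-known residual-balancing heuristic for the penalty. It is not AADMM's actual adaptation rule; the problem choice, function name lasso_admm, and constants mu and tau are illustrative assumptions, not taken from the dissertation.

```python
# Minimal sketch: scaled ADMM for the lasso problem
#   minimize 0.5*||Ax - b||^2 + lam*||z||_1  subject to x - z = 0,
# with a residual-balancing heuristic for the penalty parameter rho.
# This only illustrates the general idea of adapting ADMM's penalty;
# it is NOT the adaptation rule proposed in the dissertation.
import numpy as np

def lasso_admm(A, b, lam, rho=1.0, iters=200, mu=10.0, tau=2.0):
    m, n = A.shape
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                      # scaled dual variable (y / rho)
    AtA = A.T @ A
    Atb = A.T @ b
    for _ in range(iters):
        # x-update: solve the regularized least-squares subproblem
        x = np.linalg.solve(AtA + rho * np.eye(n), Atb + rho * (z - u))
        # z-update: soft-thresholding (proximal operator of the l1 norm)
        v = x + u
        z_old = z
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # dual ascent step
        u = u + x - z
        # residual balancing: keep primal and dual residuals comparable
        r = np.linalg.norm(x - z)                 # primal residual
        s = np.linalg.norm(rho * (z - z_old))     # dual residual
        if r > mu * s:
            rho *= tau
            u /= tau        # rescale the scaled dual when rho changes
        elif s > mu * r:
            rho /= tau
            u *= tau
    return x
```

The second idea is the prediction (lookahead) step used to stabilize alternating minimax updates when training GANs. The toy example below applies a prediction step of the assumed form x_bar = x_new + (x_new - x), on the bilinear game min_x max_y x*y: simultaneous gradient descent/ascent diverges on this game and plain alternating updates orbit the saddle without converging, whereas updating the maximizer against the predicted iterate contracts the iterates toward the saddle point (0, 0). The game, step size, and function name are illustrative assumptions, not the dissertation's GAN training procedure.

```python
# Toy illustration of a prediction (lookahead) step for alternating
# minimax optimization on the bilinear game  min_x max_y  x*y.
def bilinear_game(steps=500, lr=0.1):
    x, y = 1.0, 1.0
    for _ in range(steps):
        # minimizer step on f(x, y) = x*y:  df/dx = y
        x_new = x - lr * y
        # prediction: extrapolate along the minimizer's most recent move
        x_bar = x_new + (x_new - x)
        # maximizer step against the predicted iterate:  df/dy = x
        y = y + lr * x_bar
        x = x_new
    return x, y

# After 500 steps the iterates have contracted toward the saddle (0, 0).
print(bilinear_game())
```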

Files

Original bundle
Name: Xu_umd_0117E_20282.pdf
Size: 13.6 MB
Format: Adobe Portable Document Format