Efficient Models and Learning Strategies for Resource-Constrained Systems

Date

2024

Abstract

The last decade has seen sharp improvements in the performance of machine learning (ML) models, but at the cost of vastly increased complexity and size in their underlying architectures. Advances in high-performance computing have enabled researchers to train and deploy models composed of hundreds of billions of parameters. However, harnessing the full utility of large models on smaller clients, such as Internet of Things (IoT) devices, without resorting to external hosting will require a significant reduction in parameter counts and faster, cheaper inference. Beyond augmenting IoT, efficient models and learning paradigms can reduce energy consumption, promote technological equity, and support deployment in real-world applications that demand fast response in low-resource settings.

To address these challenges, we introduce multiple novel strategies for (1) reducing the scale of deep neural networks and (2) faster learning. For the size problem (1), we leverage tools such as tensorization, randomized projections, and locality-sensitive hashing to train on reduced representations of large models without sacrificing performance. For learning efficiency (2), we develop algorithms for cheaper forward passes, accelerated PCA, and asynchronous gradient descent. Several of these methods are tailored for federated learning (FL), a private, distributed learning paradigm in which data is decentralized among resource-constrained edge clients. We are concerned exclusively with improving efficiency during training -- our techniques neither post-process pre-trained models nor require a device to train over an architecture in its entirety.
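
To make the reduced-representation idea in (1) concrete, the following is a minimal PyTorch sketch of training a linear layer through a fixed randomized projection, so that only a small factor of the weight matrix ever receives gradients. This is an illustrative assumption about the general technique, not the dissertation's actual method; the class name ReducedLinear, the rank k, and all hyperparameters are hypothetical.

import torch
import torch.nn as nn

class ReducedLinear(nn.Module):
    """Hypothetical layer whose full d_out x d_in weight is never
    materialized as a trainable tensor; only a small k x d_in factor
    is learned behind a fixed random projection."""

    def __init__(self, d_in: int, d_out: int, k: int):
        super().__init__()
        # Fixed (non-trainable) random projection lifting k dims back to d_out.
        self.register_buffer("P", torch.randn(d_out, k) / k ** 0.5)
        # Only this reduced factor and the bias receive gradients.
        self.W_small = nn.Parameter(torch.randn(k, d_in) * 0.02)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Equivalent to x @ (P @ W_small).T + bias, computed without
        # ever forming the full d_out x d_in matrix.
        return (x @ self.W_small.T) @ self.P.T + self.bias

# Toy training step: gradients flow only to the k x d_in factor and bias.
layer = ReducedLinear(d_in=1024, d_out=1024, k=64)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, y = torch.randn(32, 1024), torch.randn(32, 1024)
loss = ((layer(x) - y) ** 2).mean()
loss.backward()
opt.step()

Under these assumptions, training touches roughly k * d_in parameters instead of d_out * d_in, which is the kind of memory and compute saving the abstract targets for resource-constrained clients.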
