IMPROVING MODEL AND DATA EFFICIENCY FOR DEEP LEARNING
Abstract
Deep learning has achieved or even surpassed human-level performance on a wide range of challenging tasks spanning computer vision, natural language processing, and speech recognition. Nevertheless, these achievements are predominantly derived from training huge models (often with billions of parameters) on vast numbers of labeled examples, which requires considerable computational resources and costly data collection. Various studies have strived to enhance efficiency along both of these dimensions. In terms of model efficiency, remarkable advances have been made in accelerating training and inference through methods such as quantization and pruning. Regarding data efficiency, few-shot learning, semi-supervised learning, and self-supervised learning have attracted increasing attention for their ability to learn feature representations from few labeled examples or even without human supervision. This dissertation introduces several improvements to, and provides an in-depth analysis of, these methodologies, aiming to address the computational challenges and improve the efficiency of deep learning models, especially in computer vision.
In addressing model efficiency, we explore the potential for improvement in both the training and inference phases of deep learning. To accelerate model inference, we investigate the challenges of using extremely low-resolution arithmetic in quantization methods, where integer overflows occur frequently and models are sensitive to them. To address this issue, we introduce a novel module, designed to emulate the “wrap-around” property of integer overflow, which maintains comparable performance even with 8-bit low-resolution accumulators. In addition, to scale Vision Transformer inference to mobile devices, we propose an efficient and flexible local self-attention mechanism, optimized directly for mobile devices, that achieves performance comparable to global attention while significantly reducing on-device latency, especially for high-resolution tasks. Beyond computational cost, training deep neural networks consumes a large amount of memory, which is another bottleneck for training models on edge devices. To improve the memory efficiency of training deep networks on resource-limited devices, we propose a quantization-aware training framework for federated learning in which only the quantized model is distributed to and trained on the client devices.
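To make the wrap-around behavior concrete, the following is a minimal sketch, not the actual module from the dissertation, of how two's-complement overflow can be emulated when a dot product is accumulated in a narrow register. The helper names `wrap_to_int` and `low_bit_dot` and the specific bit widths are illustrative assumptions.

```python
import numpy as np

def wrap_to_int(x, bits):
    """Wrap an integer into the signed range of a `bits`-wide register,
    emulating two's-complement ("wrap-around") overflow."""
    lo, span = -(1 << (bits - 1)), 1 << bits
    return (x - lo) % span + lo

def low_bit_dot(w_q, x_q, acc_bits=8):
    """Dot product of two int8 vectors accumulated in a narrow
    `acc_bits`-wide accumulator, wrapping after every addition."""
    acc = 0
    for w, x in zip(w_q, x_q):
        acc = wrap_to_int(acc + int(w) * int(x), acc_bits)
    return acc

w_q = np.array([90, 100, -120], dtype=np.int8)
x_q = np.array([70, 80, 60], dtype=np.int8)
print(low_bit_dot(w_q, x_q))                             # wrapped 8-bit result
print(int(w_q.astype(np.int32) @ x_q.astype(np.int32)))  # true 32-bit sum
```

Exposing this wrapped arithmetic in the forward pass during training is what lets a quantized network adapt to the overflow behavior of narrow accumulators rather than being disrupted by it.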
In the realm of label efficiency, we first develop a better understanding of models trained with meta-learning, whose training pipeline is distinct from standard supervised learning, for few-shot classification tasks. In addition, we conduct a comprehensive analysis of how data augmentation strategies can be integrated into the meta-learning pipeline, leading to Meta-MaxUp, a novel data augmentation technique for meta-learning that demonstrates enhanced few-shot performance across various benchmarks. Beyond few-shot learning, we explore the application of meta-learning methods in the context of self-supervised learning. We discuss how, under a certain task distribution, meta-learning is closely related to contrastive learning, a method that achieves excellent results in self-supervised learning.
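As a rough illustration of that relationship, and not the dissertation's exact formulation, the sketch below treats a batch of N images as an N-way 1-shot episode: one augmented view of each image acts as the single support example of its "class" and the other view as the query, and nearest-prototype classification with cosine similarity reduces to an InfoNCE-style contrastive objective. The function name `contrastive_episode_loss` and the `encoder` interface are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def contrastive_episode_loss(encoder, views_a, views_b, temperature=0.1):
    """View a batch of N images as an N-way 1-shot episode.

    `views_a` supplies the support example (class prototype) for each of the
    N classes and `views_b` supplies one query per class; classifying queries
    against prototypes with cosine similarity and cross-entropy yields an
    InfoNCE-style contrastive loss."""
    support = F.normalize(encoder(views_a), dim=-1)   # (N, D) class prototypes
    query = F.normalize(encoder(views_b), dim=-1)     # (N, D) queries
    logits = query @ support.t() / temperature        # (N, N) cosine similarities
    labels = torch.arange(query.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)            # query i matches prototype i
```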