Deep-Learning Based Image Analysis on Resource-Constrained Systems
Abstract
In recent years, deep learning has achieved state-of-the-art performance on a wide variety of computer vision tasks. Among the different types of deep neural networks, convolutional neural networks (CNNs) are extensively studied and utilized for image analysis, as CNNs can effectively capture spatial and temporal dependencies in images. The growth in the amount of annotated image data and improvements in graphics processing units have driven the rapid rise in popularity of CNN-based image analysis systems. This growth in turn motivates investigation into the application of CNN-based deep learning to increasingly complex tasks, including an increasing variety of applications at the network edge.
The application of deep CNNs to novel edge applications involves two major challenges. First, in many emerging edge-based application areas, there is a lack of sufficient training data or an uneven class balance within the available datasets. Second, stringent implementation constraints, including constraints on real-time performance, memory requirements, and energy consumption, must be satisfied to enable practical deployment. In this thesis, we address these challenges in developing deep-CNN-based image analysis systems for deployment on resource-constrained devices at the network edge.
To address these challenges in medical image analysis, we first propose a methodology and tool for semi-automated training dataset generation in support of robust segmentation. The framework is developed to provide robust segmentation of surgical instruments using deep learning. We then address the problem of training dataset generation for real-time object tracking using a weakly supervised learning method. In particular, we present a weakly supervised method for surgical tool tracking based on a class of hybrid sensor systems that combines electromagnetic (EM) and vision-based modalities. Furthermore, we present a new framework for assessing the quality of nonrigid multimodality image registration in real time. Using an augmented dataset, we construct a solution that integrates multiple registration quality metrics into a single binary assessment of registration effectiveness as either high quality or low quality.

To address challenges in practical deployment, we present a deep-learning-based hyperspectral image (HSI) classification method that is designed for deployment on resource-constrained devices at the network edge. Due to the large volumes of data produced by HSI sensors and the complexity of deep neural network (DNN) architectures, developing DNN solutions for HSI classification on resource-constrained platforms is a challenging problem. In this part of the thesis, we introduce a novel approach that integrates DNN-based image analysis with discrete cosine transform (DCT) analysis for HSI classification.
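One common way to combine DCT analysis with a DNN for HSI classification is to apply the DCT along the spectral axis of each pixel and retain only a small number of low-frequency coefficients, shrinking the input volume before it reaches the network. The sketch below illustrates this idea only; the abstract does not specify the exact pipeline used in the thesis, and the array shapes and the number of retained coefficients here are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dct

# Illustrative hyperspectral cube: height x width x spectral bands.
# (Real HSI sensors often produce hundreds of contiguous bands.)
cube = np.random.rand(8, 8, 200)

# Apply the DCT along the spectral axis of every pixel, then keep
# only the first K low-frequency coefficients. This compresses the
# 200-band spectrum of each pixel to a K-dimensional feature vector
# that a compact DNN can classify on a resource-constrained device.
K = 16
coeffs = dct(cube, axis=-1, norm='ortho')
compressed = coeffs[..., :K]

print(compressed.shape)  # (8, 8, 16)
```

Because most of the spectral energy of natural materials concentrates in the low-frequency DCT coefficients, this kind of truncation can cut memory and compute requirements substantially while preserving the information most useful for classification.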
In addition to medical image processing and HSI classification, a third application area that we investigate in this thesis is on-board object detection from unmanned aerial vehicles (UAVs), which represents another important domain for the edge-based deployment of CNN methods. In this part of the thesis, we present a novel framework for object detection using images captured from UAVs. The framework is optimized using synthetic datasets generated with a game engine to capture imaging scenarios specific to UAV-based operating environments. Using the generated synthetic datasets, we develop new insights into the impact of different UAV-based imaging conditions on object detection performance.