Interpretable Deep Learning for Time Series

dc.contributor.advisor: Feizi, Soheil
dc.contributor.author: Ismail, Aya Abdelsalam
dc.contributor.department: Computer Science
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2022-09-27T05:38:36Z
dc.date.available: 2022-09-27T05:38:36Z
dc.date.issued: 2022
dc.description.abstract: Time series data emerge in applications across many critical domains, including neuroscience, medicine, finance, economics, and meteorology. However, practitioners in such fields are hesitant to use Deep Neural Networks (DNNs), which can be difficult to interpret. For example, in clinical research, one might ask, "Why did you predict this person as more likely to develop Alzheimer's disease?" As a result, research efforts to improve the interpretability of deep neural networks have increased significantly over the last few years. Nevertheless, these efforts have mainly targeted vision and language tasks, and their application to time series data remains relatively unexplored. This thesis aims to identify and address the limitations of interpretability of neural networks for time series data.

In the first part of this thesis, we extensively compare the performance of various interpretability methods (also known as saliency methods) across diverse neural architectures commonly used for time series, including Recurrent Neural Networks (RNNs), Temporal Convolutional Networks (TCNs), and Transformers, on a new benchmark of synthetic time series data. We propose and report multiple metrics that empirically evaluate, in terms of both precision and recall, how well interpretability methods detect feature importance over time. We find that existing network architectures and saliency methods fail to reliably and accurately identify feature importance over time. In RNNs, saliency vanishes over time, biasing the detection of salient features toward later time steps, so these models cannot reliably detect important features at arbitrary time intervals. Non-recurrent architectures, in turn, fail because they conflate the time and feature domains.

The second part of this thesis focuses on improving time series interpretability by enhancing neural architectures, saliency methods, and neural training procedures. [a] Enhancing neural architectures: to address the architectural limitations of recurrent networks, we design a novel RNN cell structure (input-cell attention) that preserves a direct gradient path from the input to the output at all time steps. As a result, explanations produced by an input-cell attention RNN can detect important features regardless of when they occur in time. In addition, we introduce a generalized framework, Interpretable Mixture of Experts (IME), that provides interpretability for structured data while preserving accuracy. IME is an inherently interpretable architecture, so the explanations it produces are exact descriptions of how each prediction is computed. [b] Enhancing saliency methods: we substantially improve the quality of time series saliency maps by disentangling time and feature importance through two-step Temporal Saliency Rescaling (TSR). [c] Enhancing neural training procedures: we introduce a saliency guided training procedure that reduces noisy gradients used in predictions, improving the quality of saliency maps while retaining the model's predictive performance.
dc.identifier: https://doi.org/10.13016/e3ur-eej2
dc.identifier.uri: http://hdl.handle.net/1903/29335
dc.language.iso: en
dc.subject.pqcontrolled: Computer science
dc.title: Interpretable Deep Learning for Time Series
dc.type: Dissertation
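
The abstract above describes two of the proposed techniques concretely enough to sketch: the two-step Temporal Saliency Rescaling (TSR) for saliency maps, and the saliency guided training procedure. The snippets below are minimal illustrative sketches only, written against an assumed PyTorch classifier over inputs shaped (batch, time, features); the function names, the zero-value masking, and the hyperparameters (alpha, mask_frac, lam) are assumptions for illustration and are not taken from the dissertation or its code.

```python
# Sketch of the two-step "rescale time, then features" idea behind TSR,
# assuming plain gradient saliency as the base explainer and zero-masking
# as the perturbation. All names here are illustrative.
import torch

def gradient_saliency(model, x, target):
    """|d f_target / d x| as a simple base saliency map."""
    x = x.clone().requires_grad_(True)
    model(x)[:, target].sum().backward()
    return x.grad.abs()

def temporal_saliency_rescaling(model, x, target, alpha=0.0):
    base = gradient_saliency(model, x, target)          # (B, T, F)
    B, T, F = x.shape
    # Step 1: time relevance = how much the saliency map changes when an
    # entire time step is masked out.
    time_scores = torch.zeros(B, T, device=x.device)
    for t in range(T):
        x_masked = x.clone()
        x_masked[:, t, :] = 0.0
        masked = gradient_saliency(model, x_masked, target)
        time_scores[:, t] = (base - masked).abs().sum(dim=(1, 2))
    # Step 2: within time steps whose relevance exceeds alpha, rescale the
    # per-feature saliency by the time relevance; other time steps get zero.
    rescaled = torch.zeros_like(base)
    for t in range(T):
        keep = (time_scores[:, t] > alpha).float()
        rescaled[:, t, :] = base[:, t, :] * (keep * time_scores[:, t]).unsqueeze(-1)
    return rescaled
```

In the same assumed setup, one way to read "saliency guided training" is a training step that masks the lowest-gradient input entries and adds a divergence term keeping predictions on the masked input close to those on the original input; the sketch below follows that reading and is not the dissertation's implementation.

```python
# Sketch of one saliency guided training step under the assumptions above.
import torch
import torch.nn.functional as F

def saliency_guided_step(model, x, y, optimizer, mask_frac=0.5, lam=1.0):
    # Gradient of the true-class logit w.r.t. the input as the saliency signal.
    x_req = x.clone().requires_grad_(True)
    logits = model(x_req)
    grad = torch.autograd.grad(logits.gather(1, y[:, None]).sum(), x_req)[0]

    # Zero out the mask_frac fraction of entries with the smallest |gradient|.
    flat_grad = grad.abs().flatten(1)
    k = int(mask_frac * flat_grad.shape[1])
    low_idx = flat_grad.topk(k, dim=1, largest=False).indices
    x_masked = x.clone().flatten(1)
    x_masked.scatter_(1, low_idx, 0.0)
    x_masked = x_masked.view_as(x)

    # Standard classification loss plus a KL term tying the two predictions.
    optimizer.zero_grad()
    out = model(x)
    out_masked = model(x_masked)
    loss = F.cross_entropy(out, y) + lam * F.kl_div(
        F.log_softmax(out_masked, dim=1), F.softmax(out, dim=1),
        reduction="batchmean")
    loss.backward()
    optimizer.step()
    return loss.item()
```

Both sketches trade fidelity for brevity: the real methods evaluated in the dissertation may differ in the base saliency method, the masking value, and the exact loss or rescaling formulation.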

Files

Original bundle

Name: Ismail_umd_0117E_22785.pdf
Size: 34.47 MB
Format: Adobe Portable Document Format