Estimation of the Temporal Response Function and Tracking Selective Auditory Attention using Deep Kalman Filter

Cao, Yexin

dc.contributor.advisor	Babadi, Behtash	en_US
dc.contributor.author	Cao, Yexin	en_US
dc.contributor.department	Electrical Engineering	en_US
dc.contributor.publisher	Digital Repository at the University of Maryland	en_US
dc.contributor.publisher	University of Maryland (College Park, Md.)	en_US
dc.date.accessioned	2020-10-10T05:40:23Z
dc.date.available	2020-10-10T05:40:23Z
dc.date.issued	2020	en_US
dc.description.abstract	The cocktail party effect refers to the ability of people to focus on a single sound source in a noisy environment where multiple speakers talk at the same time. This effect reflects the human brain's capacity for selective auditory attention, whose decoding from non-invasive electroencephalography (EEG) or magnetoencephalography (MEG) recordings has recently been a topic of active research. The mapping between auditory stimuli and their neural responses can be characterized by auditory temporal response functions (TRFs). It has been shown that TRF estimates derived from the envelopes of speech streams and the corresponding auditory neural responses can be used to make predictions that discriminate between attended and unattended speakers. Previous research has adopted l_1-regularized least squares estimation for the linear TRF model. However, most real-world systems exhibit a degree of non-linearity, so new models are needed for complex, realistic auditory environments. In this thesis, we propose to estimate TRFs with the deep Kalman filter model for cases where the observations are a noisy, non-linear function of the latent states. The deep Kalman filter (DKF) algorithm is derived using techniques from variational inference. Replacing all the linear transformations in the classic Kalman filter model with non-linear transformations makes the posterior distribution intractable to compute. Thus, a recognition network is introduced to approximate the intractable posterior and to optimize the variational lower bound of the objective function. We implemented the deep Kalman filter model with a two-layer bidirectional LSTM and an MLP. The performance is first evaluated by applying our algorithm to simulated MEG data.
In addition, we combined the new model for TRF estimation with a previously proposed framework, replacing the dynamic encoding/decoding module in that framework with a deep Kalman filter to conduct real-time tracking of selective auditory attention. The performance of the resulting framework is validated by applying it to simulated EEG data.	en_US
dc.identifier	https://doi.org/10.13016/brke-732f
dc.identifier.uri	http://hdl.handle.net/1903/26648
dc.language.iso	en	en_US
dc.subject.pqcontrolled	Electrical engineering	en_US
dc.title	Estimation of the Temporal Response Function and Tracking Selective Auditory Attention using Deep Kalman Filter	en_US
dc.type	Thesis	en_US
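
The abstract cites l_1-regularized least squares as the baseline for estimating a linear TRF. The sketch below illustrates that baseline only, not the thesis's deep Kalman filter, and is not taken from the thesis. It assumes a single-channel response modeled as the stimulus envelope convolved with a sparse TRF plus noise, and it uses the iterative soft-thresholding algorithm (ISTA) as one possible solver. All function names, the lag count, the penalty `lam`, and the simulated data are hypothetical choices made for illustration.

```python
import numpy as np

def lagged_design(envelope, n_lags):
    """Build the design matrix of delayed envelopes: X[t, k] = envelope[t - k]."""
    n = len(envelope)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = envelope[:n - k]
    return X

def soft_threshold(v, thr):
    """Proximal operator of the l_1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

def estimate_trf_l1(envelope, response, n_lags, lam, n_iter=500):
    """Minimize 0.5 * ||response - X w||^2 + lam * ||w||_1 with ISTA."""
    X = lagged_design(envelope, n_lags)
    # Step size 1/L, where L is the squared spectral norm of X.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(n_lags)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - response)
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Hypothetical simulated data: white-noise "envelope" and a sparse true TRF.
rng = np.random.default_rng(0)
envelope = rng.standard_normal(2000)
true_trf = np.zeros(20)
true_trf[[3, 8]] = [1.0, -0.5]
response = lagged_design(envelope, 20) @ true_trf + 0.1 * rng.standard_normal(2000)

trf_hat = estimate_trf_l1(envelope, response, n_lags=20, lam=5.0)
```

In this linear setting the estimate recovers the two non-zero lags of the simulated TRF closely. The non-linear observation model described in the abstract is the case this baseline does not cover.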

Files

Original bundle

Name: Cao_umd_0117N_21183.pdf
Size: 4.84 MB
Format: Adobe Portable Document Format

Collections

UMD Theses and Dissertations
Electrical & Computer Engineering Theses and Dissertations