Decoding the Brain in Complex Auditory Environments

Humans have an exceptional ability to engage with sequences of sounds and extract meaningful information from them. We can appreciate music or absorb speech in conversation in a way no other species on the planet can. Yet it remains unclear exactly how the brain processes these rapidly changing, complex soundscapes so effortlessly. This dissertation explores the neural mechanisms underlying these remarkable abilities, expanding our knowledge of human cognition in ways that bear on numerous clinical and engineering applications.

Brain-imaging techniques provide a powerful window into the content and dynamics of mental representations. Non-invasive modalities such as electroencephalography (EEG) and magnetoencephalography (MEG) offer a fine-grained, time-resolved record of the sequence of brain activity. The analysis of these signals can be enhanced with temporal decoding methods, which offer vast and largely untapped potential for determining how mental representations unfold over time. In this thesis, we apply these decoding techniques, along with a series of novel experimental paradigms, to EEG and MEG signals to investigate the neural mechanisms of auditory processing in the human brain, ranging from the neural representation of acoustic features to higher-level cognition, such as music perception and speech imagery.
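As a rough illustration of what temporal decoding means in practice (this is not the dissertation's code; the synthetic data, the nearest-class-mean classifier, and all parameter names below are our own assumptions), one can train a simple classifier independently at each time point of epoched EEG/MEG data and track when experimental conditions become discriminable:

```python
import numpy as np

def decode_over_time(X, y, train_frac=0.8, seed=0):
    """Time-resolved decoding sketch: fit a nearest-class-mean
    classifier independently at each time point and return the
    test-set accuracy as a function of time.

    X: (n_trials, n_channels, n_times) epoched EEG/MEG data
    y: (n_trials,) binary condition labels (0 or 1)
    """
    rng = np.random.default_rng(seed)
    n_trials, _, n_times = X.shape
    order = rng.permutation(n_trials)
    n_train = int(train_frac * n_trials)
    tr, te = order[:n_train], order[n_train:]

    acc = np.empty(n_times)
    for t in range(n_times):
        Xtr, Xte = X[tr, :, t], X[te, :, t]
        mu0 = Xtr[y[tr] == 0].mean(axis=0)  # class-0 mean sensor pattern
        mu1 = Xtr[y[tr] == 1].mean(axis=0)  # class-1 mean sensor pattern
        # assign each held-out trial to the nearer class mean
        d0 = np.linalg.norm(Xte - mu0, axis=1)
        d1 = np.linalg.norm(Xte - mu1, axis=1)
        acc[t] = np.mean((d1 < d0).astype(int) == y[te])
    return acc
```

The accuracy time course stays at chance until the time points where the conditions actually differ, which is how such decoders reveal when a representation emerges.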

First, we report our findings on the role of temporal coherence in auditory source segregation. We show that a target sound source can be perceptually segregated from a complex acoustic background only if its acoustic features (e.g., pitch, location, and timbre) induce temporally modulated neural responses that are mutually correlated. Using EEG, we measured the neural responses to the individual acoustic features in complex sound mixtures and decoded the effect of attention on these responses. We found that attention and coherent temporal modulation of the target's acoustic features are the key factors that bind those features together and allow the target to emerge as the foreground sound source.
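The notion of mutually correlated feature responses can be made concrete with a toy coherence index (a minimal sketch under our own simplified assumptions, not the analysis used in the dissertation): the mean pairwise Pearson correlation between the response time courses evoked by each acoustic feature.

```python
import numpy as np

def temporal_coherence(responses):
    """Mean pairwise Pearson correlation across feature-evoked
    response time courses -- a toy temporal-coherence index.

    responses: (n_features, n_times) array, one row per acoustic
    feature (e.g., pitch, location, timbre responses).
    """
    C = np.corrcoef(responses)          # (n_features, n_features)
    iu = np.triu_indices_from(C, k=1)   # upper triangle, off-diagonal
    return C[iu].mean()
```

Feature responses sharing a common temporal modulation (e.g., the same 4 Hz envelope plus noise) yield an index near 1 and would bind into one perceived source; independently fluctuating responses yield an index near 0.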

Next, we explore how the brain learns the statistical structure of sound sequences in different musical contexts. The ability to detect probabilistic patterns is central to many aspects of human cognition, from auditory perception to the enjoyment of music. We generated artificial melodies derived from uniform or non-uniform musical scales, collected EEG signals, and decoded the neural responses to tones occurring with different transition probabilities. We observed that listeners' brains learned the melodies' statistical structure only when the melodies were derived from non-uniform scales.
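To make transition probabilities concrete, here is a minimal sketch (our own illustration; the tone alphabet, matrix values, and the surprisal measure are assumptions, not the dissertation's stimuli) of sampling a melody from a first-order Markov chain and computing the information content of each transition:

```python
import numpy as np

def generate_melody(T, length, seed=0):
    """Sample a tone sequence from a first-order Markov chain.

    T: (n_tones, n_tones) row-stochastic transition matrix, where
       T[i, j] is the probability that tone j follows tone i.
    """
    rng = np.random.default_rng(seed)
    n = T.shape[0]
    seq = [rng.integers(n)]                      # random starting tone
    for _ in range(length - 1):
        seq.append(rng.choice(n, p=T[seq[-1]]))  # next tone given current
    return np.array(seq)

def surprisal(seq, T):
    """Information content, -log2 p, of each transition in the melody."""
    return -np.log2(T[seq[:-1], seq[1:]])
```

A uniform transition matrix makes every tone equally surprising, so there is no structure to learn; a non-uniform matrix gives some transitions low surprisal and others high, and it is this variation that a statistically learning brain can exploit.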

Finally, we investigate brain processing during speech and music imagery, with a view toward Brain-Computer Interface (BCI) applications. We developed an encoder-decoder neural network architecture that learns a transformation between neural responses to listened and imagined sounds. Using this mapping, we reliably reconstructed imagery signals, which can serve as templates for decoding actual imagined neural activity. This held even when the model was generalized to unseen data from an unseen subject. Decoding these predicted signals, we identified the imagined segment with remarkable accuracy.
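As a highly simplified stand-in for the encoder-decoder network (we substitute a linear ridge-regression map purely for illustration; the function names and data shapes are our own assumptions, not the dissertation's architecture), the listened-to-imagined transformation and template-based identification might be sketched as:

```python
import numpy as np

def fit_listen_to_imagine_map(L, I, lam=1e-2):
    """Ridge regression mapping listened responses L to imagined
    responses I -- a linear stand-in for the encoder-decoder net.

    L, I: (n_samples, n_features) arrays of neural responses.
    Returns W such that L @ W approximates I.
    """
    A = L.T @ L + lam * np.eye(L.shape[1])  # regularized normal equations
    return np.linalg.solve(A, L.T @ I)

def identify_segment(pred, templates):
    """Return the index of the template most correlated with the
    predicted imagery response."""
    scores = [np.corrcoef(pred, tpl)[0, 1] for tpl in templates]
    return int(np.argmax(scores))
```

Once the map is fit, a listened response to a new segment can be pushed through it to predict the corresponding imagery signal, which is then matched against candidate templates; the hard part in practice, as the dissertation emphasizes, is making such a map generalize to unseen data and unseen subjects.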