UMD Theses and Dissertations
Permanent URI for this collection: http://hdl.handle.net/1903/3
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date, so a given thesis or dissertation may take up to four months to appear in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
Search Results (3 items)
Item: Neuromorphic model for sound source segregation (2015)
Krishnan, Lakshmi; Shamma, Shihab; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

While humans can easily segregate and track a speaker's voice in a loud, noisy environment, most modern speech recognition systems still perform poorly in loud background noise. The computational principles behind auditory source segregation in humans are not yet fully understood. In this dissertation, we develop a computational model for source segregation inspired by auditory processing in the brain. To support the key principles behind the model, we conduct a series of electroencephalography (EEG) experiments using both simple tone-based stimuli and more natural speech stimuli. Most source segregation algorithms use some form of prior information about the target speaker or rely on more than one simultaneous recording of the noisy speech mixture; other methods build models of the noise characteristics. Segregating simultaneous speech mixtures from a single microphone recording, with no knowledge of the target speaker, remains a challenge. Using the principle of temporal coherence, we develop a novel computational model that exploits differences in the temporal evolution of features belonging to different sources to perform unsupervised monaural source segregation. While this method uses no prior information about the target speaker, it can gracefully incorporate knowledge about the target speaker to further enhance the segregation.

Through a series of EEG experiments we collect neurological evidence to support the principle behind the model. Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses about the physiological mechanisms underlying the remarkable perceptual ability of humans to segregate acoustic sources, and about its psychophysical manifestations in navigating complex sensory environments. Results from the EEG experiments provide further insight into the assumptions behind the model and motivate future single-unit studies that could provide more direct evidence for the principle of temporal coherence.
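To make the temporal-coherence principle concrete, here is a minimal toy sketch (not code from the dissertation; the greedy clustering rule, the 0.6 threshold, and the synthetic envelopes are all illustrative assumptions). It groups filterbank channels by correlating their slow envelope modulations, so channels whose envelopes rise and fall together are assigned to the same putative source:

    import numpy as np

    def temporal_coherence_grouping(channel_envelopes, threshold=0.6):
        # channel_envelopes: (n_channels, n_frames) slow envelopes, one row
        # per frequency channel. Channels whose envelopes are strongly
        # correlated are greedily merged into the same source group.
        n_channels = channel_envelopes.shape[0]
        coherence = np.corrcoef(channel_envelopes)  # pairwise envelope correlation
        labels = -np.ones(n_channels, dtype=int)
        next_label = 0
        for ch in range(n_channels):
            if labels[ch] >= 0:
                continue
            labels[ch] = next_label
            for other in range(ch + 1, n_channels):
                if labels[other] < 0 and coherence[ch, other] > threshold:
                    labels[other] = next_label
            next_label += 1
        return labels

    # Two synthetic "sources" with distinct 4 Hz and 7 Hz envelope rhythms
    t = np.linspace(0, 1, 200)
    src_a = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))
    src_b = 0.5 * (1 + np.sin(2 * np.pi * 7 * t))
    rng = np.random.default_rng(0)
    envelopes = np.stack([src_a, src_a, src_b, src_b])
    envelopes += 0.05 * rng.standard_normal(envelopes.shape)
    print(temporal_coherence_grouping(envelopes))  # expected grouping: [0 0 1 1]

A real system would operate on a richer multi-resolution feature representation rather than raw envelopes, but grouping features by the coherence of their temporal evolution is the core idea.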
Item: Temporal coding of speech in human auditory cortex (2012)
Ding, Nai; Simon, Jonathan Z.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Human listeners can reliably recognize speech in complex listening environments. The underlying neural mechanisms, however, remain unclear and cannot yet be emulated by any artificial system. In this dissertation, we study how speech is represented in the human auditory cortex and how that neural representation contributes to reliable speech recognition. Cortical activity from normal-hearing human subjects is recorded noninvasively using magnetoencephalography (MEG) during natural speech listening. We first demonstrate that neural activity from auditory cortex is precisely synchronized to the slow temporal modulations of speech when the speech signal is presented in a quiet listening environment. We then investigate how this neural representation is affected by acoustic interference. Acoustic interference degrades speech perception via two mechanisms, informational masking and energetic masking, which are addressed respectively by using a competing speech stream and stationary noise as the interfering sound.

When two speech streams are presented simultaneously, cortical activity is predominantly synchronized to the speech stream the listener attends to, even if the unattended, competing speech stream is 8 dB more intense. When speech is presented together with spectrally matched stationary noise, cortical activity remains precisely synchronized to the temporal modulations of speech until the noise is 9 dB more intense. Critically, the accuracy of neural synchronization to speech predicts how well individual listeners can understand speech in noise. Further analysis reveals that two neural sources contribute to speech-synchronized cortical activity, one with a shorter response latency of about 50 ms and the other with a longer response latency of about 100 ms. The longer-latency component, but not the shorter-latency component, shows selectivity to the attended speech and invariance to background noise, indicating a transition within auditory cortex from encoding the acoustic scene to encoding the behaviorally important auditory object. Taken together, these results demonstrate that during natural speech comprehension, neural activity in the human auditory cortex is precisely synchronized to the slow temporal modulations of speech. This synchronization is robust to acoustic interference, whether speech or noise, and is therefore a strong candidate for the neural basis of background-invariant speech recognition.
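As a loose illustration of how envelope synchronization and response latency can be read out (a toy sketch under assumed signals, not the MEG analysis pipeline from the dissertation), the snippet below cross-correlates a simulated slow "speech envelope" with a simulated cortical response that tracks it at about 100 ms:

    import numpy as np

    fs = 200                       # sampling rate (Hz) of envelope and response
    t = np.arange(0, 10, 1 / fs)   # 10 s of data

    # Assumed stand-ins: a slow fluctuating "speech envelope" and a noisy
    # response that follows it at a 100 ms latency (circular shift is
    # acceptable for a toy example).
    rng = np.random.default_rng(1)
    envelope = np.abs(np.convolve(rng.standard_normal(t.size),
                                  np.hanning(50), mode="same"))
    latency_samples = int(0.1 * fs)  # 100 ms
    response = np.roll(envelope, latency_samples)
    response += 0.5 * rng.standard_normal(t.size)

    # Correlate response with envelope over candidate latencies (0-250 ms)
    lags = np.arange(0, int(0.25 * fs))
    corrs = [np.corrcoef(envelope[: t.size - lag], response[lag:])[0, 1]
             for lag in lags]
    best = int(np.argmax(corrs))
    print(f"best latency = {1000 * best / fs:.0f} ms, r = {corrs[best]:.2f}")

The printed latency recovers the simulated 100 ms delay; in the dissertation the analogous quantity is estimated from recorded MEG responses, and its accuracy across listeners predicts speech-in-noise intelligibility.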
Item: Peripheral neural coding strategies for spectral analysis and sound source location in the non-teleost bony fish, Acipenser fulvescens (2008-04-26)
Meyer, Michaela; Popper, Arthur N.; Fay, Richard R.; Biology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

This work investigates coding strategies for spectral analysis and sound source location in Acipenser fulvescens, the lake sturgeon. A. fulvescens belongs to one of the few extant groups of non-teleost ray-finned fishes, and the sturgeon family (Acipenseridae) has a phylogenetic history dating back about 200 million years. No studies on sensory coding had previously been conducted in any species of this family, or in any other non-teleost species; this is therefore the first study of peripheral coding strategies in the auditory system of a non-teleost bony fish. A shaker system, similar to those used in previous experiments on teleosts, was used to simulate the particle motion of sound at the ears and auditory periphery of A. fulvescens, while electrophysiological recordings of isolated single units were obtained from the eighth nerve. Peripheral coding strategies for spectral analysis and sound source location in A. fulvescens resembled those found in teleosts. Frequency responses resembled the characteristics of auditory afferents in land vertebrates, with preferences for lower frequencies. In addition, the auditory periphery of A. fulvescens appears to be well suited to encode the intensity of sound. In terms of mechanisms for sound source location, eighth nerve afferents responded to directional stimuli in a cosine-like manner (as in teleosts), which can generally serve as the basis for coding the location of a sound source. Certain differences from teleosts were also found in A. fulvescens, and these may have implications for the mechanisms of sound source location in azimuth. The physiological characteristics common to A. fulvescens, teleosts, and land vertebrates may reflect important functions of the auditory system, part of the process of auditory scene analysis, that have been conserved throughout vertebrate evolution.
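The cosine-like directional response mentioned above is what hair-cell directionality predicts. As a purely illustrative sketch (the simulated rates and the axis-estimation trick are assumptions, not the dissertation's methods or data), the snippet below recovers an afferent's best axis from rectified-cosine firing rates measured at evenly spaced stimulus directions:

    import numpy as np

    # Simulated firing rates of one eighth-nerve afferent at stimulus
    # directions spaced every 30 degrees, following |cos| tuning around a
    # true best axis of 60 degrees. All numbers are made up.
    rng = np.random.default_rng(2)
    directions = np.deg2rad(np.arange(0, 360, 30))
    true_best = np.deg2rad(60)
    rates = 40 * np.abs(np.cos(directions - true_best))
    rates += rng.normal(0, 2, directions.size)

    # Squaring a rectified cosine leaves a pure second harmonic in direction,
    # so the best axis (180-degree periodic) falls out of one complex sum.
    second_harmonic = np.sum(rates**2 * np.exp(2j * directions))
    best_axis = 0.5 * np.angle(second_harmonic)
    print(f"estimated best axis: {np.rad2deg(best_axis) % 180:.1f} deg")  # ~60 deg

Because each afferent only reports the projection of particle motion onto its own axis, a population of afferents with different best axes is what can, in principle, encode source direction, which is why cosine-like tuning matters for localization.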