Single-Microphone Speech Enhancement Inspired by Auditory System
dc.contributor.advisor | Shamma, Shihab | en_US |
dc.contributor.author | Mirbagheri, Majid | en_US |
dc.contributor.department | Electrical Engineering | en_US |
dc.contributor.publisher | Digital Repository at the University of Maryland | en_US |
dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US |
dc.date.accessioned | 2014-10-16T05:37:44Z | |
dc.date.available | 2014-10-16T05:37:44Z | |
dc.date.issued | 2014 | en_US |
dc.description.abstract | Enhancing the quality of speech in noisy environments has been an active area of research, owing to the abundance of applications that deal with the human voice and whose performance depends on that quality. While early approaches addressed the problem in a purely statistical framework, in which the goal was to estimate speech from its sum with other independent processes (noise), over the last decade the attention of the scientific community has turned to the functionality of the human auditory system. Considerable effort has gone into bridging the gap between the performance of speech processing algorithms and that of the average human listener by borrowing models proposed for sound processing in the auditory system. In this thesis, we introduce speech enhancement algorithms inspired by two of these models: the cortical representation of sounds and the hypothesized role of temporal coherence in auditory scene analysis. After an introduction to the auditory system and the speech enhancement framework, we first show how traditional speech enhancement techniques such as Wiener filtering can benefit at the feature extraction level from the discriminative capabilities of the spectro-temporal representation of sounds in the cortex, i.e., the cortical model. We next focus on feature processing, as opposed to the extraction stage, in speech enhancement systems by taking advantage of models hypothesized for human attention in sound segregation. We demonstrate a mask-based enhancement method in which the temporal coherence of features serves as a criterion to elicit information about their sources and, more specifically, to form the masks needed to suppress the noise. Lastly, we explore how the two blocks for feature extraction and manipulation can be merged into one in a manner consistent with our knowledge of the auditory system. We do this through regularized non-negative matrix factorization, which optimizes the feature extraction while simultaneously accounting for temporal dynamics to separate noise from speech. | en_US |
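The abstract's baseline, Wiener filtering, applies a per-frequency-bin gain G = SNR / (1 + SNR) to the noisy spectrum. The sketch below is a minimal NumPy illustration of that classical gain rule, not code from the thesis; the function name, the spectral-subtraction SNR estimate, and the `floor` parameter are all illustrative assumptions.

```python
import numpy as np

def wiener_gain(noisy_psd, noise_psd, floor=1e-3):
    """Per-bin Wiener gain G = snr / (1 + snr), with the a priori SNR
    approximated by max(noisy/noise - 1, floor), a simple
    spectral-subtraction estimate. Illustrative only."""
    snr = np.maximum(noisy_psd / np.maximum(noise_psd, 1e-12) - 1.0, floor)
    return snr / (1.0 + snr)

# Toy 4-bin example: bins 0-1 are speech-dominated, bins 2-3 noise-dominated.
noisy = np.array([10.0, 8.0, 1.1, 1.0])   # noisy power spectrum
noise = np.array([1.0, 1.0, 1.0, 1.0])    # estimated noise power spectrum
g = wiener_gain(noisy, noise)
enhanced = g * noisy  # speech-dominated bins pass nearly unchanged;
                      # noise-dominated bins are strongly attenuated
```

In a full system the gain would be computed frame by frame on a short-time spectral (or, as in this thesis, cortical) representation rather than on a single static spectrum.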
dc.identifier | https://doi.org/10.13016/M2WP44 | |
dc.identifier.uri | http://hdl.handle.net/1903/15905 | |
dc.language.iso | en | en_US |
dc.subject.pqcontrolled | Electrical engineering | en_US |
dc.subject.pquncontrolled | Attention | en_US |
dc.subject.pquncontrolled | Auditory Scene Analysis | en_US |
dc.subject.pquncontrolled | Computational Neuroscience | en_US |
dc.subject.pquncontrolled | Noise Suppression | en_US |
dc.subject.pquncontrolled | Sound Segregation | en_US |
dc.subject.pquncontrolled | Speech Enhancement | en_US |
dc.title | Single-Microphone Speech Enhancement Inspired by Auditory System | en_US |
dc.type | Dissertation | en_US |
Files
Original bundle
- Name: Mirbagheri_umd_0117E_15547.pdf
- Size: 6.13 MB
- Format: Adobe Portable Document Format