Comparison Studies of Several Microphone Robustness Techniques

Files
TR_94-30.pdf(233.04 KB)
Date
1994
Authors
Sonmez, M.K.
Kao, Yu-Hung
Rajasekaran, P.K.
Baras, John S.
Abstract
We study the effectiveness of various microphone robustness techniques from the viewpoint of speech recognition, using the ARPA-sponsored Wall Street Journal (WSJ) database [1]. Two of the techniques considered are introduced in this paper: two cepstral normalization algorithms based on the artificial neural network techniques Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ). The resulting algorithms are low-complexity, non-parametric counterparts of the parametric approaches Codeword-dependent Cepstral Normalization (CDCN) and Fixed CDCN (FCDCN). The other techniques considered are Cepstral Mean Normalization (CMN), RASTA, SNR-dependent Cepstral Normalization (SDCN), Interpolated SDCN (ISDCN), CDCN, and FCDCN. Some of these techniques require one or more of the following: stereo data, an SNR estimate, single-microphone data for adaptation, and knowledge of the microphone used for the specific data under test. We assess effectiveness in three ways: (i) a scattergram plot of the speech frame parameter vector (usually a cepstral vector), (ii) the adjusted deviation ratio, measured from the scattergram, and (iii) the correctness of classifying a test vector into a vector codebook. All of these measures correlate directly with speech recognition performance, which will be measured in experiments yet to be conducted.
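As a rough illustration of the simplest technique named in the abstract, the sketch below shows textbook Cepstral Mean Normalization together with a nearest-codeword lookup of the kind that measure (iii) appears to rely on. It is not taken from the report: the function names, the squared Euclidean distance, the toy codebook, and the reading of "correctness" as agreement between paired frames are all assumptions made for the example.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-utterance mean from every cepstral frame (textbook CMN)."""
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def nearest_codeword(vector, codebook):
    """Index of the codeword closest to `vector` (squared Euclidean distance, assumed)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))


def codebook_agreement(reference_frames, test_frames, codebook):
    """Fraction of paired frames assigned to the same codeword.

    Hypothetical stand-in for the report's classification measure; the
    pairing assumes stereo (simultaneously recorded) data.
    """
    pairs = list(zip(reference_frames, test_frames))
    if not pairs:
        return 0.0
    same = sum(
        nearest_codeword(r, codebook) == nearest_codeword(t, codebook)
        for r, t in pairs
    )
    return same / len(pairs)


# Toy data: the second "microphone" adds a constant offset to each cepstral
# coefficient, which is how a fixed linear channel shows up in the cepstrum.
clean = [[1.0, 2.0], [3.0, 4.0]]
other_mic = [[1.5, 1.0], [3.5, 3.0]]

clean_cmn = cepstral_mean_normalize(clean)
other_cmn = cepstral_mean_normalize(other_mic)
toy_codebook = [[-1.0, -1.0], [1.0, 1.0]]
agreement = codebook_agreement(clean_cmn, other_cmn, toy_codebook)
```

Because the toy channel effect is a constant cepstral offset, mean subtraction removes it entirely and the two normalized utterances coincide. Real microphone mismatch also involves additive noise, which is presumably why the report considers SNR-dependent and codeword-dependent methods as well.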