Comparison Studies of Several Microphone Robustness Techniques
We study the effectiveness of various microphone robustness techniques from the viewpoint of speech recognition, using the ARPA-sponsored Wall Street Journal (WSJ) database. Two of the techniques are introduced in this paper: cepstral normalization algorithms based on the artificial neural network techniques Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ). The resulting algorithms are low-complexity, non-parametric counterparts of the parametric approaches Codeword-dependent Cepstral Normalization (CDCN) and Fixed CDCN (FCDCN). The other techniques considered are Cepstral Mean Normalization (CMN), RASTA, SNR-dependent Cepstral Normalization (SDCN), Interpolated SDCN (ISDCN), CDCN, and FCDCN. Some of these techniques require one or more of the following: stereo data, an SNR estimate, single-microphone data for adaptation, and knowledge of the microphone used for the specific data under test. We assess effectiveness in three ways: (i) a scattergram plot of the speech frame parameter vector (usually a cepstral vector), (ii) the adjusted deviation ratio, measured from the scattergram, and (iii) the correctness of classifying a test vector into a vector codebook. All of these measures correlate directly with speech recognition performance, which we will measure in experiments yet to be conducted.
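As an illustration of the simplest technique listed above, the following minimal sketch applies Cepstral Mean Normalization to a toy utterance and then checks codebook classification in the spirit of measure (iii). It is not the paper's implementation: the function names, the toy frames, the four-entry codebook, and the use of squared Euclidean distance are all assumptions made for illustration. A microphone mismatch is modeled as a constant additive offset in the cepstral domain, which is the case CMN is designed to remove.

```python
from typing import List, Sequence

Frame = List[float]


def cepstral_mean_normalize(frames: Sequence[Sequence[float]]) -> List[Frame]:
    """Subtract the per-utterance mean from every cepstral frame (CMN)."""
    if not frames:
        return []
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    return [[f[k] - mean[k] for k in range(dim)] for f in frames]


def nearest_codeword(frame: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codeword closest to `frame` (squared Euclidean distance,
    an assumed distance measure)."""
    def dist2(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(frame, codebook[i]))


def classification_agreement(reference: Sequence[Sequence[float]],
                             test: Sequence[Sequence[float]],
                             codebook: Sequence[Sequence[float]]) -> float:
    """Fraction of paired frames assigned to the same codeword.
    This is a hypothetical stand-in for the paper's classification measure."""
    same = sum(nearest_codeword(r, codebook) == nearest_codeword(t, codebook)
               for r, t in zip(reference, test))
    return same / len(reference)


# Toy two-dimensional "cepstral" frames from a reference microphone (assumed data).
clean = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
# The same frames through a second microphone, modeled as a constant offset.
offset = [3.0, -1.0]
other = [[c + o for c, o in zip(frame, offset)] for frame in clean]
# Hypothetical codebook trained on the reference microphone.
codebook = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]

agreement_raw = classification_agreement(clean, other, codebook)
agreement_cmn = classification_agreement(
    cepstral_mean_normalize(clean), cepstral_mean_normalize(other), codebook)
```

On this toy data, `agreement_raw` is 0.5 and `agreement_cmn` is 1.0, because a constant offset is removed exactly by mean subtraction. Real microphone mismatch also involves additive noise and other effects that are not a constant cepstral offset, and those are what the SNR-dependent and codeword-dependent methods listed above are meant to address.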