Application of Auditory Representations on Speaker Identification

Files

MS_97-9.pdf (935.51 KB)

Date

1997

Abstract

The noise robustness of the auditory spectrum and the cortical representation is examined by applying them to text-independent speaker identification tasks. A Bayes classifier based on an M-ary hypothesis test is used to evaluate the robustness of the auditory cepstrum and to demonstrate its superior performance over the well-studied mel-cepstrum. In addition, the phase feature of the wavelet-transform-based multiscale cortical representation is shown to be much more stable than the magnitude feature in characterizing speakers with a correlator technique, which is traditionally used in scene-matching applications. This observation is consistent with physiological and psychoacoustic phenomena. The underlying purpose of this study is to examine the inherent robustness of auditory representations derived from a human perception-based model. The experimental results indicate that biologically motivated features significantly enhance speaker identification accuracy in noisy environments.
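
The M-ary Bayes decision mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the class-conditional densities are modelled, so everything below is an assumption made for illustration: each speaker is represented by a diagonal-covariance Gaussian over feature vectors, frames are treated as independent, priors are equal, and the "cepstral" frames are synthetic random numbers rather than real auditory or mel-cepstrum features. The function names (`fit_diag_gaussian`, `log_likelihood`, `identify`, `synth_frames`) are hypothetical.

```python
import math
import random


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance to avoid division by zero on degenerate data.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def log_likelihood(frames, model):
    """Sum of per-frame log densities under a diagonal Gaussian (frames assumed independent)."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def identify(frames, models, priors=None):
    """M-ary Bayes decision: choose the speaker maximizing log prior + log likelihood."""
    speakers = list(models)
    if priors is None:
        # Assumption: equal priors over the M enrolled speakers.
        priors = {s: 1.0 / len(speakers) for s in speakers}
    scores = {s: math.log(priors[s]) + log_likelihood(frames, models[s])
              for s in speakers}
    return max(scores, key=scores.get), scores


def synth_frames(rng, center, n=200, dim=4, sigma=0.5):
    """Synthetic stand-in for cepstral frames: Gaussian noise around a speaker-specific center."""
    return [[rng.gauss(center, sigma) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
train = {
    "spk_a": synth_frames(rng, -1.0),
    "spk_b": synth_frames(rng, 0.0),
    "spk_c": synth_frames(rng, 1.0),
}
models = {s: fit_diag_gaussian(f) for s, f in train.items()}

# Unseen utterance drawn from the same distribution as spk_c.
test_utt = synth_frames(rng, 1.0, n=50)
best, scores = identify(test_utt, models)
print(best)
```

With equal priors the rule reduces to a maximum-likelihood choice among the M speaker hypotheses. In a noise-robustness experiment of the kind the abstract describes, the same decision rule would be run on features extracted from clean and noise-corrupted speech, and the identification rates compared across feature types.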