Application of Auditory Representations on Speaker Identification

Chi, Taishih

Files

MS_97-9.pdf (935.51 KB)

Date

1997

Authors

Chi, Taishih

Advisor

Shamma, S. A.

Abstract

The noise robustness of the auditory spectrum and the cortical representation is examined by applying them to text-independent speaker identification tasks. A Bayes classifier based on an M-ary hypothesis test is employed to evaluate the robustness of the auditory cepstrum and to demonstrate its superior performance over the well-studied mel-cepstrum. In addition, the phase feature of the wavelet-transform-based multiscale cortical representation is shown to be much more stable than the magnitude feature in characterizing speakers with a correlator technique, which is traditionally used in scene-matching applications. This observation is consistent with physiological and psychoacoustic phenomena. The underlying purpose of this study is to inspect the inherent robustness of auditory representations derived from a model based on human perception. The experimental results indicate that biologically motivated features significantly enhance speaker identification accuracy in noisy environments.
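
As an illustration of the M-ary Bayes decision rule mentioned in the abstract, the following is a minimal sketch and not the report's implementation. It assumes each speaker is modeled by a single diagonal-covariance Gaussian over feature frames, that frames are independent, and that priors are equal; the function names and the synthetic two-dimensional "features" are hypothetical stand-ins for cepstral vectors.

```python
import numpy as np

def fit_speaker_models(features_by_speaker):
    """Fit one diagonal-covariance Gaussian per speaker (an assumed model)."""
    models = []
    for frames in features_by_speaker:  # frames: (n_frames, n_dims)
        mean = frames.mean(axis=0)
        var = frames.var(axis=0) + 1e-6  # small floor avoids division by zero
        models.append((mean, var))
    return models

def log_likelihood(frames, mean, var):
    """Sum of per-frame Gaussian log-densities, frames treated as independent."""
    diff = frames - mean
    per_frame = -0.5 * (np.log(2.0 * np.pi * var) + diff**2 / var).sum(axis=1)
    return per_frame.sum()

def identify_speaker(frames, models, priors=None):
    """M-ary MAP decision: choose the hypothesis with the largest log-posterior."""
    m = len(models)
    if priors is None:
        priors = np.full(m, 1.0 / m)  # equal priors reduce MAP to maximum likelihood
    scores = np.array([
        log_likelihood(frames, mean, var) + np.log(p)
        for (mean, var), p in zip(models, priors)
    ])
    return int(np.argmax(scores))

# Synthetic data for three hypothetical speakers (not from the report).
rng = np.random.default_rng(0)
true_means = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
train = [rng.normal(mu, 1.0, size=(200, 2)) for mu in true_means]
models = fit_speaker_models(train)

# Frames drawn from speaker index 1 should be attributed to that speaker.
test_frames = rng.normal(true_means[1], 1.0, size=(50, 2))
predicted = identify_speaker(test_frames, models)
```

With M speakers, the rule evaluates M hypotheses and selects the one with the highest posterior; adding noise to `test_frames` is one way to probe how a given feature set degrades under this decision rule.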

URI (handle)

http://hdl.handle.net/1903/5898

Collections

Institute for Systems Research Technical Reports
