Maximum Likelihood Pitch Estimation Using Sinusoidal Modeling
Abstract
The aim of the work presented in this thesis is to automatically extract the fundamental frequency of a periodic signal from noisy observations, a task commonly referred to as pitch estimation. An algorithm for optimal pitch estimation using a maximum likelihood formulation is presented. The speech waveform is modeled using sinusoidal basis functions that are harmonically tied together to explicitly capture the periodic structure of voiced speech. Pitch estimation is cast as a model selection problem, and the Akaike Information Criterion is used to estimate the pitch. The algorithm is compared with several existing pitch detection algorithms (PDAs) on a reference pitch database. The results indicate that the algorithm outperforms most of these PDAs. The application of parametric modeling to single-channel speech segregation and the use of mel-frequency cepstral coefficients for sequential grouping are analyzed on the speech separation challenge database.
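The idea of tying sinusoids to a common fundamental and choosing among candidate models with the Akaike Information Criterion can be illustrated with a minimal sketch. It is not the algorithm of the thesis: it assumes white Gaussian noise (so the maximum likelihood amplitude fit reduces to least squares), a grid search over candidate pitches, and a parameter count of 2K + 2 for K harmonics. The function names, the test signal, and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np

def harmonic_basis(f0, n_harmonics, n_samples, fs):
    """Cosine and sine columns at f0, 2*f0, ..., n_harmonics*f0."""
    t = np.arange(n_samples) / fs
    k = np.arange(1, n_harmonics + 1)
    phase = 2 * np.pi * f0 * np.outer(t, k)
    return np.hstack([np.cos(phase), np.sin(phase)])

def estimate_pitch(x, fs, f0_grid, max_harmonics=10):
    """Return (f0, n_harmonics) minimizing AIC over the candidate grid.

    Assumes white Gaussian noise, so -2 log L = n * log(RSS / n) + const.
    """
    n = len(x)
    best_aic, best_f0, best_k = np.inf, None, None
    for f0 in f0_grid:
        # Keep every harmonic strictly below the Nyquist frequency.
        k_max = min(max_harmonics, int(np.ceil(fs / (2 * f0))) - 1)
        for k in range(1, k_max + 1):
            basis = harmonic_basis(f0, k, n, fs)
            coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
            rss = max(np.sum((x - basis @ coef) ** 2), np.finfo(float).tiny)
            # Assumed parameter count: 2k amplitudes, f0, noise variance.
            aic = n * np.log(rss / n) + 2 * (2 * k + 2)
            if aic < best_aic:
                best_aic, best_f0, best_k = aic, float(f0), k
    return best_f0, best_k

# Hypothetical test signal: 200 Hz fundamental, three harmonics, added noise.
fs = 8000
t = np.arange(400) / fs
rng = np.random.default_rng(0)
x = (1.0 * np.cos(2 * np.pi * 200 * t)
     + 0.6 * np.cos(2 * np.pi * 400 * t + 0.5)
     + 0.3 * np.cos(2 * np.pi * 600 * t + 1.0)
     + 0.1 * rng.standard_normal(t.size))

f0_hat, k_hat = estimate_pitch(x, fs, np.arange(120.0, 401.0, 1.0))
print(f0_hat, k_hat)
```

The AIC penalty term is what keeps the search from always preferring the model with the most harmonics: extra sinusoids lower the residual slightly by fitting noise, but each one adds two parameters. The candidate grid here starts above half the true pitch, so the subharmonic at 100 Hz is never evaluated; a wider search range would rely on the penalty alone to reject it.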