University of Maryland Libraries
Digital Repository at the University of Maryland

    Robust Speech Recognition by Topology Preserving Adaptation

    PhD_2000-4.pdf (5.117Mb)
    No. of downloads: 489

    Date
    2000
    Author
    Sonmez, Kemal S.
    Advisor
    Baras, John S.
    Abstract
    The performance degradation as a result of acoustical environment mismatch remains an important practical problem in speech recognition. The problem carries a greater significance in applications over telecommunication channels, especially with the wider use of personal communications systems such as cellular phones which invariably present challenging acoustical conditions. Such conditions are difficult to model analytically for a general speech representation, and most existing data-driven models require simultaneous ("stereo") recordings of training and testing environments, impractical to collect in most cases of interest.

    In this dissertation, we propose an invariance principle for non-parametric speech representations in acoustical environments. We stipulate that the topology of the codevectors in a vector quantization (VQ) codebook as defined in terms of class posterior distributions will be preserved in a certain information-theoretic sense, and make this invariance principle our basis in deriving normalization algorithms that correct for the acoustical mismatch between environments.

    We develop topology preserving algorithms in two frameworks, constrained distortion minimization (VQ with a topology preservation constraint) and information geometry (alternating minimization with a topology preservation constraint) and show their equivalence. Finally, we report results on the Wall Street Journal data, the Spoken Speed Dial corpus and the TI Cellular Corpus.

    The algorithm is shown to improve performance significantly in all three tasks, most notably in the more difficult problem of cellular hands free microphone speech where the technique decreases the word error for continuous ten digit recognition from 23.8% to 13.6% and the speaker dependent voice calling sentence error from 16.5% to 10.6%.
    URI
    http://hdl.handle.net/1903/6147
    Collections
    • Institute for Systems Research Technical Reports

    Related items

    Showing items related by title, author, creator and subject.

    • Automatic Speech Codec Identification with Applications to Tampering Detection of Speech Recordings 

      Zhou, Jingting (2011)
      In this work many versions of CELP codecs are explored, and an observation is made that different codebooks are used to encode noisy part of residual. Taking advantage of noise patterns they generated, an algorithm was ...
    • Representation of speech in the primary auditory cortex and its implications for robust speech processing 

      Mesgarani, Nima (2008-08-05)
      Speech has evolved as a primary form of communication between humans. This most used means of communication has been the subject of intense study for years, but there is still a lot that we do not know about it. It is an ...
    • Discrimination of Speech From Non-Speech Based on Multiscale Spectro-Temporal Modulations 

      Mesgarani, Nima (2005-05-16)
      We describe a content-based audio classification algorithm based on novel multiscale spectrotemporal modulation features inspired by a model of auditory cortical processing. The task explored is to discriminate speech from ...

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility
     

     
