Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
4 results
Search Results
Item Speech Segregation and Representation in the Ferret Auditory and Frontal Cortices (2022) Joshi, Neha Hemant; Shamma, Shihab; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The problem of separating overlapping streams of sound and selectively attending to a sound of interest is ubiquitous in humans and animals alike as a means for survival and communication. This problem, known as the cocktail party problem, is the focus of this thesis, where we explore the neural correlates of two-speaker segregation in the auditory and frontal cortex, using the ferret as an animal model. While speech segregation has been studied extensively in humans using various non-invasive imaging as well as some restricted invasive techniques, these do not provide a way to obtain neural data at the single-unit level. In animal models, streaming studies have been limited to simple stimuli like tone streams, or sound in noise. In this thesis, we extend this work to understand how complex auditory stimuli such as human speech are encoded at the single-unit and population level in both the auditory cortex and the frontal cortex of the ferret. In the first part of the thesis, we explore current literature in auditory streaming and design a behavioral task using the ferret as an animal model to perform stream segregation. We train ferrets to selectively listen to one speaker over another, and perform a task to indicate detection of the attended speaker. We show the validity of this behavioral task, and the reliability with which the animal performs this task of two-speaker stream segregation. In the second part, we collect neurophysiological data which is post-processed to obtain data from single units in both the auditory cortex (the primary auditory cortex, and the secondary region which includes the dorsal posterior ectosylvian gyrus) as well as the dorsolateral aspect of the frontal cortex of the ferret.
We analyse the data and present findings of how the auditory and frontal cortices encode the information required to reliably segregate the speaker of relevance from the mixture of two speakers, and the insights provided into stream segregation mechanisms and the cocktail party problem as solved by animals, using neural decoding approaches. We finally demonstrate that stream segregation has already begun at the level of the primary auditory cortex. In agreement with previous attention-modulated neural studies in the auditory cortex, we show that this stream segregation is more pronounced in the secondary cortex, where we see clear enhancement of the attended speaker, and suppression of the unattended speaker. We explore the contribution of various areas within the primary and secondary regions, and how it relates to speaker selectivity of individual neuronal units. We also study the neural encoding of top-down attention modulation in the ferret frontal cortex. Finally, we discuss the conclusions from these results in the broader context of their relevance to the field, and what future directions they may hold for the field.
Item MEG, PSYCHOPHYSICAL AND COMPUTATIONAL STUDIES OF LOUDNESS, TIMBRE, AND AUDIOVISUAL INTEGRATION (2011) Jenkins III, Julian; Poeppel, David; Biology; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Natural scenes and ecological signals are inherently complex and understanding of their perception and processing is incomplete. For example, a speech signal contains not only information at various frequencies, but is also not static; the signal is concurrently modulated temporally. In addition, an auditory signal may be paired with additional sensory information, as in the case of audiovisual speech.
In order to make sense of the signal, a human observer must process the information provided by low-level sensory systems and integrate it across sensory modalities and with cognitive information (e.g., object identification information, phonetic information). The observer must then create functional relationships between the signals encountered to form a coherent percept. The neuronal and cognitive mechanisms underlying this integration can be quantified in several ways: by taking physiological measurements, assessing behavioral output for a given task and modeling signal relationships. While ecological tokens are complex in a way that exceeds our current understanding, progress can be made by utilizing synthetic signals that encompass specific essential features of ecological signals. The experiments presented here cover five aspects of complex signal processing using approximations of ecological signals: (i) auditory integration of complex tones comprised of different frequencies and component power levels; (ii) audiovisual integration approximating that of human speech; (iii) behavioral measurement of signal discrimination; (iv) signal classification via simple computational analyses; and (v) neuronal processing of synthesized auditory signals approximating speech tokens. To investigate neuronal processing, magnetoencephalography (MEG) is employed to assess cortical processing non-invasively. Behavioral measures are employed to evaluate observer acuity in signal discrimination and to test the limits of perceptual resolution. Computational methods are used to examine the relationships in perceptual space and physiological processing between synthetic auditory signals, using features of the signals themselves as well as biologically-motivated models of auditory representation.
Together, the various methodologies and experimental paradigms advance the understanding of ecological signal analytics concerning the complex interactions in ecological signal structure.
Item Representation of speech in the primary auditory cortex and its implications for robust speech processing (2008-08-05) Mesgarani, Nima; Shamma, Shihab; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Speech has evolved as a primary form of communication between humans. This most used means of communication has been the subject of intense study for years, but there is still a lot that we do not know about it. It is an oft-repeated fact that even the performance of the best speech processing algorithms still lags far behind that of the average human. It seems inescapable that unless we know more about the way the brain performs this task, our machines cannot go much further. This thesis focuses on the question of speech representation in the brain, both from a physiological and technological perspective. We explore the representation of speech through the encoding of its smallest elements - phonemic features - in the primary auditory cortex. We report on how populations of neurons with diverse tuning properties respond discriminately to phonemes, resulting in explicit encoding of their parameters. Next, we show that this sparse encoding of the phonemic features is a simple consequence of the linear spectro-temporal properties of the auditory cortical neurons and that a spectro-temporal receptive field model can predict similar patterns of activation. This is an important step toward the realization of systems that operate based on the same principles as the cortex. Using an inverse method of reconstruction, we shall also explore the extent to which phonemic features are preserved in the cortical representation of noisy speech.
The results suggest that the cortical responses are more robust to noise and that the important features of phonemes are preserved in the cortical representation even in noise. Finally, we explain how a model of this cortical representation can be used for speech processing and enhancement applications to improve their robustness and performance.
Item The Wheel of Language: Representing Speech in Middle English Narrative, 1377-1422 (2008-04-22) Coley, David Kennedy; Coletti, Theresa M; English Language and Literature; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation examines representations of speech in narrative poetry in English between 1377 and 1422, a four-and-a-half decade span marked by almost constant political, religious, social and economic upheaval. By analyzing the work that late medieval writers imagined the spoken word to perform - or, alternately, by examining how speech acts functioned performatively in medieval literary discourse - the author demonstrates how the spoken word functioned as a defining link between the Middle English text and the cultural tumult of the late medieval period. More important, by focusing on speech as a distinct category within linguistic discourse, the study allows for a reappraisal of the complicated relationships between text and cultural environment that have been illuminated by scholarship on the politics of vernacularity and the development of the English language. Chapter one uses The Manciple's Tale to probe Chaucer's engagement with the nominalist philosophy of William of Ockham, a philosophy which opposed the via antiqua and threatened to overturn the linguistic, epistemological, and ontological hierarchies that had prevailed in various forms since the writings of Augustine of Hippo. Chapter two analyzes representations of sacramental and priestly speech in the anonymous Saint Erkenwald.
By doing so, it redirects the critical conversation about the poem away from the role of baptism in redeeming the righteous heathen and toward the eucharistic theology that undergirds it, a critical shift that extends our understanding of the poem's engagement with the emerging Wycliffite heresy and with typological notions of medieval Christian identity. Chapter three focuses on the works of Thomas Hoccleve, fifteenth-century Privy Seal clerk and would-be court poet. By examining the overtly performative speech acts in Hoccleve's Marian lyrics, particularly "The Story of The Monk Who Clad the Virgin," it establishes the existence of an idiosyncratic economy of speech within the poet's canon, an economy that becomes paradigmatic for the mingled systems of monetary and interpersonal exchange that prevailed in the Lancastrian dynasty's early decades.