Linguistics

Permanent URI for this community: http://hdl.handle.net/1903/2255

Search Results

Now showing 1 - 5 of 5
  • Item
    MODELING ADAPTABILITY MECHANISMS OF SPEECH PERCEPTION
    (2024) Jurov, Nika; Feldman, Naomi H.; Idsardi, William; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Speech is a complex, redundant, and variable signal produced in a noisy, ever-changing world. How do listeners navigate these complex auditory scenes and continuously, effortlessly understand most of the speakers around them? Studies show that listeners can quickly adapt to new situations, accents, and even distorted speech. Although prior research has established that listeners rely more on some speech cues (also called features or dimensions) than others, it is not yet understood how listeners weight cues flexibly on a moment-to-moment basis when the input deviates from standard speech. This thesis computationally explores flexible cue re-weighting as an adaptation mechanism, using real speech corpora. Its computational framework is rate-distortion theory, which models a channel optimized on a trade-off between distortion and rate: on the one hand, the input signal should be reconstructed with minimal error after it passes through the channel; on the other, the channel must extract parsimonious information from the incoming data. Such a channel can be implemented as a neural network with a beta variational autoencoder. We use this model to show that two mechanistic components are needed for adaptation: focus and switch. First, we show that focus on a cue mimics human behavior better than cue weights that simply depend on long-term statistics, as prior research has largely assumed. Second, we present a new model that can quickly adapt and switch feature weightings depending on the input at a particular moment. The model's flexibility comes from implementing a cognitive mechanism known as "selective attention" with multiple encoders: each encoder serves as a focus on a different part of the signal, and we can then choose how much to rely on each focus at each moment. Finally, we ask whether cue weighting is informed by the ability to separate noise from speech. To this end, we adapt adversarial feature-disentanglement training from vision to disentangle speech (noise) features from noise (speech) labels. Although this does not yield human-like cue-weighting behavior, disentanglement does lead the model to weight spectral information slightly more than temporal information relative to the baselines. Overall, this thesis explores adaptation computationally and offers a possible mechanistic explanation for "selective attention" via focus and switch mechanisms based on rate-distortion theory. It also argues that cue weighting cannot be determined solely from speech carefully articulated in laboratories or in quiet. Lastly, it explores a way to inform speech models from a cognitive angle, making them more flexible and robust, as human speech perception is.
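The rate-distortion trade-off the abstract describes can be sketched as a toy beta-VAE objective. The arrays, dimensions, and beta value below are invented for illustration; they are not taken from the thesis.

```python
import numpy as np

def beta_vae_loss(x, x_hat, mu, log_var, beta=4.0):
    """Toy rate-distortion objective of a beta-VAE.

    Distortion: squared reconstruction error between input x and output x_hat.
    Rate: KL divergence of the Gaussian posterior q(z|x) = N(mu, exp(log_var))
    from a standard-normal prior, scaled by beta. Larger beta forces a more
    parsimonious (lower-rate) code at the cost of reconstruction accuracy.
    """
    distortion = np.sum((x - x_hat) ** 2)
    rate = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return distortion + beta * rate

# Invented example values: a 3-d input and a 2-d latent posterior.
x = np.array([0.2, -0.5, 1.0])
x_hat = np.array([0.1, -0.4, 0.9])
mu = np.array([0.0, 0.3])
log_var = np.array([0.0, -0.2])
loss = beta_vae_loss(x, x_hat, mu, log_var, beta=4.0)
```

Raising `beta` penalizes the rate term more heavily, which is one way to operationalize the trade-off between faithful reconstruction and parsimonious encoding.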
  • Item
    Relating lexical and syntactic processes in language: Bridging research in humans and machines
    (2018) Ettinger, Allyson; Phillips, Colin; Resnik, Philip; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The potential to bridge research on language in humans and machines is substantial: linguists and cognitive scientists apply scientific theory and methods to understand how language is processed and represented by humans, while computer scientists apply computational methods to determine how to process and represent language in machines. The present work integrates approaches from each of these domains to tackle an issue relevant to both: the nature of the relationship between low-level lexical processes and syntactically-driven interpretation processes. In the first part of the dissertation, this distinction between lexical and syntactic processes centers on understanding asyntactic lexical effects in online sentence comprehension in humans, and the relationship of those effects to syntactically-driven interpretation processes; I draw on computational methods for simulating these lexical effects and their relationship to interpretation processes. In the latter part of the dissertation, the lexical/syntactic distinction focuses on the application of semantic composition to complex lexical content for the derivation of sentence meaning. For this work I draw on methodology from cognitive neuroscience and linguistics to analyze the capacity of natural language processing systems to perform vector-based sentence composition, in order to improve models' capacity to compose and represent sentence meaning.
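One common baseline for vector-based sentence composition of the kind analyzed in such work is simple averaging of word vectors. The vectors below are invented toy values, and the abstract does not specify which composition functions were actually tested; this only illustrates the general setup and one of its known limits.

```python
import numpy as np

# Toy word vectors (invented values; real systems learn these from corpora).
vectors = {
    "dogs":  np.array([0.9, 0.1, 0.0]),
    "chase": np.array([0.2, 0.8, 0.1]),
    "cats":  np.array([0.8, 0.2, 0.1]),
}

def compose_average(sentence):
    """Averaging composition: a standard baseline for vector-based
    sentence meaning that ignores word order and syntax."""
    return np.mean([vectors[w] for w in sentence.split()], axis=0)

s1 = compose_average("dogs chase cats")
s2 = compose_average("cats chase dogs")
# Averaging is order-insensitive, so both orders receive the same vector,
# illustrating one property that composition analyses probe for.
```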
  • Item
    Information and Incrementality in Syntactic Bootstrapping
    (2015) White, Aaron Steven; Hacquard, Valentine; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Some words are harder to learn than others. For instance, action verbs like "run" and "hit" are learned earlier than propositional attitude verbs like "think" and "want." One reason "think" and "want" might be learned later is that, whereas we can see and hear running and hitting, we can't see or hear thinking and wanting. Children nevertheless learn these verbs, so a route other than the senses must exist. There is mounting evidence that this route involves, in large part, inferences based on the distribution of syntactic contexts a propositional attitude verb occurs in---a process known as "syntactic bootstrapping." This fact makes the domain of propositional attitude verbs a prime proving ground for models of syntactic bootstrapping. With this in mind, this dissertation has two goals: on the one hand, it aims to construct a computational model of syntactic bootstrapping; on the other, it aims to use this model to investigate the limits on the amount of information about propositional attitude verb meanings that can be gleaned from syntactic distributions. I show throughout the dissertation that these goals are mutually supportive. In Chapter 1, I set out the main problems that drive the investigation. In Chapters 2 and 3, I use both psycholinguistic experiments and computational modeling to establish that there is a significant amount of semantic information carried in both participants' syntactic acceptability judgments and syntactic distributions in corpora. To investigate the nature of this relationship I develop two computational models: (i) a nonnegative model of (semantic-to-syntactic) projection and (ii) a nonnegative model of syntactic bootstrapping. In Chapter 4, I use a novel variant of the Human Simulation Paradigm to show that the information carried in syntactic distribution is actually utilized by (simulated) learners. 
In Chapter 5, I present a proposal for how to solve a standing problem in how syntactic bootstrapping accounts for certain kinds of cross-linguistic variation. And in Chapter 6, I conclude with future directions for this work.
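The distributional signal that syntactic bootstrapping relies on can be illustrated with toy frame counts. The verbs' counts and frame labels below are invented for illustration and are not the dissertation's data or model.

```python
# Invented counts of syntactic frames per verb, as might be extracted
# from a corpus. Attitude verbs like "think" and "want" favor clausal
# and infinitival frames; action verbs like "hit" favor direct objects.
frames = {
    "think": {"V that S": 80, "V NP": 5,  "V to VP": 15},
    "want":  {"V that S": 30, "V NP": 20, "V to VP": 50},
    "hit":   {"V that S": 0,  "V NP": 95, "V to VP": 5},
}

def frame_distribution(verb):
    """Normalize frame counts into a probability distribution: the
    signal a syntactic-bootstrapping learner conditions on."""
    counts = frames[verb]
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()}

def overlap(v1, v2):
    """Simple distributional similarity: shared probability mass."""
    d1, d2 = frame_distribution(v1), frame_distribution(v2)
    return sum(min(d1[f], d2[f]) for f in d1)
```

With these invented counts, `overlap("want", "think")` exceeds `overlap("want", "hit")`: the attitude verbs cluster together distributionally, which is the kind of syntactic evidence a learner could exploit when the verbs' meanings are not observable from the senses.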
  • Item
    Computational modeling of the role of discourse information in language production and language acquisition
    (2015) Orita, Naho; Feldman, Naomi H; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This dissertation explores the role of discourse information in language production and language acquisition. Discourse information plays an important role in various aspects of linguistic processes and learning. However, characterizing what it is and how it is used has been challenging. Previous studies on discourse tend to focus on correlations between certain discourse factors and speaker/comprehender behavior, rather than on how discourse information is used in the language system and why. This dissertation aims to provide novel insights into the role of discourse information by formalizing how it is represented and how it is used. First, I formalize the latent semantic information in humans' discourse representations by examining speakers' choices of referring expressions. Simulation results suggest that topic models can capture aspects of discourse representations that are relevant to the choices of referring expressions, beyond simple referent frequency. Second, I propose a language production model that extends the rational speech act model of Frank and Goodman (2012) to incorporate updates to listeners' beliefs as discourse proceeds. Simulations suggest that speakers' behavior can be modeled in a principled way by considering the probabilities of referents in the discourse and the information conveyed by each word. Third, I examine the role of discourse information in language acquisition, focusing on the learning of grammatical categories of pronouns. I show that a Bayesian model with prior discourse knowledge can accurately recover grammatical categories of pronouns, but that simply having strong syntactic prior knowledge is not sufficient. This suggests that discourse information can help learners acquire grammatical categories of pronouns. Throughout this dissertation, I propose frameworks for modeling speakers and learners using techniques from Bayesian modeling. These models provide ways to flexibly investigate the effects of various sources of information, including discourse salience, expectations about referents, and grammatical knowledge.
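The rational speech act model the abstract builds on can be sketched in a few lines. The lexicon, referents, and uniform prior below form a standard toy example in the style of Frank and Goodman (2012), not the dissertation's actual model, which additionally updates listener beliefs as the discourse proceeds.

```python
import numpy as np

# Toy lexicon: rows are words, columns are referents; 1 = word is true
# of the referent. Words and referents are invented for illustration.
words = ["glasses", "hat", "person"]
lexicon = np.array([
    [1, 1, 0],   # "glasses" is true of r1 and r2
    [0, 1, 1],   # "hat" is true of r2 and r3
    [1, 1, 1],   # "person" is true of everyone
], dtype=float)
prior = np.array([1/3, 1/3, 1/3])  # salience of each referent

# Literal listener: P(r | w) proportional to truth * prior.
literal = lexicon * prior
literal /= literal.sum(axis=1, keepdims=True)

# Pragmatic speaker: P(w | r) proportional to the literal listener's
# accuracy, so more informative words are preferred.
speaker = literal / literal.sum(axis=0, keepdims=True)

# Pragmatic listener: P(r | w) proportional to speaker * prior.
listener = speaker * prior
listener /= listener.sum(axis=1, keepdims=True)
```

Hearing "glasses", the pragmatic listener favors the referent for which "glasses" is most informative, even though the word is literally true of two referents; the dissertation's extension lets the prior itself evolve with the discourse.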
  • Item
    Structured local exponential models for machine translation
    (2010) Subotin, Michael; Resnik, Philip; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This thesis proposes a synthesis and generalization of local exponential translation models, the subclass of feature-rich translation models which associate probability distributions with individual rewrite rules used by the translation system, such as synchronous context-free rules, or with other individual aspects of translation hypotheses, such as word pairs or reordering events. Unlike other authors, we use these estimates to replace the traditional phrase models and lexical scores rather than to supplement them, thereby demonstrating that local exponential phrase models can be regarded as a generalization of standard methods not only in theoretical but also in practical terms. We further introduce a form of local translation models that combines features associated with surface forms of rules and features associated with less specific representations, including those based on lemmas, inflections, and reordering patterns, such that surface-form estimates are recovered as a special case of the model. Crucially, the proposed approach allows estimation of parameters for the latter type of features from training sets that include multiple source phrases, thereby overcoming an important training-set fragmentation problem which hampers previously proposed local translation models. These proposals are experimentally validated. Conditioning all phrase-based probabilities in a hierarchical phrase-based system on source-side contextual information produces significant performance improvements. Extending the contextually-sensitive estimates with features modeling source-side morphology and reordering patterns yields consistent additional improvements, while further experiments show significant improvements obtained from modeling observed and unobserved inflections for a morphologically rich target language.
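A local exponential (log-linear) distribution over the rewrite rules applicable to one source phrase, of the general kind the abstract describes, can be sketched as follows. The rule names, feature names, and weights are invented for illustration; real models learn the weights from parallel corpora.

```python
import math

# Invented weights for (rule, feature) pairs. Context features might
# encode source-side lemmas, morphology, or surrounding words.
weights = {
    ("rule_a", "ctx=finance"): 1.2,
    ("rule_b", "ctx=finance"): -0.4,
    ("rule_a", "lemma=bank"): 0.3,
    ("rule_b", "lemma=bank"): 0.5,
}

def rule_probs(rules, context_features):
    """p(rule | context) = exp(w . f(rule, context)) / Z: a local
    exponential model normalized over the applicable rules only."""
    scores = [
        sum(weights.get((r, f), 0.0) for f in context_features)
        for r in rules
    ]
    z = sum(math.exp(s) for s in scores)
    return [math.exp(s) / z for s in scores]

probs = rule_probs(["rule_a", "rule_b"], ["ctx=finance", "lemma=bank"])
```

Because the distribution is local to the rules competing for one source phrase, the normalizer stays small, which is what makes such feature-rich estimates practical as replacements for traditional phrase models.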