Linguistics Theses and Dissertations

Permanent URI for this collection: http://hdl.handle.net/1903/2787


Search Results

Now showing 1 - 10 of 11
  • MODELING ADAPTABILITY MECHANISMS OF SPEECH PERCEPTION
    (2024) Jurov, Nika; Feldman, Naomi H.; Idsardi, William; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Speech is a complex, redundant, and variable signal occurring in a noisy and ever-changing world. How do listeners navigate these complex auditory scenes and continuously, effortlessly understand most of the speakers around them? Studies show that listeners can quickly adapt to new situations, accents, and even distorted speech. Although prior research has established that listeners rely more on some speech cues (also called features or dimensions) than others, it is not yet understood how listeners weight those cues flexibly on a moment-to-moment basis when the input deviates from standard speech. This thesis computationally explores flexible cue re-weighting as an adaptation mechanism, using real speech corpora. The computational framework it relies on is rate-distortion theory. This framework models a channel that is optimized on a trade-off between distortion and rate: on the one hand, the input signal should be reconstructed with minimal error after it goes through the channel; on the other hand, the channel needs to extract parsimonious information from the incoming data. Such a channel can be implemented as a neural network with a beta variational autoencoder. We use this model to show that two mechanistic components are needed for adaptation: focus and switch. We first show that focus on a cue mimics human behavior better than cue weights that simply depend on long-term statistics, as has largely been assumed in prior research. Second, we present a new model that can quickly adapt and switch its feature weighting depending on the input at a particular moment. This model's flexibility comes from implementing a cognitive mechanism that has been called “selective attention” with multiple encoders. Each encoder serves as a focus on a different part of the signal. We can then choose how much to rely on each focus depending on the moment. Finally, we ask whether cue weighting is informed by the ability to separate noise from speech. To this end we adapt an adversarial feature-disentanglement training method from vision to disentangle speech (noise) features from noise (speech) labels. We show that although this does not yield human-like cue-weighting behavior, disentanglement does have an effect: the models weight spectral information slightly more than temporal information compared to the baselines. Overall, this thesis explores adaptation computationally and offers a possible mechanistic explanation for “selective attention” in terms of focus and switch mechanisms, based on rate-distortion theory. It also argues that cue weighting cannot be determined solely from speech carefully articulated in laboratories or in quiet. Lastly, it explores a way to inform speech models from a cognitive angle, making them more flexible and robust, as human speech perception is.
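
    The dissertation's code is not part of this abstract; purely as a hypothetical sketch of the rate-distortion objective it describes, the following implements a beta variational autoencoder loss in PyTorch. The network sizes, cue dimensionality, and all names are illustrative assumptions, not the thesis's actual model.

```python
# Hypothetical sketch: a rate-distortion channel as a beta-VAE.
# Illustrative assumptions throughout; not the dissertation's model.
import torch
import torch.nn as nn

class BetaVAE(nn.Module):
    def __init__(self, n_cues=40, n_latent=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_cues, 64), nn.ReLU())
        self.to_mu = nn.Linear(64, n_latent)       # posterior mean
        self.to_logvar = nn.Linear(64, n_latent)   # posterior log-variance
        self.decoder = nn.Sequential(nn.Linear(n_latent, 64), nn.ReLU(),
                                     nn.Linear(64, n_cues))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(z), mu, logvar

def rate_distortion_loss(x, x_hat, mu, logvar, beta=4.0):
    # Distortion: reconstruct the input signal with minimal error.
    distortion = ((x - x_hat) ** 2).sum(dim=1).mean()
    # Rate: KL divergence to a standard normal prior, i.e. how much
    # information the channel extracts from the incoming data.
    rate = (-0.5 * (1 + logvar - mu ** 2 - logvar.exp())).sum(dim=1).mean()
    return distortion + beta * rate  # beta sets the trade-off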
  • GENERATING AND MEASURING PREDICTIONS IN LANGUAGE PROCESSING
    (2023) Nakamura, Masato; Phillips, Colin; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Humans can comprehend utterances quickly, efficiently, and often robustly against noise in the inputs. Researchers have argued that such a remarkable ability is supported by prediction of upcoming inputs. If people use the context to infer what they will hear or see and prepare for likely inputs, they should be able to process the predicted inputs efficiently. This thesis investigates how contexts can predictively activate lexical representations (lexical pre-activation). I address two different aspects of prediction: (i) how pre-activation is generated using contextual information and stored knowledge, and (ii) how pre-activation is reflected in different measures. I first assess the linking hypothesis of the speeded cloze task, a measure of pre-activation, through computational simulations. I demonstrate that an earlier model accounts for qualitative patterns of human data but fails to predict quantitative patterns, and I argue that a model with an additional but reasonable assumption of lateral inhibition successfully explains these patterns. Building on the first study, I demonstrate that pre-activation measures fail to align with each other in cases called argument role reversals, even when the time courses and stimuli are carefully matched. The speeded cloze task shows that the “role-appropriate” serve in ... which customer the waitress had served is more strongly pre-activated than the “role-inappropriate” serve in ... which waitress the customer had served. On the other hand, the N400 amplitude, another pre-activation measure, shows no contrast between the role-appropriate and role-inappropriate serve. Accounting for such a mismatch between measures in argument role reversals provides insights into whether and how argument roles constrain pre-activation, as well as how different measures reflect pre-activation. Subsequent studies addressed whether pre-activation is sensitive to argument roles. Analyses of context-wise variability of role-inappropriate candidates suggest that there is some role-inappropriate pre-activation even in the speeded cloze task. The next study attempts to directly contrast pre-activation of role-appropriate and role-inappropriate candidates, eliminating the effect of later confounding processes through distributional analyses of reaction times. While one task suggests that role-appropriate candidates are more strongly pre-activated than role-inappropriate candidates, the other task suggests that they have matched pre-activation. Finally, I examine the influence of role-appropriate competitors on role-inappropriate competitors. The analyses of speeded cloze data suggest that N400 amplitudes can be sensitive to argument roles when there are strong role-appropriate competitors. This finding can be explained by general role-insensitivity plus partial role-sensitivity in pre-activation processes. Taken together, these studies suggest that pre-activation processes are generally insensitive to argument roles, but that some role-sensitive mechanisms can produce role-sensitivity in pre-activation measures under some circumstances.
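
    The abstract does not specify the simulations; purely as an illustration of what a lateral-inhibition assumption adds to a pre-activation model of the speeded cloze task, here is a minimal sketch in NumPy. The dynamics, parameters, and candidate values are all hypothetical.

```python
# Hypothetical sketch of candidate pre-activation with lateral inhibition,
# in the spirit of the model comparison described above; illustrative only.
import numpy as np

def settle(activation, inhibition=0.3, rate=0.1, steps=200):
    """Let candidate activations compete: each unit is driven by its
    contextual input and suppressed by the summed activation of rivals."""
    a = activation.copy()
    for _ in range(steps):
        competition = inhibition * (a.sum() - a)    # inhibition from competitors
        a += rate * (activation - competition - a)  # leaky settling dynamics
        a = np.clip(a, 0.0, None)                   # activations stay nonnegative
    return a / a.sum()  # production probabilities in the speeded cloze task

# Contextual input mildly favors one candidate; inhibition sharpens the
# winner, changing the quantitative shape of the response distribution.
print(settle(np.array([1.0, 0.8, 0.3])))
```

    Without the inhibition term, production probabilities simply mirror the contextual input; with it, strong candidates suppress weak ones, which is the kind of quantitative difference at stake in the model comparison above.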
  • All about alles: The syntax of wh-quantifier float in German
    (2021) Doliana, Aaron Gianmaria Gabriel; Lasnik, Howard; Hornstein, Norbert; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This thesis offers an in-depth investigation of “wh-quantifier float” of the quantifying particle ‘alles’ in German. 'Alles' (etymologically, ‘all’) appears in wh-questions like 'Wen alles hat die Mare eingeladen?' (‘Who-all did Mare invite?’). The thesis focuses on the syntactic distribution of 'alles', which enjoys a wide distribution in the clause: it can occur both ‘adjacent’ to its ‘associate’ wh-phrase and ‘distant’ from it, in various positions of the clause. I address three questions: What determines the distribution of 'alles'? Are adjacent 'alles' and ‘distal alles’ the same category? What licenses distal 'alles'? I answer these questions by arguing for a stranding analysis of distal 'alles': 'alles' and its associate form a first-Merge constituent, which is optionally separated in the course of the derivation through a process that involves movement ([WH alles] ⇒ [WH … [[WH alles] … ]]). The conclusion is compatible with prior analyses that argued for or assumed (a) constituency and (b) a movement dependency in overt syntax. It is at odds with adverbial analyses, which assume that distal 'alles' is an adverbial. I provide two main empirical arguments. First, I argue against the idea that distal 'alles' and adjacent 'alles' are separate lexical items or have different lexical content. Second, I argue that the “Chain Link Generalization” is the most accurate generalization for the distribution of 'alles': given a derivation involving 'alles' and a licit associate, 'alles' may appear in any position which hosts an A-bar chain link of the associate, and in no other position. I show that 'alles' has “no distribution of its own in the clause”. Rather, the distribution of 'alles' depends on the potential distribution of its associate and can be predicted from the associate’s category, the associate’s base-position, and the derivation that the associate undergoes in a given sentence. Conceptually, I argue that a stranding analysis is favored on grounds of simplicity, as most generalizations established in this dissertation are directly entailed by it.
  • The Psycho-logic of Universal Quantifiers
    (2021) Knowlton, Tyler Zarus; Lidz, Jeffrey; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    A universally quantified sentence like every frog is green is standardly thought to express a two-place second-order relation (e.g., the set of frogs is a subset of the set of green things). This dissertation argues that as a psychological hypothesis about how speakers mentally represent universal quantifiers, this view is wrong in two respects. First, each, every, and all are not represented as two-place relations, but as one-place descriptions of how a predicate applies to a restricted domain (e.g., relative to the frogs, everything is green). Second, while every and all are represented in a second-order way that implicates a group, each is represented in a completely first-order way that does not involve grouping the satisfiers of a predicate together (e.g., relative to individual frogs, each one is green). These “psycho-logical” distinctions have consequences for how participants evaluate sentences like every circle is green in controlled settings. In particular, participants represent the extension of the determiner’s internal argument (the circles), but not the extension of its external argument (the green things). Moreover, the cognitive system they use to represent the internal argument differs depending on the determiner: Given every or all, participants show signatures of forming ensemble representations, but given each, they represent individual object-files. In addition to psychosemantic evidence, the proposed representations provide explanations for at least two semantic phenomena. The first is the “conservativity” universal: All determiners allow for duplicating their first argument in their second argument without a change in informational significance (e.g., every fish swims has the same truth-conditions as every fish is a fish that swims). This is a puzzling generalization if determiners express two-place relations, but it is a logical consequence if they are devices for forming one-place restricted quantifiers. The second is that every, but not each, naturally invites certain kinds of generic interpretations (e.g., gravity acts on every/#each object). This asymmetry can potentially be explained by details of the interfacing cognitive systems (ensemble and object-file representations). And given that the difference leads to lower-level concomitants in child-ambient speech (as revealed by a corpus investigation), children may be able to leverage it to acquire every’s second-order meaning. This case study on the universal quantifiers suggests that knowing the meaning of a word like every consists not just in understanding the informational contribution that it makes, but in representing that contribution in a particular format. And much like phonological representations provide instructions to the motor planning system, it supports the idea that meaning representations provide (sometimes surprisingly precise) instructions to conceptual systems.
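
    To make the two analyses concrete, here is one conventional way to write them down; the notation is my rendering of the contrast described above, not the dissertation's.

```latex
% Standard view: "every" as a two-place second-order relation between sets
\[
\textit{every frog is green} \;\rightsquigarrow\;
\textsc{every}(\mathrm{frog}, \mathrm{green}) \iff
\{x : \mathrm{frog}(x)\} \subseteq \{x : \mathrm{green}(x)\}
\]
% Proposed view: a one-place predicate evaluated over a restricted domain
\[
\textit{every frog is green} \;\rightsquigarrow\;
[\forall x : \mathrm{frog}(x)]\; \mathrm{green}(x)
\]
% Conservativity then follows: within the domain restricted to the frogs,
% "is a frog that is green" and "is green" are the same predicate
\[
[\forall x : \mathrm{frog}(x)]\; \mathrm{green}(x) \;\equiv\;
[\forall x : \mathrm{frog}(x)]\;(\mathrm{frog}(x) \wedge \mathrm{green}(x))
\]
```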
  • Information and Incrementality in Syntactic Bootstrapping
    (2015) White, Aaron Steven; Hacquard, Valentine; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Some words are harder to learn than others. For instance, action verbs like "run" and "hit" are learned earlier than propositional attitude verbs like "think" and "want." One reason "think" and "want" might be learned later is that, whereas we can see and hear running and hitting, we can't see or hear thinking and wanting. Children nevertheless learn these verbs, so a route other than the senses must exist. There is mounting evidence that this route involves, in large part, inferences based on the distribution of syntactic contexts a propositional attitude verb occurs in---a process known as "syntactic bootstrapping." This fact makes the domain of propositional attitude verbs a prime proving ground for models of syntactic bootstrapping. With this in mind, this dissertation has two goals: on the one hand, it aims to construct a computational model of syntactic bootstrapping; on the other, it aims to use this model to investigate the limits on the amount of information about propositional attitude verb meanings that can be gleaned from syntactic distributions. I show throughout the dissertation that these goals are mutually supportive. In Chapter 1, I set out the main problems that drive the investigation. In Chapters 2 and 3, I use both psycholinguistic experiments and computational modeling to establish that there is a significant amount of semantic information carried in both participants' syntactic acceptability judgments and syntactic distributions in corpora. To investigate the nature of this relationship I develop two computational models: (i) a nonnegative model of (semantic-to-syntactic) projection and (ii) a nonnegative model of syntactic bootstrapping. In Chapter 4, I use a novel variant of the Human Simulation Paradigm to show that the information carried in syntactic distribution is actually utilized by (simulated) learners. In Chapter 5, I present a proposal for how to solve a standing problem in how syntactic bootstrapping accounts for certain kinds of cross-linguistic variation. And in Chapter 6, I conclude with future directions for this work.
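
    The abstract names nonnegative models of projection and of syntactic bootstrapping without giving their details; purely as an illustrative stand-in for the general idea, the sketch below factors a toy verb-by-frame count matrix with nonnegative matrix factorization in scikit-learn. The verbs, frames, counts, and component count are all invented for the example.

```python
# Hypothetical illustration of a nonnegative syntactic-bootstrapping model:
# factor verb-by-frame co-occurrence counts into latent semantic components.
# Not the dissertation's actual model.
import numpy as np
from sklearn.decomposition import NMF

verbs = ["think", "want", "hit", "run"]
frames = ["__ that S", "__ NP to VP", "__ NP", "__ (intrans.)"]
# Toy counts of each verb in each syntactic frame (rows: verbs).
counts = np.array([
    [90,  2,  5,  3],   # "think" mostly takes finite clauses
    [ 4, 80, 10,  6],   # "want" mostly takes infinitives
    [ 1,  3, 85, 11],   # "hit" mostly takes direct objects
    [ 2,  1,  9, 88],   # "run" is mostly intransitive
])

model = NMF(n_components=2, init="nndsvda", random_state=0)
verb_semantics = model.fit_transform(counts)  # verbs x latent components
component_frames = model.components_          # components x frames

# A learner-style inference: attitude verbs load on the clause-taking
# component, action verbs on the NP/intransitive component.
for verb, loadings in zip(verbs, verb_semantics):
    print(verb, np.round(loadings, 2))
```

    One standard motivation for the nonnegativity constraint is interpretability: both factors can only add amounts of semantic or syntactic signal, never cancel it out.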
  • Bayesian Model of Categorical Effects in L1 and L2 Speech Processing
    (2014) Kronrod, Yakov; Feldman, Naomi; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In this dissertation I present a model that captures categorical effects in both first language (L1) and second language (L2) speech perception. In L1 perception, categorical effects range between extremely strong for consonants and nearly continuous perception of vowels. I treat the problem of speech perception as a statistical inference problem, and by quantifying categoricity I obtain a unified model of both strong and weak categorical effects. In this optimal inference mechanism, the listener uses their knowledge of categories and the acoustics of the signal to infer the intended productions of the speaker. The model splits speech variability into meaningful category variance and perceptual noise variance. The ratio of these two variances, which I call Tau, directly correlates with the degree of categorical effects for a given phoneme or continuum. By fitting the model to behavioral data from different phonemes, I show how a single parametric quantitative variation can lead to the different degrees of categorical effects seen in perception experiments with different phonemes. In L2 perception, L1 categories have been shown to exert an effect on how L2 sounds are identified and how well the listener is able to discriminate them. Various models have been developed to relate the state of L1 categories to both the initial and eventual ability to process the L2. These models largely lacked a formalized metric to measure perceptual distance, a means of making a priori predictions of behavior for a new contrast, and a way of describing non-discrete gradient effects. In the second part of my dissertation, I apply the same computational model that I used to unify L1 categorical effects to examining L2 perception. I show that we can use the model to make the same type of predictions as other SLA models, while also providing a quantitative framework that formalizes all measures of similarity and bias. Further, I show how, by using this model to consider L2 learners at different stages of development, we can track specific parameters of categories as they change over time, giving us a look into the actual process of L2 category development.
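
    As a minimal sketch of the inference described above, assuming Gaussian categories and Gaussian perceptual noise (my assumption for the illustration, though standard in this literature), the perceived value is a posterior expectation that mixes the signal with the category mean, and the variance ratio Tau controls how strongly percepts are drawn toward the category.

```python
# Minimal sketch of categorical effects as optimal inference, assuming
# Gaussian categories and Gaussian perceptual noise; illustrative only.
import numpy as np

def perceived(signal, category_mean, var_category, var_noise):
    """Posterior expectation of the speaker's intended production:
    a variance-weighted average of the signal and the category mean."""
    w = var_category / (var_category + var_noise)
    return w * signal + (1 - w) * category_mean

signal, category_mean, var_noise = 1.0, 0.0, 1.0
# Tau = category variance / noise variance.  Varying this single parameter
# moves the same mechanism between strong warping toward the category mean
# and nearly continuous (veridical) perception.
for tau in (0.2, 1.0, 5.0):
    value = perceived(signal, category_mean, tau * var_noise, var_noise)
    print(f"Tau = {tau}: perceived value {value:.2f}")
```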
  • Pragmatic enrichment in language processing and development
    (2013) Lewis, Shevaun; Phillips, Colin; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The goal of language comprehension for humans is not just to decode the semantic content of sentences, but rather to grasp what speakers intend to communicate. To infer speaker meaning, listeners must at minimum assess whether and how the literal meaning of an utterance addresses a question under discussion in the conversation. In cases of implicature, where the speaker intends to communicate more than just the literal meaning, listeners must access additional relevant information in order to understand the intended contribution of the utterance. I argue that the primary challenge for inferring speaker meaning is in identifying and accessing this relevant contextual information. In this dissertation, I integrate evidence from several different types of implicature to argue that both adults and children are able to execute complex pragmatic inferences relatively efficiently, but encounter some difficulty finding what is relevant in context. I argue that the variability observed in processing costs associated with adults' computation of scalar implicatures can be better understood by examining how the critical contextual information is presented in the discourse context. I show that children's oft-cited hyper-literal interpretation style is limited to scalar quantifiers. Even 3-year-olds are adept at understanding indirect requests and "parenthetical" readings of belief reports. Their ability to infer speaker meanings is limited only by their relative inexperience in conversation and lack of world knowledge.
  • Respecting Relations: Memory Access and Antecedent Retrieval in Incremental Sentence Processing
    (2013) Kush, Dave W; Phillips, Colin; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This dissertation uses the processing of anaphoric relations to probe how linguistic information is encoded in and retrieved from memory during real-time sentence comprehension. More specifically, the dissertation attempts to resolve a tension between the demands of a linguistic processor implemented in a general-purpose cognitive architecture and the demands of abstract grammatical constraints that govern language use. The source of the tension is the role that abstract configurational relations (such as c-command, Reinhart 1983) play in constraining computations. Anaphoric dependencies are governed by formal grammatical constraints stated in terms of relations. For example, Binding Principle A (Chomsky 1981) requires that antecedents for local anaphors (like the English reciprocal each other) bear the c-command relation to those anaphors. In incremental sentence processing, antecedents of anaphors must be retrieved from memory. Recent research has motivated a model of processing that exploits a cue-based, associative retrieval process in content-addressable memory (e.g. Lewis, Vasishth & Van Dyke 2006) in which relations such as c-command are difficult to use as cues for retrieval. As such, the c-command constraints of formal grammars are predicted to be poorly implemented by the retrieval mechanism. I examine retrieval's sensitivity to three constraints on anaphoric dependencies: Principle A (via Hindi local reciprocal licensing), the Scope Constraint on bound-variable pronoun licensing (often stated as a c-command constraint, though see Barker 2012), and Crossover constraints on pronominal binding (Postal 1971, Wasow 1972). The data suggest that retrieval exhibits fidelity to the constraints: structurally inaccessible NPs that match an anaphoric element in morphological features do not interfere with the retrieval of an antecedent in most cases considered. In spite of this alignment, I argue that retrieval's apparent sensitivity to c-command constraints need not motivate a memory access procedure that makes direct reference to c-command relations. Instead, proxy features and general parsing operations conspire to mimic the extension of a system that respects c-command constraints. These strategies provide a robust approximation of grammatical performance while remaining within the confines of an independently motivated general-purpose cognitive architecture.
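
    As a toy illustration of cue-based associative retrieval in a content-addressable memory (my sketch in the spirit of the models cited above, not the dissertation's implementation), each memory item is scored by how many retrieval cues it matches, with no direct access to c-command.

```python
# Hypothetical sketch of cue-based retrieval in content-addressable memory;
# illustrative only.  Relational notions like c-command have no natural
# place in the feature lists, so "subject" acts as a proxy feature here.
import numpy as np

def retrieve(cues, items):
    """Score each memory item by the number of retrieval cues it matches;
    inaccessible items can still compete if they match in features."""
    scored = []
    for name, features in items:
        match = sum(features.get(cue) == value for cue, value in cues.items())
        scored.append((name, match))
    total = sum(2.0 ** m for _, m in scored)  # soft, similarity-based competition
    return [(name, 2.0 ** m / total) for name, m in scored]

# Retrieval cues for a reciprocal anaphor: plural number, subject role.
cues = {"number": "plural", "role": "subject"}
items = [
    ("the lawyers (accessible antecedent)", {"number": "plural", "role": "subject"}),
    ("the clients (inaccessible NP)",       {"number": "plural", "role": "object"}),
]
for name, p in retrieve(cues, items):
    print(f"{name}: retrieval probability {p:.2f}")
```

    The point of the sketch is the limitation described above: such a mechanism can only approximate c-command constraints through proxy features, which is why its apparent grammatical fidelity calls for explanation.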
  • The Temporal Dimension of Linguistic Prediction
    (2013) Chow, Wing Yee; Phillips, Colin; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This thesis explores how predictions about upcoming language inputs are computed during real-time language comprehension. Previous research has demonstrated humans' ability to use rich contextual information to compute linguistic prediction during real-time language comprehension, and it has been widely assumed that contextual information can impact linguistic prediction as soon as it arises in the input. This thesis questions this key assumption and explores how linguistic predictions develop in real-time. I provide event-related potential (ERP) and reading eye-movement (EM) evidence from studies in Mandarin Chinese and English that even prominent and unambiguous information about preverbal arguments' structural roles cannot immediately impact comprehenders' verb prediction. I demonstrate that the N400, an ERP response that is modulated by a word's predictability, becomes sensitive to argument role-reversals only when the time interval for prediction is widened. Further, I provide initial evidence that different sources of contextual information, namely, information about preverbal arguments' lexical identity vs. their structural roles, may impact linguistic prediction on different time scales. I put forth a research framework that aims to characterize the mental computations underlying linguistic prediction along a temporal dimension.
  • Statistical Knowledge and Learning in Phonology
    (2013) Dunbar, Ewan; Idsardi, William J; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This thesis deals with the theory of the phonetic component of grammar in a formal probabilistic inference framework: (1) it has been recognized since the beginning of generative phonology that some language-specific phonetic implementation is actually context-dependent, and thus it can be said that there are gradient "phonetic processes" in grammar in addition to categorical "phonological processes." However, no explicit theory has been developed to characterize these processes. Meanwhile, (2) it is understood that language acquisition and perception are both really informed guesswork: the result of both types of inference can be reasonably thought to be a less-than-perfect commitment, with multiple candidate grammars or parses considered and each associated with some degree of credence. Previous research has used probability theory to formalize these inferences in implemented computational models, especially in phonetics and phonology. In this role, computational models serve to demonstrate the existence of working learning/perception/parsing systems assuming a faithful implementation of one particular theory of human language, and are not intended to adjudicate whether that theory is correct. The current thesis (1) develops a theory of the phonetic component of grammar and how it relates to the greater phonological system and (2) uses a formal Bayesian treatment of learning to evaluate this theory of the phonological architecture and to make predictions about how the resulting grammars will be organized. The coarse description of the consequence for linguistic theory is that the processes we think of as "allophonic" are actually language-specific, gradient phonetic processes, assigned to the phonetic component of grammar; strict allophones have no representation in the output of the categorical phonological grammar.
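
    As a toy illustration of the "informed guesswork" picture, in which credence is distributed over candidate grammars rather than committed to a single one, here is a minimal Bayesian update in NumPy; the candidate grammars, likelihood functions, and observations are invented for the example, not drawn from the thesis.

```python
# Minimal sketch of acquisition as probabilistic inference: credence is
# spread over candidate grammars and updated as evidence arrives.
# All quantities are toy assumptions; illustrative only.
import numpy as np

# Two candidate analyses of an alternation: a categorical phonological
# rule vs. a gradient, context-dependent phonetic process.
grammars = ["categorical rule", "gradient phonetic process"]
posterior = np.array([0.5, 0.5])  # uniform prior credence

def likelihood(observation):
    """Toy likelihood of an observed continuous value (e.g., a duration in
    ms) under each grammar: the categorical grammar predicts a tight mode,
    the gradient grammar a broad context-dependent distribution."""
    categorical = np.exp(-0.5 * ((observation - 15.0) / 5.0) ** 2) / 5.0
    gradient = np.exp(-0.5 * ((observation - 30.0) / 20.0) ** 2) / 20.0
    return np.array([categorical, gradient])

for obs in (22.0, 35.0, 41.0):        # incoming acoustic evidence
    posterior = posterior * likelihood(obs)
    posterior = posterior / posterior.sum()  # renormalize after each datum
    print(dict(zip(grammars, np.round(posterior, 3))))
```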