Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
5 results
Search Results
Item Are you asking me or telling me? Learning clause types and speech acts in English and Mandarin (2022)
Yang, Yu'an; Hacquard, Valentine; Lidz, Jeffrey; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Languages tend to have three major clause types (declaratives, interrogatives, imperatives), dedicated to three main speech acts (assertions, questions, commands). However, the particular forms that these clause types take differ from language to language, and have to be learned. Previous experimental results suggest that by 18 months old, children differentiate these clause types and associate them with their canonical speech act. This dissertation investigates how children learn to identify different clause types and speech acts. To learn clause types, children need to identify the right categories of clauses (the "clustering problem") and figure out what speech act they are canonically used for (the "labeling problem"). I investigate the extent to which learners need to rely on pragmatic information (i.e., knowing what speech act a given utterance of a sentence is conveying), to solve not just labeling, but the clustering itself. I examine the role of pragmatics computationally by building two Bayesian clustering models. I find that morpho-syntactic and prosodic information are not enough for identifying the right clause type clustering, and that pragmatics is necessary. I applied the same model to a morphologically impoverished language, Mandarin, and found that the model without pragmatics performs even worse. Speech act information is crucial for finding the right categories for both languages. Additionally, I find that a little pragmatics goes a long way. I simulate the learning process with noisy speech act information, and find that even when speech act information is noisy, the model homes in on the right clause type categories, whereas the model without it fails.
But if speech act information is useful for clause type learning, how do children figure out speech act information? I explore what kind of non-clause type cues for speech act information are present in the input. Even if children must rely on clause type information to figure out speech acts, they could have access to additional information that is unrelated to clause typing, but informative for recognizing speech act type. When speakers perform speech acts, because of the conventional functions of these speech acts on the discourse, the performance might be associated with certain socio-pragmatic features. For example, because of questions' response-elicitation function, we might expect speakers to pause longer after questions. If children are equipped with some expectations about the functions of communication, and about what questions do, they might be able to use these socio-pragmatic cues to figure out speech act. I explore two cues that could potentially differentiate questions from other speech acts: pauses, and direct eye gaze. I find that parents tend to pause longer after questions, and attend to the child more when asking questions. Therefore, it is in principle plausible that there are some socio-pragmatic features that children can use, in addition to their growing knowledge of clause types, to infer the speech act category of an utterance. This little bit of information about speech act could then be used to provide the information that the child needs in order to get the clause type clusters identified accurately.
Item How Grammars Grow: Argument Structure and the Acquisition of Non-Basic Syntax (2019)
Perkins, Laurel; Lidz, Jeffrey; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation examines the acquisition of argument structure as a window into the role of development in grammar learning.
The way that children represent the data for language acquisition depends on the grammatical knowledge they have at any given point in development. Children use their immature grammatical knowledge, together with other non-linguistic conceptual, pragmatic, and cognitive abilities, to parse and interpret their input. But until children have fully acquired the target grammar, these input representations will be incomplete and potentially inaccurate. Our learning theory must take into account how learning can operate over input representations that change over the course of development. What allows learners to acquire new knowledge from partial and noisy representations of their data, one step at a time, and still converge on the right grammar? The case study in this dissertation points towards one way to characterize the role of development in grammar acquisition by probing more deeply into the resources that learners bring to their learning task. I consider two types of resources. The first is representational: learners need resources for representing their input in useful ways, even early in development. In two behavioral studies, I ask what resources infants in their second year of life use to represent their input for argument structure acquisition. I show that English learners differentiate the grammatical and thematic relations of clause arguments, and that they recognize local argument relations before they recognize non-local predicate-argument dependencies. The second type of resource includes mechanisms for learning from input representations even when they are incomplete or inaccurate early in development. In two computational experiments, I investigate how learners could in principle use a combination of domain-specific linguistic knowledge and domain-general cognitive abilities in order to draw accurate inferences about verb argument structure from messy data, and to identify the forms that argument movement can take in their language. 
By investigating some of the earliest steps of syntax acquisition in infancy, this work aims to provide a fuller picture of what portion of the input is useful to an individual child at any single point in development, how the child perceives that portion of the input given her current grammatical knowledge, and what internal mechanisms enable the child to generalize beyond her input in inferring the grammar of her language. This work has implications not only for theories of language learning, but also for learning in general, by offering a new perspective on the use of data in the acquisition of knowledge.
Item An Affiliative Model of Early Lexical Learning (2019)
Tripp, Alayo; Feldman, Naomi; Idsardi, William; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In defining the language acquisition problem, traditional models abstract away effects of variability, defining the learner as acquiring a single language variety, which is spoken homogeneously by their speech community. However, infants are exposed to as many unique varieties of speech as they are speakers. Adult sociolinguistic competence is also characterized by the capacity to employ and interpret non-phonological linguistic distinctions which are associated with different social groups, including ‘code-switching’ or ‘style-shifting’ between languages and speech registers. This dissertation presents a model of infant lexical acquisition which assumes that learners monitor linguistic sources for variation in reliability. This model is adapted from Shafto, Eaves, Navarro, and Perfors (2012), which the authors used to describe the behavior of preschool children in selecting sources to learn labels from in K. Corriveau and Harris (2009) and M. Corriveau and Harris (2009).
I show that this probabilistic model effectively simulates two experiments from the literature on preverbal infants’ perception of labeling, Rost and McMurray (2009) and Koenig and Echols (2003). Evidence suggests that the receptiveness of preverbal infants to novel lexical items is correlated with infant beliefs regarding the informant’s knowledgeability and social group membership. These simulations demonstrate that language learners may well be recruiting processes of epistemic trust to guide lexical acquisition much earlier than previously suggested. We should therefore expect even very young listeners to respond differently to dialects not solely as a function of exposure, but also as a function of attitudes towards the speech determined by the quality of that exposure. Developmental differences between populations in attention to non-linguistic affiliative cues are therefore expected to emerge early and have significant effects on language outcomes. Measures of online language proficiency may be vulnerable to significant bias owing to the activation of sociolinguistic biases in the presentation of test items. Differences in the breadth or specificity of listener preferences for speakers in turn predict differences in task complexity for learners of standard and non-standard dialects. A new research program in early sociophonetic perception, uniting accounts of selective trust with language learning, has the potential to deepen understanding of both typical and disordered language development.
Item Computational modeling of the role of discourse information in language production and language acquisition (2015)
Orita, Naho; Feldman, Naomi H; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation explores the role of discourse information in language production and language acquisition. Discourse information plays an important role in various aspects of linguistic processes and learning.
However, characterizing what it is and how it is used has been challenging. Previous studies on discourse tend to focus on the correlations between certain discourse factors and speaker/comprehender's behavior, rather than looking at how the discourse information is used in the system of language and why. This dissertation aims to provide novel insights into the role of discourse information by formalizing how it is represented and how it is used. First, I formalize the latent semantic information in humans' discourse representations by examining speakers' choices of referring expressions. Simulation results suggest that topic models can capture aspects of discourse representations that are relevant to the choices of referring expressions, beyond simple referent frequency. Second, I propose a language production model that extends the rational speech act model from Frank and Goodman (2012) to incorporate updates to listeners' beliefs as discourse proceeds. Simulations suggest that speakers' behavior can be modeled in a principled way by considering the probabilities of referents in the discourse and the information conveyed by each word. Third, I examine the role of discourse information in language acquisition, focusing on the learning of grammatical categories of pronouns. I show that a Bayesian model with prior discourse knowledge can accurately recover grammatical categories of pronouns, but simply having strong syntactic prior knowledge is not sufficient. This suggests that discourse information can help learners acquire grammatical categories of pronouns. Throughout this dissertation, I propose frameworks for modeling speakers and learners using techniques from Bayesian modeling.
These models provide ways to flexibly investigate the effects of various sources of information, including discourse salience, expectations about referents and grammatical knowledge.
Item BEYOND STATISTICAL LEARNING IN THE ACQUISITION OF PHRASE STRUCTURE (2009)
Takahashi, Eri; Lidz, Jeffrey; Linguistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The notion that children use statistical distributions present in the input to acquire various aspects of linguistic knowledge has received considerable recent attention. But the role of the learner's initial state has been largely ignored in those studies. What remains unclear is the nature of the learner's contribution. At least two possibilities exist. One is that all that learners do is to collect and compile accurately predictive statistics from the data, and they do not have an antecedently specified set of possible structures (Elman et al. 1996; Tomasello 2000). On this view, the outcome of learning is solely based on the observed input distributions. A second possibility is that learners use statistics to identify particular abstract syntactic representations (Miller & Chomsky 1963; Pinker 1984; Yang 2006). On this view, children have predetermined linguistic knowledge of possible structures, and the acquired representations have deductive consequences beyond what can be derived from the observed statistical distributions alone. This dissertation examines how the environment interacts with the structure of the learner, and proposes a link between the distributional approach and the nativist approach to language acquisition. To investigate this more general question, we focus on how infants, adults and neural networks acquire the phrase structure of their target language. This dissertation presents seven experiments, which show that adults and infants can project their generalizations to novel structures, while the Simple Recurrent Network fails.
Moreover, it will be shown that learners' generalizations go beyond the stimuli, but those generalizations are constrained in the same ways that natural languages are constrained. This is compatible with the view that statistical learning interacts with an inherent representational system, but incompatible with the view that statistical learning is the sole mechanism by which the existence of phrase structure is discovered. This provides novel evidence that statistical learning interacts with innate constraints on possible representations, and that learners have a deductive power that goes beyond the input data. This suggests that statistical learning is used merely as a method for mapping the surface string to an abstract representation, while innate knowledge specifies the range of possible grammars and structures.