Computational modeling of the role of discourse information in language production and language acquisition

This dissertation explores the role of discourse information in language production and language acquisition. Discourse information plays an important role in many aspects of linguistic processing and learning. However, characterizing what it is and how it is used has been challenging. Previous studies on discourse tend to focus on correlations between certain discourse factors and the behavior of speakers and comprehenders, rather than on how discourse information is used within the language system and why. This dissertation aims to provide novel insights into the role of discourse information by formalizing how it is represented and how it is used. First, I formalize the latent semantic information in humans' discourse representations by examining speakers' choices of referring expressions. Simulation results suggest that topic models can capture aspects of discourse representations that are relevant to the choice of referring expressions, beyond simple referent frequency. Second, I propose a language production model that extends the rational speech act model of Frank and Goodman (2012) to incorporate updates to listeners' beliefs as discourse proceeds. Simulations suggest that speakers' behavior can be modeled in a principled way by considering the probabilities of referents in the discourse and the information conveyed by each word. Third, I examine the role of discourse information in language acquisition, focusing on the learning of grammatical categories of pronouns. I show that a Bayesian model with prior discourse knowledge can accurately recover the grammatical categories of pronouns, whereas strong syntactic prior knowledge alone is not sufficient. This suggests that discourse information can help learners acquire grammatical categories of pronouns. Throughout this dissertation, I propose frameworks for modeling speakers and learners using techniques from Bayesian modeling. These models provide ways to flexibly investigate the effects of various sources of information, including discourse salience, expectations about referents, and grammatical knowledge.
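
As a rough illustration of the kind of computation a rational-speech-act-style speaker performs, the sketch below scores each word by how much it helps a literal listener recover the intended referent, given a prior over referents that reflects the discourse so far. It is not the model developed in the dissertation: the toy lexicon, the cost values, the `alpha` parameter, and all function names are hypothetical and chosen only to make the idea concrete.

```python
import math

# Hypothetical toy lexicon: the referents each word can literally pick out.
LEXICON = {
    "she": {"alice", "carol"},
    "he": {"bob"},
    "alice": {"alice"},
    "carol": {"carol"},
    "bob": {"bob"},
}


def literal_listener(word, prior):
    """P(referent | word): the discourse prior restricted to the referents
    the word is literally true of, then renormalized."""
    scores = {r: p for r, p in prior.items() if r in LEXICON[word]}
    total = sum(scores.values())
    if total == 0:
        return {}
    return {r: s / total for r, s in scores.items()}


def speaker(referent, prior, cost, alpha=1.0):
    """P(word | referent), proportional to
    exp(alpha * (log P_listener(referent | word) - cost(word)))."""
    weights = {}
    for word in LEXICON:
        p = literal_listener(word, prior).get(referent, 0.0)
        if p > 0:
            weights[word] = math.exp(alpha * (math.log(p) - cost.get(word, 0.0)))
    z = sum(weights.values())
    return {w: v / z for w, v in weights.items()}


# Assumed costs: names are more costly to produce than pronouns.
COST = {"she": 0.0, "he": 0.0, "alice": 1.0, "carol": 1.0, "bob": 1.0}

# Two assumed discourse states: Alice highly salient vs. Alice barely salient.
salient = {"alice": 0.8, "carol": 0.1, "bob": 0.1}
not_salient = {"alice": 0.1, "carol": 0.8, "bob": 0.1}

p_salient = speaker("alice", salient, COST)
p_not_salient = speaker("alice", not_salient, COST)
```

Under these assumptions, the pronoun receives more probability when the intended referent is already likely in the discourse, and the name takes over when the referent is less expected. Re-running `speaker` with a new prior after each utterance is one simple way to mimic listeners' beliefs changing as the discourse proceeds.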