Improved Online Learning and Modeling for Feature-Rich Discriminative Machine Translation

dc.contributor.advisor: Resnik, Philip [en_US]
dc.contributor.author: Eidelman, Vladimir [en_US]
dc.description.abstract: Most modern statistical machine translation (SMT) systems learn how to translate by constructing a discriminative model based on statistics from the data. A growing number of methods for discriminative training have been proposed, but most suffer from limitations hindering their utility for training feature-rich models on large amounts of data. In this thesis, we present novel models and learning algorithms that address this issue by tackling three core problems for discriminative training: what to optimize, how to optimize, and how to represent the input. In addressing these issues, we develop fast learning algorithms that are both suitable for large-scale training and capable of generalization in high-dimensional feature spaces. The algorithms are developed in an online margin-based framework. While these methods are firmly established in machine learning, their adaptation to SMT is not straightforward. Thus, the first problem we address is what to optimize when learning for SMT. We define a family of objective functions for large-margin learning with loss-augmented inference over latent variables, and investigate their optimization performance in standard and high-dimensional feature spaces. After establishing what to optimize, the second problem we focus on is improving learning in the feature-rich space. We develop an online gradient-based algorithm that improves upon large-margin learning by considering and bounding the spread of the data while maximizing the margin. Utilizing the learning regimes developed thus far, we are able to focus on the third problem and introduce new features targeting generalization to new domains. We employ topic models to perform unsupervised domain induction, and introduce adaptation features based on probabilistic domain membership. As a final question, we look at how to take advantage of the latent derivation structure.
In current models of SMT, an exponential number of derivations can produce the same translation. The standard practice is to sidestep this ambiguity. In the final part of the thesis, we define a framework for latent variable models that explicitly takes advantage of all derivations in both learning and inference. We present a novel loss function for large-margin learning in that setting and develop a suitable optimization algorithm. [en_US]
dc.title: Improved Online Learning and Modeling for Feature-Rich Discriminative Machine Translation [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.contributor.department: Computer Science [en_US]
dc.subject.pqcontrolled: Computer science [en_US]
dc.subject.pquncontrolled: Machine Learning [en_US]
dc.subject.pquncontrolled: Machine Translation [en_US]
dc.subject.pquncontrolled: Natural Language Processing [en_US]
dc.subject.pquncontrolled: Online Learning [en_US]
dc.subject.pquncontrolled: Structured Prediction [en_US]
