Discriminative Interlingual Representations

dc.contributor.advisor: Jagarlamudi, Jagadeesh [en_US]
dc.contributor.author: Jagarlamudi, Jagadeesh [en_US]
dc.contributor.department: Computer Science [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2013-06-28T06:45:56Z
dc.date.available: 2013-06-28T06:45:56Z
dc.date.issued: 2013 [en_US]
dc.description.abstract: The language barrier in many multilingual natural language processing (NLP) tasks can be overcome by mapping objects from different languages (“views”) into a common low-dimensional subspace. For example, name transliteration involves mapping bilingual names, and word translation mining involves mapping bilingual words, into a common low-dimensional subspace. Multi-view models learn such a subspace from a training corpus of paired objects, e.g., names written in different languages, represented as feature vectors. The central idea of my dissertation is to learn low-dimensional subspaces (or interlingual representations) that are effective for various multilingual and monolingual NLP tasks. First, I demonstrate the effectiveness of interlingual representations in mining bilingual word translations, and then proceed to develop models for diverse situations that often arise in NLP tasks. In particular, I design models for the following problem settings: 1) when there are more than two views but training data is available only from a single pivot view into each of the remaining views; 2) when an object from one view is associated with a ranked list of objects from another view; and 3) when the underlying objects have rich structure, such as a tree. These problem settings arise often in real-world applications. I choose a canonical task for each setting and compare my model with existing state-of-the-art baseline systems. I provide empirical evidence for the first two models on multilingual name transliteration and on reranking for part-of-speech tagging, respectively. For the third problem setting, I experiment with the task of re-scoring target-language word translations based on the source word's context. The model proposed for this problem builds on the ideas proposed in the previous models and, hence, leads to a natural conclusion. [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/14106
dc.subject.pqcontrolled: Computer science [en_US]
dc.subject.pqcontrolled: Linguistics [en_US]
dc.subject.pquncontrolled: Canonical Correlation Analysis [en_US]
dc.subject.pquncontrolled: Dimensionality Reduction [en_US]
dc.subject.pquncontrolled: Interlingual Representations [en_US]
dc.subject.pquncontrolled: Machine Learning [en_US]
dc.subject.pquncontrolled: Machine Translation [en_US]
dc.subject.pquncontrolled: Natural Language Processing [en_US]
dc.title: Discriminative Interlingual Representations [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle

Name: Jagarlamudi_umd_0117E_14207.pdf
Size: 1.65 MB
Format: Adobe Portable Document Format