Discriminative Interlingual Representations
dc.contributor.advisor | Jagarlamudi, Jagadeesh | en_US |
dc.contributor.author | Jagarlamudi, Jagadeesh | en_US |
dc.contributor.department | Computer Science | en_US |
dc.contributor.publisher | Digital Repository at the University of Maryland | en_US |
dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US |
dc.date.accessioned | 2013-06-28T06:45:56Z | |
dc.date.available | 2013-06-28T06:45:56Z | |
dc.date.issued | 2013 | en_US |
dc.description.abstract | The language barrier in many multilingual natural language processing (NLP) tasks can be overcome by mapping objects from different languages (“views”) into a common low-dimensional subspace. For example, the name transliteration task involves mapping bilingual names, and word translation mining involves mapping bilingual words, into a common low-dimensional subspace. Multi-view models learn such a low-dimensional subspace from a training corpus of paired objects, e.g., names written in different languages, represented as feature vectors. The central idea of my dissertation is to learn low-dimensional subspaces (or interlingual representations) that are effective for various multilingual and monolingual NLP tasks. First, I demonstrate the effectiveness of interlingual representations in mining bilingual word translations, and then proceed to develop models for diverse situations that often arise in NLP tasks. In particular, I design models for the following problem settings: 1) when there are more than two views but we only have training data from a single pivot view into each of the remaining views, 2) when an object from one view is associated with a ranked list of objects from another view, and finally 3) when the underlying objects have rich structure, such as a tree. These problem settings arise often in real-world applications. I choose a canonical task for each setting and compare my model with existing state-of-the-art baseline systems. I provide empirical evidence for the first two models on multilingual name transliteration and reranking for part-of-speech tagging, respectively. For the third problem setting, I experiment with the task of re-scoring target-language word translations based on the source word's context. The model proposed for this problem builds on the ideas of the previous models and hence leads to a natural conclusion. | en_US |
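The abstract's core mechanism, projecting paired feature vectors from two languages into a common low-dimensional subspace, corresponds to the Canonical Correlation Analysis keyword listed below. The following is a minimal, hedged sketch using scikit-learn's CCA on synthetic data; the dimensions, variable names, and cosine-similarity scoring are illustrative assumptions, not the dissertation's discriminative models.

```python
# Hedged sketch: learning a shared low-dimensional ("interlingual") subspace
# from paired bilingual feature vectors with Canonical Correlation Analysis.
# All data here is synthetic and purely illustrative.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.RandomState(0)

# Paired training objects, e.g., names written in two languages ("views"),
# each represented as a feature vector (e.g., character n-gram counts).
n_pairs, dim_src, dim_tgt, k = 200, 50, 40, 10
latent = rng.randn(n_pairs, k)                  # shared signal tying the views
X_src = latent @ rng.randn(k, dim_src) + 0.1 * rng.randn(n_pairs, dim_src)
X_tgt = latent @ rng.randn(k, dim_tgt) + 0.1 * rng.randn(n_pairs, dim_tgt)

# Fit CCA: find per-view projections that are maximally correlated,
# yielding a common k-dimensional subspace for both languages.
cca = CCA(n_components=k)
cca.fit(X_src, X_tgt)

# Map objects from both views into the shared subspace and score candidate
# translations/transliterations by cosine similarity in that space.
Z_src, Z_tgt = cca.transform(X_src, X_tgt)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print("paired similarity:  ", cosine(Z_src[0], Z_tgt[0]))  # should be high
print("unpaired similarity:", cosine(Z_src[0], Z_tgt[1]))  # should be lower
```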
dc.identifier.uri | http://hdl.handle.net/1903/14106 | |
dc.subject.pqcontrolled | Computer science | en_US |
dc.subject.pqcontrolled | Linguistics | en_US |
dc.subject.pquncontrolled | Canonical Correlation Analysis | en_US |
dc.subject.pquncontrolled | Dimensionality Reduction | en_US |
dc.subject.pquncontrolled | Interlingual Representations | en_US |
dc.subject.pquncontrolled | Machine Learning | en_US |
dc.subject.pquncontrolled | Machine Translation | en_US |
dc.subject.pquncontrolled | Natural Language Processing | en_US |
dc.title | Discriminative Interlingual Representations | en_US |
dc.type | Dissertation | en_US |
Files
Original bundle
- Name: Jagarlamudi_umd_0117E_14207.pdf
- Size: 1.65 MB
- Format: Adobe Portable Document Format