Handling Translation Divergences in Generation-Heavy Hybrid Machine Translation

Files

CS-TR-4341.ps (191.5 KB)
No. of downloads: 226
CS-TR-4341.pdf (205.78 KB)
No. of downloads: 679

Date

2002-04-04

Abstract

This paper describes a novel approach for handling translation divergences in a Generation-Heavy Hybrid Machine Translation (GHMT) system. The approach depends on the existence of rich target-language resources such as word lexical semantics, including information about categorial variations and subcategorization frames. These resources are used to generate multiple structural variations from a target-glossed lexico-syntactic representation of the source-language sentence. The multiple structural variations account for different translation divergences. The overgeneration of the approach is constrained by a target-language model using corpus-based statistics. The exploitation of target-language resources (symbolic and statistical) to handle a problem usually reserved for Transfer and Interlingual MT is useful for translation from structurally divergent source languages with scarce linguistic resources. A preliminary evaluation of the application of this approach to Spanish-English MT shows the approach to be extremely promising. The approach, however, is not limited to MT, as it can be extended to monolingual NLG applications such as summarization.

Also UMIACS-TR-2002-23. Also LAMP-TR-083.
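
The overgenerate-then-rank idea in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the toy corpus, the hand-written candidate list, the add-one-smoothed bigram model, and all function names below are assumptions made for illustration. In the approach described above, the candidates would be produced from target-language lexical resources rather than listed by hand.

```python
import math
from collections import Counter

# Toy target-language corpus (invented for illustration only).
CORPUS = [
    "i am hungry",
    "she is hungry",
    "i am tired",
    "i have a book",
    "they are hungry",
]


def train_bigram_counts(sentences):
    """Collect context and bigram counts, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])          # every token that starts a bigram
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, contexts, bigrams, vocab_size):
    """Add-one-smoothed bigram log-probability (a crude stand-in for a real LM)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def rank(candidates, model):
    """Order overgenerated candidates from most to least probable."""
    return sorted(candidates, key=lambda c: log_prob(c, *model), reverse=True)


# Hypothetical structural variants for Spanish "tengo hambre"
# (literally "I have hunger"), a categorial divergence.
candidates = ["i have hunger", "i am hungry", "i hunger"]

model = train_bigram_counts(CORPUS)
best = rank(candidates, model)[0]
print(best)
```

On this toy data the statistical model prefers the variant that matches target-language usage over the literal gloss, which is the role the abstract assigns to the corpus-based language model.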