Handling Translation Divergences in Generation-Heavy Hybrid Machine Translation
Date
2002-04-04
Authors
Habash, Nizar
Dorr, Bonnie
Abstract
This paper describes a novel approach for handling translation divergences
in a Generation-Heavy Hybrid Machine Translation (GHMT) system. The
approach depends on the existence of rich target language resources such
as word lexical semantics, including information about categorial
variations and subcategorization frames. These resources are used to
generate multiple structural variations from a target-glossed
lexico-syntactic representation of the source language sentence. The
multiple structural variations account for different translation
divergences. The resulting overgeneration is constrained by a
target-language model using corpus-based statistics. The exploitation of
target language resources (symbolic and statistical) to handle a problem
usually reserved for Transfer and Interlingual MT is useful for translation
from structurally divergent source languages with scarce linguistic
resources. A preliminary evaluation of this approach on Spanish-English MT
suggests that it is extremely promising. The approach, however, is not
limited to MT; it can be extended to monolingual NLG applications such as
summarization.
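
The overgenerate-and-rank idea in the abstract can be illustrated with a minimal sketch. It is not the GHMT system itself. The candidate list is written by hand, whereas the paper derives its variations from target-language lexical resources. The Spanish example ("tengo hambre"), the toy corpus, and the add-one-smoothed bigram model are all assumptions chosen for illustration, as are the names `train_bigrams`, `log_prob`, and `rank`.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for target-language (English) statistics.
CORPUS = [
    "i am hungry",
    "she is hungry",
    "i am tired",
    "i have a book",
    "they have a car",
]

def train_bigrams(sentences):
    """Count bigrams and their left contexts over padded sentences."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)

def log_prob(sentence, model):
    """Add-one-smoothed bigram log-probability of a sentence."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )

def rank(candidates, model):
    """Sort candidates by per-transition log-probability, best first."""
    def score(candidate):
        transitions = len(candidate.split()) + 1
        return log_prob(candidate, model) / transitions
    return sorted(candidates, key=score, reverse=True)

# Hand-written structural variations for the gloss of "tengo hambre"
# (literally "I have hunger"); assumed here, not generated from a lexicon.
CANDIDATES = ["i have hunger", "i am hungry", "i hunger"]

ranked = rank(CANDIDATES, train_bigrams(CORPUS))
best = ranked[0]
print(best)
```

In this toy setting the language model prefers the categorial variation over the literal gloss, which is the role the abstract assigns to the corpus-based statistics. A real system would need a far larger corpus and a principled generator of variations.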
Also UMIACS-TR-2002-23
Also LAMP-TR-083