Abstract
State-of-the-art statistical machine translation is based on alignments between phrases – sequences of words in the source and target sentences. The learning step in these systems often relies on alignments between words. It is often assumed that the quality of this word alignment is critical for translation. However, recent results suggest that the relationship between alignment quality and translation quality is weaker than previously thought. We investigate this question directly, comparing the impact of high-quality alignments with a carefully constructed set of degraded alignments. In order to tease apart various interactions, we report experiments investigating the impact of alignments on different aspects of the system. Our results confirm a weak correlation, but they also illustrate that more data and better feature engineering may be more beneficial than better alignment.
Original language | English |
---|---|
Title of host publication | Proceedings of AMTA 2006 |
Publication status | Published - 2006 |