Abstract / Description of output
Current phrase-based SMT systems perform poorly when using small training sets. This is a consequence of unreliable translation estimates and low coverage over source and target phrases. This paper presents a method which alleviates this problem by exploiting multiple translations of the same source phrase. Central to our approach is triangulation, the process of translating from a source to a target language via an intermediate third language. This allows the use of a much wider range of parallel corpora for training, and can be combined with a standard phrase-table using conventional smoothing methods. Experimental results demonstrate BLEU improvements for triangulated models over a standard phrase-based system.
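As a rough illustration of the triangulation idea described above, the sketch below builds a source-to-target phrase table by summing over intermediate-language phrases, p(t|s) ≈ Σ_i p(t|i) · p(i|s), and then mixes it with a directly estimated table by linear interpolation. This is a minimal sketch under stated assumptions, not the paper's implementation. The conditional-independence simplification, the choice of linear interpolation as the smoothing method, the function names `triangulate` and `interpolate`, and the weight `lam` are all assumptions made for illustration.

```python
from collections import defaultdict


def triangulate(src_to_piv, piv_to_tgt):
    """Build p(t|s) by marginalising over intermediate phrases i.

    Both arguments are nested dicts: {phrase: {translation: probability}}.
    Assumes t is independent of s given i (a simplifying assumption).
    """
    table = {}
    for s, pivots in src_to_piv.items():
        scores = defaultdict(float)
        for i, p_i_given_s in pivots.items():
            # Intermediate phrases with no target-side entry contribute nothing.
            for t, p_t_given_i in piv_to_tgt.get(i, {}).items():
                scores[t] += p_t_given_i * p_i_given_s
        if scores:
            table[s] = dict(scores)
    return table


def interpolate(direct, triangulated, lam=0.5):
    """Linearly interpolate two phrase tables (hypothetical weight lam)."""
    combined = {}
    for s in set(direct) | set(triangulated):
        d = direct.get(s, {})
        tr = triangulated.get(s, {})
        combined[s] = {
            t: lam * d.get(t, 0.0) + (1.0 - lam) * tr.get(t, 0.0)
            for t in set(d) | set(tr)
        }
    return combined


# Toy probabilities invented for illustration (French -> German -> English).
fr_de = {"maison": {"Haus": 0.8, "Heim": 0.2}}
de_en = {"Haus": {"house": 0.9, "home": 0.1}, "Heim": {"home": 1.0}}
fr_en_direct = {"maison": {"house": 1.0}}

fr_en_tri = triangulate(fr_de, de_en)
fr_en = interpolate(fr_en_direct, fr_en_tri, lam=0.5)
```

In this toy case the triangulated table supplies a translation ("home") that the small direct table never observed, which is the coverage benefit the abstract refers to. Interpolation keeps the direct estimate while adding probability mass for the new option.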
Original language | English |
---|---|
Title of host publication | ACL 2007, Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, June 23-30, 2007, Prague, Czech Republic |
Pages | 728-735 |
Number of pages | 8 |
Publication status | Published - 2007 |
Title | Machine Translation by Triangulation: Making Effective Use of Multi-Parallel Corpora |