Abstract
SyMGiza++, a tool that computes symmetric word alignment models and can take advantage of multi-processor systems, is presented. A series of fairly simple modifications to the original IBM/Giza++ word alignment models allows the symmetrized models to be updated between chosen iterations of the original training algorithms. We achieve a relative alignment quality improvement of more than 17% compared to Giza++ and MGiza++ on the standard Canadian Hansards task, while maintaining the speed improvements provided by the parallel computation capability of MGiza++.
Furthermore, the alignment models are evaluated in the context of phrase-based statistical machine translation, where a consistent improvement measured in BLEU scores can be observed when SyMGiza++ is used instead of Giza++ or MGiza++.
Original language | English |
---|---|
Title of host publication | Security and Intelligent Information Systems |
Subtitle of host publication | International Joint Conferences, SIIS 2011, Warsaw, Poland, June 13-14, 2011, Revised Selected Papers |
Editors | Pascal Bouvry, Mieczysław A. Kłopotek, Franck Leprévost, Małgorzata Marciniak, Agnieszka Mykowiecka, Henryk Rybiński |
Place of Publication | Berlin, Heidelberg |
Publisher | Springer |
Pages | 379-390 |
Number of pages | 12 |
ISBN (Electronic) | 978-3-642-25261-7 |
ISBN (Print) | 978-3-642-25260-0 |
DOIs | |
Publication status | Published - 2012 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer Berlin Heidelberg |
Volume | 7053 |
ISSN (Print) | 0302-9743 |