SyMGiza++: Symmetrized Word Alignment Models for Statistical Machine Translation

Marcin Junczys-Dowmunt, Arkadiusz Szał

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

We present SyMGiza++, a tool that computes symmetric word alignment models and can take advantage of multi-processor systems. A series of fairly simple modifications to the original IBM/Giza++ word alignment models allows the symmetrized models to be updated between chosen iterations of the original training algorithms. We achieve a relative alignment quality improvement of more than 17% compared to Giza++ and MGiza++ on the standard Canadian Hansards task, while maintaining the speed improvements provided by the parallel computation capability of MGiza++.
Furthermore, the alignment models are evaluated in the context of phrase-based statistical machine translation, where a consistent improvement measured in BLEU scores can be observed when SyMGiza++ is used instead of Giza++ or MGiza++.
Original language: English
Title of host publication: Security and Intelligent Information Systems
Subtitle of host publication: International Joint Conferences, SIIS 2011, Warsaw, Poland, June 13-14, 2011, Revised Selected Papers
Editors: Pascal Bouvry, Mieczysław A. Kłopotek, Franck Leprévost, Małgorzata Marciniak, Agnieszka Mykowiecka, Henryk Rybiński
Place of Publication: Berlin, Heidelberg
Publisher: Springer Berlin Heidelberg
Pages: 379-390
Number of pages: 12
ISBN (Electronic): 978-3-642-25261-7
ISBN (Print): 978-3-642-25260-0
Publication status: Published - 2012

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Berlin Heidelberg
Volume: 7053
ISSN (Print): 0302-9743
