Mixture-Modeling with Unsupervised Clusters for Domain Adaptation in Statistical Machine Translation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In Statistical Machine Translation, in-domain and out-of-domain training data are not always clearly delineated. This paper investigates how mixture-modeling techniques can still be used for domain adaptation in such cases. We apply unsupervised clustering methods to split the original training set, and then use mixture-modeling techniques to build a model adapted to a given target domain. We show that this approach improves performance over an unadapted baseline and over several alternative domain-adaptation methods.
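The cluster-then-interpolate idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's actual method: the toy k-means routine, the bag-of-words representation, the tiny two-domain corpus, and the similarity-based interpolation weights are all assumptions introduced for this example.

```python
import math
import random
from collections import Counter

def vectorize(sentence):
    """Bag-of-words count vector for one sentence."""
    return Counter(sentence.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vectors, k, iters=10, seed=0):
    """Toy k-means under cosine similarity; returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = [Counter(v) for v in rng.sample(vectors, k)]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each sentence goes to its most similar centroid.
        assign = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Update step: a centroid is the summed counts of its member vectors.
        for c in range(k):
            merged = Counter()
            for v, a in zip(vectors, assign):
                if a == c:
                    merged.update(v)
            if merged:
                centroids[c] = merged
    return assign, centroids

# Hypothetical mixed-domain training corpus (no domain labels) plus a small
# target-domain sample used only to set the mixture weights.
train = ["the patient received a dose of medicine",
         "clinical trials show the drug is effective",
         "the stock market fell sharply today",
         "investors sold shares amid market fears",
         "the medicine reduced symptoms in patients",
         "shares of the company rose after earnings"]
target = ["the drug improved patient outcomes"]

vecs = [vectorize(s) for s in train]
assign, centroids = kmeans(vecs, k=2)

# Mixture weights: each unsupervised cluster is weighted by its similarity to
# the target-domain sample, normalised to sum to one (a simple linear
# interpolation scheme). A cluster-specific translation model would then be
# trained per cluster and combined with these weights.
tvec = vectorize(" ".join(target))
sims = [cosine(c, tvec) for c in centroids]
total = sum(sims) or 1.0
weights = [s / total for s in sims]
```

In a full system the per-cluster models would be phrase tables or language models estimated on each cluster's sentences, interpolated with these weights; here the weights alone stand in for that final step.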

Original language: English
Title of host publication: Proceedings of the 16th EAMT Conference
Pages: 185-192
Number of pages: 8
Publication status: Published - May 2012
Event: 16th Annual Conference of the European Association for Machine Translation (EAMT) - Fondazione Bruno Kessler (FBK) Center for Scientific and Technological Research, Trento, Italy
Duration: 28 May 2012 - 30 May 2012

Conference

Conference: 16th Annual Conference of the European Association for Machine Translation (EAMT)
Country/Territory: Italy
City: Trento
Period: 28/05/12 - 30/05/12
