This paper describes the University of Edinburgh’s (UEDIN) phrase-based submissions to the translation and medical translation shared tasks of the 2014 Workshop on Statistical Machine Translation (WMT). We participated in all language pairs. We have improved upon our 2013 system by i) using generalized representations, specifically automatic word clusters, for translations out of English, ii) using unsupervised character-based models to translate unknown words in the Russian-English and Hindi-English pairs, iii) synthesizing Hindi data from closely related Urdu data, and iv) building huge language models on the Common Crawl corpus.
|Title of host publication|Proceedings of the Ninth Workshop on Statistical Machine Translation|
|Place of Publication|Baltimore, Maryland, USA|
|Publisher|Association for Computational Linguistics|
|Number of pages|8|
|Publication status|Published - 1 Jun 2014|