This paper investigates discriminative training of a phrase-based SMT system with millions of features using two popular online learning algorithms: the structured perceptron and the Margin Infused Relaxed Algorithm (MIRA). We also compare two update strategies: one that updates towards an oracle translation candidate extracted from an N-best list, and a more aggressive approach that updates towards an oracle extracted prior to training using a min-loss decoder. We evaluate the different training algorithms on the Czech-English translation task. Our results show that while both learning algorithms achieve similar results, with the perceptron converging more rapidly, the aggressive update strategy performs significantly worse than the more conservative one, corroborating the findings of Liang et al. (2006).
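The N-best oracle update described above can be sketched with a minimal structured perceptron step: if the model's current 1-best candidate differs from the oracle, the weights are moved towards the oracle's features and away from the 1-best's. The sparse feature representation, the example feature names, and the tie-breaking behaviour here are illustrative assumptions, not the paper's actual implementation.

```python
from collections import defaultdict

def score(weights, feats):
    """Dot product of a weight vector and a sparse feature vector."""
    return sum(weights[f] * v for f, v in feats.items())

def perceptron_update(weights, candidates, oracle, lr=1.0):
    """One structured perceptron step: update towards the oracle
    candidate and away from the model's current 1-best."""
    best = max(candidates, key=lambda c: score(weights, c))
    if best is not oracle:
        for f, v in oracle.items():
            weights[f] += lr * v
        for f, v in best.items():
            weights[f] -= lr * v
    return weights

weights = defaultdict(float)
# Two hypothetical candidates with sparse features; the second is the
# oracle (e.g. the N-best entry with the lowest loss against the reference).
other  = {"lm": 1.2, "phrase:dobry_bad": 1.0}
oracle = {"lm": 1.0, "phrase:dobry_good": 1.0}
perceptron_update(weights, [other, oracle], oracle)
```

MIRA replaces this fixed-size step with the smallest weight change that separates the oracle from the 1-best by a margin, which is why the two algorithms can reach similar end results at different convergence speeds.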
Title of host publication: MT SUMMIT XI, 10-14 September 2007, Copenhagen, Denmark, Proceedings
Publisher: European Association for Machine Translation
Number of pages: 6
Publication status: Published - 2007