Latent-Variable Synchronous CFGs for Hierarchical Translation

Avneesh Saluja, Chris Dyer, Shay B. Cohen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Data-driven refinement of non-terminal categories has been demonstrated to be a reliable technique for improving monolingual parsing with PCFGs. In this paper, we extend these techniques to learn latent refinements of single-category synchronous grammars, so as to improve translation performance. We compare two estimators for this latent-variable model: one based on EM and the other a spectral algorithm based on the method of moments. We evaluate their performance on a Chinese–English translation task. The results indicate that we can achieve significant gains over the baseline with both approaches, and that the moments-based estimator in particular is both faster and better-performing than EM.
Original language: English
Title of host publication: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Place of publication: Doha, Qatar
Publisher: Association for Computational Linguistics
Pages: 1953-1964
Number of pages: 12
Publication status: Published - 1 Oct 2014
