Supersense Tagging for Arabic: the MT-in-the-Middle Attack

Nathan Schneider, Behrang Mohit, Chris Dyer, Kemal Oflazer, Noah A. Smith

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We consider the task of tagging Arabic nouns with WordNet supersenses. Three approaches are evaluated. The first uses an expert-crafted but limited-coverage lexicon, Arabic WordNet, and heuristics. The second uses unsupervised sequence modeling. The third and most successful approach uses machine translation to translate the Arabic into English, which is automatically tagged with English supersenses, the results of which are then projected back into Arabic. Analysis shows gains and remaining obstacles in four Wikipedia topical domains.
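
The projection step of the third approach can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how tags are projected, so the sketch assumes that word alignments between the Arabic tokens and their English translation are available, and it uses a simple majority vote. The function name `project_supersenses`, the toy sentence, and the tag labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def project_supersenses(num_source_tokens, english_tags, alignment):
    """Project English supersense tags back onto source-language tokens.

    num_source_tokens: number of tokens in the source (e.g. Arabic) sentence.
    english_tags: one tag per English token, or None if the token is untagged.
    alignment: iterable of (source_index, english_index) pairs.
    Assumption: the alignment is given (e.g. produced alongside the translation).
    """
    votes = [Counter() for _ in range(num_source_tokens)]
    for src_i, en_i in alignment:
        tag = english_tags[en_i]
        if tag is not None:
            votes[src_i][tag] += 1
    # Each source token takes its most frequent aligned tag, if it has any.
    return [v.most_common(1)[0][0] if v else None for v in votes]


# Toy example (hypothetical): a 3-token source sentence translated as
# "the university in Cairo", with tags on the two English nouns.
english_tags = [None, "GROUP", None, "LOCATION"]
alignment = [(0, 0), (0, 1), (1, 2), (2, 3)]
projected = project_supersenses(3, english_tags, alignment)
print(projected)
```

Source tokens with no aligned tagged English word stay untagged (`None`), which is one simple way to handle alignment gaps; other policies are possible.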
Original language: English
Title of host publication: Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, June 9-14, 2013, Westin Peachtree Plaza Hotel, Atlanta, Georgia, USA
Publisher: Association for Computational Linguistics
Pages: 661-667
Number of pages: 7
Publication status: Published - 2013
