Abstract / Description of output
Due to the ever-changing nature of human language and the variations in writing style, age-old texts in one language may be incomprehensible to a modern reader. In order to make these texts familiar to the modern reader, we need to rewrite them manually. But this is not always feasible if the volume of texts is very large. In this paper, we present this rewriting task as a neural machine translation (NMT) problem. We propose an effective approach for training an NMT system on a tiny parallel corpus comprising only 2.7k parallel sentences. We inject parallel phrase pairs extracted using Statistical Machine Translation (SMT) as additional training examples for NMT. We choose publicly available old-modern English parallel texts for our experiments. Evaluation results show that our proposed approach outperforms the baseline NMT system by more than 18 BLEU points without using any external data.
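The phrase-pair injection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the phrase pairs are available in a Moses-style phrase table (fields separated by `|||`), and the function names `parse_phrase_table` and `augment_corpus`, the `min_tokens` filter, and the toy sentences are all hypothetical.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (old-English side, modern-English side)


def parse_phrase_table(lines: Iterable[str], min_tokens: int = 1) -> List[Pair]:
    """Read phrase pairs from Moses-style lines: 'source ||| target ||| scores ...'.

    Assumption: only the first two fields are needed; scores are ignored.
    """
    pairs: List[Pair] = []
    for line in lines:
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) < 2:
            continue  # skip malformed lines
        src, tgt = fields[0], fields[1]
        if not src or not tgt:
            continue  # skip entries with an empty side
        # Hypothetical filter: drop phrases shorter than min_tokens on either side.
        if len(src.split()) < min_tokens or len(tgt.split()) < min_tokens:
            continue
        pairs.append((src, tgt))
    return pairs


def augment_corpus(sentences: Iterable[Pair], phrases: Iterable[Pair]) -> List[Pair]:
    """Append phrase pairs to the sentence pairs, dropping exact duplicates."""
    seen = set()
    augmented: List[Pair] = []
    for pair in list(sentences) + list(phrases):
        if pair not in seen:
            seen.add(pair)
            augmented.append(pair)
    return augmented


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the paper's corpus.
    sentence_pairs = [("thou art my friend", "you are my friend")]
    phrase_table = [
        "thou art ||| you are ||| 0.5 0.4 0.6 0.3",
        "my friend ||| my friend ||| 0.9 0.9 0.9 0.9",
    ]
    training_pairs = augment_corpus(sentence_pairs, parse_phrase_table(phrase_table))
    for src, tgt in training_pairs:
        print(f"{src}\t{tgt}")
```

In this sketch the augmented list would simply be passed to the NMT training pipeline in place of the original sentence pairs. How the phrase pairs are filtered or weighted in the actual system is not stated in the abstract.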
Original language | English |
---|---|
Number of pages | 12 |
Publication status | Published - 7 Apr 2019 |
Event | 20th International Conference on Computational Linguistics and Intelligent Text Processing, La Rochelle, France. Duration: 7 Apr 2019 → 13 Apr 2019. https://www.cicling.org/2019/ |
Conference
Conference | 20th International Conference on Computational Linguistics and Intelligent Text Processing |
---|---|
Abbreviated title | CICLing 2019 |
Country/Territory | France |
City | La Rochelle |
Period | 7/04/19 → 13/04/19 |
Internet address | https://www.cicling.org/2019/ |