Abstract / Description of output
Terminology correctness is important in the downstream application of machine translation, and a prevalent way to ensure it is to inject terminology constraints into a translation system. In our submission to the WMT 2023 terminology translation task, we adopt a translate-then-refine approach which can be domain-independent and requires minimal manual effort. We annotate random source words with pseudo-terminology translations obtained from word alignment to first train a terminology-aware model. Further, we explore two post-processing methods. First, we use an alignment process to discover whether a terminology constraint has been violated, and if so, we re-decode with the violating word negatively constrained. Alternatively, we leverage a large language model to refine a hypothesis by providing it with terminology constraints. Results show that our terminology-aware model learns to incorporate terminology constraints effectively, and the large language model refinement process can further improve terminology recall.
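The pseudo-terminology annotation step described above can be sketched as follows. This is a minimal illustration, not the authors' exact scheme: the inline `<sep>` tag format, the sampling rate `p`, and the function name are assumptions for the sketch; the alignments are assumed to be precomputed (src_index, tgt_index) pairs from an external word aligner.

```python
import random

def annotate_pseudo_terms(src_tokens, tgt_tokens, alignment, p=0.15, seed=0):
    """Append aligned target words to randomly chosen source tokens,
    simulating terminology constraints for training a terminology-aware
    model. The inline tag format here is illustrative, not the paper's.

    src_tokens, tgt_tokens: tokenized source/target sentences
    alignment: iterable of (src_index, tgt_index) word-alignment pairs
    p: probability of annotating an aligned source token
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    out = []
    for i, tok in enumerate(src_tokens):
        # collect all target words aligned to this source token
        targets = [tgt_tokens[j] for (a, j) in alignment if a == i]
        if targets and rng.random() < p:
            # attach the pseudo-terminology translation inline
            out.append(f"{tok} <sep> {' '.join(targets)}")
        else:
            out.append(tok)
    return " ".join(out)
```

At training time, the model then sees source sentences in which some words carry their desired translations inline, so at test time real terminology entries can be injected the same way.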
Original language | English
---|---
Title of host publication | Proceedings of the Eighth Conference on Machine Translation
Publisher | Association for Computational Linguistics
Pages | 890–896
ISBN (Electronic) | 979-8-89176-041-7
Publication status | Published - 6 Dec 2023
Event | Eighth Conference on Machine Translation, Singapore, Singapore (6 Dec 2023 → 7 Dec 2023; Conference number: 8; https://machinetranslate.org/wmt23)
Conference
Conference | Eighth Conference on Machine Translation
---|---
Abbreviated title | WMT 2023
Country/Territory | Singapore
City | Singapore
Period | 6/12/23 → 7/12/23
Internet address | https://machinetranslate.org/wmt23
Fingerprint
Dive into the research topics of 'Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting'.
Projects
- UTTER: Unified Transcription and Translation for Extended Reality
  1/10/22 → 30/09/25
  Project: Research (1 active)