Abstract / Description of output
Synchronous context-free grammars (SCFGs) can be learned from parallel texts that are annotated with target-side syntax, and can produce translations by building target-side syntactic trees from source strings. Ideally, producing syntactic trees would entail that the translation is grammatically well-formed, but in reality, this is often not the case. Focusing on translation into German, we discuss various ways in which string-to-tree translation models over- or undergeneralise. We show how these problems can be addressed by choosing a suitable parser and modifying its output, by introducing linguistic constraints that enforce morphological agreement and constrain subcategorisation, and by modelling the productive generation of German compounds.
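As a rough illustration of the overgeneralisation described above, the following minimal sketch shows how a rule that checks only category labels can build a valid noun-phrase tree for an ungrammatical German phrase, and how a separate agreement check can reject it. Everything here is an assumption made for illustration (the `Leaf` type, the `apply_np_rule` and `agrees` helpers, the toy feature values); it is not the paper's implementation.

```python
from typing import NamedTuple


class Leaf(NamedTuple):
    """A target-side word with toy morphological features (illustrative only)."""
    word: str
    pos: str     # part-of-speech label, e.g. "ART" or "NN"
    gender: str  # toy feature: "masc", "fem" or "neut"
    case: str    # toy feature: "nom", "acc", ...


def apply_np_rule(article: Leaf, noun: Leaf) -> tuple:
    """Hypothetical target side of a rule NP -> ART NN.

    Like a plain string-to-tree rule, it checks category labels only.
    """
    if article.pos != "ART" or noun.pos != "NN":
        raise ValueError("rule NP -> ART NN does not match these labels")
    return ("NP", article, noun)


def agrees(tree: tuple) -> bool:
    """Toy agreement constraint: all leaves must share gender and case."""
    _, *leaves = tree
    return len({(leaf.gender, leaf.case) for leaf in leaves}) == 1


der = Leaf("der", "ART", "masc", "nom")
die = Leaf("die", "ART", "fem", "nom")
hund = Leaf("Hund", "NN", "masc", "nom")

good = apply_np_rule(der, hund)  # "der Hund"
bad = apply_np_rule(die, hund)   # "die Hund": a valid NP tree, yet ungrammatical

print(agrees(good), agrees(bad))
```

Both calls to `apply_np_rule` succeed because the category labels match; only the agreement check separates the well-formed phrase from the ill-formed one. Real German articles are ambiguous across gender, case and number, which this sketch deliberately ignores.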
Original language | English |
---|---|
Pages (from-to) | 27-45 |
Number of pages | 19 |
Journal | Computer Speech and Language |
Volume | 32 |
Issue number | 1 |
DOIs | |
Publication status | Published - Jul 2015 |
Keywords / Materials (for Non-textual outputs)
- Morphology
- Statistical machine translation
- Syntactic translation models
- String-to-tree models
Title
A tree does not make a well-formed sentence: Improving syntactic string-to-tree statistical machine translation with more linguistic knowledge