Abstract
Introduction
In Text-to-Speech synthesis, the input is plain text, which may be analysed syntactically and morphologically before it is converted to speech. In Concept-to-Speech synthesis (CTS), by contrast, the input text is annotated with semantic and pragmatic information. The system then has to provide acoustic cues to this semantic and pragmatic information in the synthesised speech signal. Ideally, those cues would be specified at a more abstract level of processing, since it is very difficult to determine direct acoustic correlates of linguistic concepts on the phonetic and prosodic level. Portele and Heuft [12] claim that prominence, "a quantitative parameter of a syllable or a boundary that describes markedness relative to surrounding syllables and boundaries, respectively", which can take values between 0 and 31 for syllables [5], might provide such an interface between
Original language | English |
---|---|
Title of host publication | Proceedings of Konvens 1998 |
Publisher | Peter Lang Publishing |
Volume | 1 |
Publication status | Published - 1998 |
Keywords / Materials (for Non-textual outputs)
- focus perception
- linguistic concept
- pragmatic information
- direct acoustic correlate
- plain text
- prosodic level
- text-to-speech synthesis
- speech synthesis
- input text
- quantitative parameter
- synthesised speech signal
- acoustic cue
- abstract level