Abstract / Description of output
It is all too common for systems processing natural language, whether for input (automatic speech recognition, text queries, dialogue, etc.) or output (text-to-speech), to ignore or strip out punctuation. The effect of prosodic factors, such as intonation and pausing, on language processing remains controversial. While there is an obvious relationship between punctuation and prosody, it cannot be a simple mapping: grammatical rules prevent the inclusion of punctuation at points where a speaker might pause, and the set of punctuation marks is not rich enough to transcribe all the spoken features categorised as prosody. Any realistic text-to-speech (or speech-to-text) conversion must therefore consider these features of language. An experimental investigation showed that commas exert a consistently strong and direct rhetorical influence on sentences being read aloud. They result in slower delivery of the words preceding the comma and an increase in pauses in speech. While the lengthening effect is an uncontroversial feature found at the end of clauses, even in the absence of punctuation, the evidence suggests that the comma is particularly useful in acoustically segmenting text by prompting a gap, or period of silence, between linguistic units. This effect is especially salient at points where a break can convey disambiguating information. Somewhat surprisingly, commas do not induce shifts in the fundamental frequency of speech or alter intonational patterns. Any generation of naturalistic synthetic speech should therefore take these factors into consideration.
Original language | English |
---|---|
Journal | TinyToCS |
Volume | 2 |
Publication status | Published - 2013 |