Project Details
Description
The objective of the project was to produce a synthetic voice without
recourse to the phonological and phonetic analysis that is required for
creating a traditional phone set and a pronunciation lexicon. The input
to the system would be speech recorded from a single speaker, divided
up into utterances, each utterance accompanied by a normalised
text transcription. No language-specific techniques would be applied.
The only requirement was that the orthographic system of
the language be syllabic or alphabetic, not logographic or
logophonetic.
Layman's description
Speech synthesis is the conversion of text to speech by computer. Conventional methods require a lot of pre-existing knowledge, such as a large pronunciation dictionary and knowledge about the speech sounds that make up the language. This makes it very expensive to build a system for any language that does not have these resources available, which means almost all the languages of the world. This project aimed to improve this situation by creating speech synthesis systems that require neither knowledge of the sound system nor a pronunciation dictionary.
Key findings
The project succeeded in producing both statistical parametric and unit selection "emergent phone" systems. We also created orthographic unit-based systems. These systems were evaluated against classical phone systems. A large number of techniques for generating these systems were evaluated and applied to the two main underlying problems that needed to be solved:
* Segmenting and categorising units of speech.
* Generalising the expected units in unseen words from a database of segmented and categorised speech from a single speaker.
The main advance made in the project was the use of orthographic unit-based systems (i.e., using graphemes instead of phonemes), and this notion has been followed up in subsequent projects.
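As a rough illustration of what a grapheme-based unit inventory means in practice, the sketch below maps a normalised transcription directly to a sequence of letter units, with no phone set or pronunciation lexicon involved. It is not the project's implementation: the function name `text_to_grapheme_units` and the `_` word-boundary marker are hypothetical, and treating each Unicode letter as one unit is a simplifying assumption that ignores digraphs and drops combining marks.

```python
import unicodedata


def text_to_grapheme_units(text, boundary="_"):
    """Map normalised text to a flat list of letter units.

    Hypothetical helper for illustration only: each alphabetic
    character becomes one unit, and `boundary` separates words.
    """
    units = []
    for word in text.lower().split():
        # NFC composes base letters with their diacritics where possible.
        composed = unicodedata.normalize("NFC", word)
        # Keep letters only; punctuation and digits are discarded.
        letters = [ch for ch in composed if ch.isalpha()]
        if not letters:
            continue
        if units:
            units.append(boundary)
        units.extend(letters)
    return units


if __name__ == "__main__":
    print(text_to_grapheme_units("Hello, world!"))
```

A synthesiser built on such units would train its acoustic models or select its speech segments per letter rather than per phoneme, leaving the letter-to-sound relationship to be learned from the recorded data.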
| Acronym | ePhones |
|---|---|
| Status | Finished |
| Effective start/end date | 1/07/06 → 30/09/10 |
Funding
- EPSRC: £238,471.00
Research output
- 4 Conference contribution
- Speech synthesis without a phone inventory
  Aylett, M., King, S. & Yamagishi, J., 2009, Interspeech. p. 2087-2090, 4 p.
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution (Open Access)
- Combining Statistical Parameteric Speech Synthesis and Unit-Selection for Automatic Voice Cloning
  Aylett, M. & Yamagishi, J., Sept 2008, Proc. LangTech 2008.
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution (Open Access)
- Unsupervised adaptation for HMM-based speech synthesis
  King, S., Tokuda, K., Zen, H. & Yamagishi, J., Sept 2008, Proc. Interspeech. ISCA, p. 1869-1872, 4 p.
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution (Open Access)
Activities
- SSW7 invited tutorial: Speech synthesis without the right data
  King, S. (Keynote speaker)
  22 Sept 2010 → 24 Sept 2010
  Activity: Academic talk or presentation types › Invited talk
- ACSSP invited talk: Expressive Speech Synthesis
  King, S. (Invited speaker)
  31 Aug 2010
  Activity: Academic talk or presentation types › Invited talk
- Universiti Teknologi Malaysia
  King, S. (Teacher)
  Nov 2009
  Activity: Visiting an external institution types › Research and Teaching at External Organisation