Phonetic segmentation of speech using STEP and t-SNE
Abstract
This paper introduces a first attempt to perform phoneme-level segmentation of speech based on a perceptual representation - the Spectro Temporal Excitation Pattern (STEP) - and a dimensionality reduction technique - the t-Distributed Stochastic Neighbour Embedding (t-SNE). The method searches for the true phonetic boundaries in the vicinity of those produced by an HMM-based segmentation. It looks for perceptually-salient spectral changes which occur at these phonetic transitions, and exploits t-SNE's ability to capture both local and global structure of the data. The method is intended to be used in any language and it is therefore not tailored to any particular dataset or language. Results show that this simple approach improves segmentation accuracy of unvoiced phonemes by 4% within a 5 ms margin, and 5% at a 10 ms margin. For the voiced phonemes, however, accuracy drops slightly.
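The accuracy figures above are reported at tolerance margins of 5 ms and 10 ms. As a minimal sketch of how such a margin-based score can be computed, the snippet below counts the share of predicted boundaries that fall within a given margin of their reference boundary. It assumes the two boundary lists are paired one-to-one in the same order, as when both segmentations share one phoneme sequence. The function name `boundary_accuracy` and all boundary values are hypothetical and are not taken from the paper, whose exact scoring procedure may differ.

```python
def boundary_accuracy(reference_ms, predicted_ms, margin_ms):
    """Return the fraction of predicted boundaries that lie within
    margin_ms of the reference boundary at the same index."""
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        raise ValueError("at least one boundary is required")
    hits = sum(
        1
        for ref, pred in zip(reference_ms, predicted_ms)
        if abs(ref - pred) <= margin_ms
    )
    return hits / len(reference_ms)


# Invented boundary times in milliseconds, for illustration only.
reference = [120, 245, 390, 510]
predicted = [123, 252, 389, 530]

acc_5 = boundary_accuracy(reference, predicted, 5)    # 5 ms margin
acc_10 = boundary_accuracy(reference, predicted, 10)  # 10 ms margin
print(acc_5, acc_10)
```

Scoring voiced and unvoiced phonemes separately, as the abstract does, would amount to calling the same function on each subset of boundaries.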
Original language | English |
---|---|
Title of host publication | 2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) |
Editors | Corneliu Burileanu, Corneliu Rusu, Horia-Nicolai Teodorescu |
Publisher | Institute of Electrical and Electronics Engineers |
Number of pages | 6 |
ISBN (Electronic) | 9781467375603 |
Publication status | Published - 3 Dec 2015 |
Event | 8th International Conference on Speech Technology and Human-Computer Dialogue, Bucharest, Romania. Duration: 14 Oct 2015 → 17 Oct 2015. https://sped.pub.ro/archive/sped2015/index-1.html |
Conference
Conference | 8th International Conference on Speech Technology and Human-Computer Dialogue |
---|---|
Abbreviated title | SpeD 2015 |
Country/Territory | Romania |
City | Bucharest |
Period | 14/10/15 → 17/10/15 |
Keywords / Materials (for Non-textual outputs)
- HMM acoustic model
- k-Means
- phonetic segmentation
- STEP
- t-SNE
Projects
- Simple4All: Speech synthesis that improves through adaptive learning
  1/11/11 → 31/10/14
  Project: Research (finished)