Abstract / Description of output
Although frequency analysis often yields a speech signal representation in the complex domain, the acoustic models we commonly use are designed for real-valued data. Phase is usually ignored or modelled separately from spectral amplitude. Here, we propose a complex-valued neural network (CVNN) for directly modelling the results of frequency analysis in the complex domain (such as the complex amplitude). We also introduce a phase encoding technique that maps real-valued data (e.g. cepstra or log amplitudes) into the complex domain, so that the same CVNN processing can be used seamlessly. In this paper, a fully complex-valued neural network, namely a neural network in which all of the weight matrices, activation functions and learning algorithms are in the complex domain, is applied to speech synthesis. Results show its ability to model both complex-valued and real-valued data.
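
The two ideas in the abstract, phase encoding of real-valued features and a layer whose weights and activation are complex, can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration rather than the formulation used in the paper: the mapping `x -> exp(i*phi)` with `phi` in `[0, pi]` is one common form of phase encoding in the CVNN literature, the amplitude-squashing activation `a / (1 + |a|)` is just one possible complex activation, and the names `phase_encode`, `phase_decode` and `complex_layer` are hypothetical.

```python
import numpy as np

def phase_encode(x, lo, hi):
    """Map real values in [lo, hi] onto the unit circle (assumed form).

    Each value becomes exp(i*phi) with phi scaled linearly into [0, pi].
    """
    x = np.asarray(x, dtype=float)
    phi = np.pi * (x - lo) / (hi - lo)
    return np.exp(1j * phi)

def phase_decode(z, lo, hi):
    """Invert phase_encode by reading the angle back off the unit circle."""
    phi = np.angle(z)
    return lo + (hi - lo) * phi / np.pi

def complex_layer(z, W, b):
    """One complex-valued layer: complex affine map, then a complex activation.

    The activation squashes the magnitude below 1 and keeps the phase
    (an illustrative choice, not necessarily the one used in the paper).
    """
    a = W @ z + b
    return a / (1.0 + np.abs(a))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([0.0, 0.25, 0.5, 0.9])      # stand-in for real-valued features
    z = phase_encode(x, lo=0.0, hi=1.0)      # complex inputs on the unit circle
    W = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
    b = np.zeros(3, dtype=complex)
    h = complex_layer(z, W, b)               # complex hidden activations
    print(np.abs(z), phase_decode(z, 0.0, 1.0))
    print(h)
```

With an encoding of this kind, complex spectral values can be fed to the network directly, while real-valued features such as cepstra are first placed on the unit circle, so both input types pass through the same complex-valued layers.
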
Original language | English |
---|---|
Title of host publication | 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 5630 - 5634 |
Number of pages | 5 |
ISBN (Print) | 978-1-4799-9988-0 |
Publication status | Published - Mar 2016 |
Event | 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China. Duration: 20 Mar 2016 → 25 Mar 2016. https://www2.securecms.com/ICASSP2016/Default.asp |
Conference
Conference | 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 |
---|---|
Abbreviated title | ICASSP 2016 |
Country/Territory | China |
City | Shanghai |
Period | 20/03/16 → 25/03/16 |
Fingerprint
Dive into the research topics of 'Initial investigation of speech synthesis based on complex-valued neural networks'. Together they form a unique fingerprint.
Profiles
- Korin Richmond
  - School of Philosophy, Psychology and Language Sciences - Reader
  - Institute of Language, Cognition and Computation
  - Centre for Speech Technology Research
  - Person: Academic: Research Active