Edinburgh Research Explorer

Centre for Speech Technology Research

Organisational unit: Research Centre

  1. 2020
  2. Integrating lexical and prosodic features for automatic paragraph segmentation

    Lai, C., Farrús, M. & Moore, J., 11 May 2020, In: Speech Communication. 121, p. 44-57

    Research output: Contribution to journal › Article

  3. Acoustic model adaptation from raw waveforms with Sincnet

    Fainberg, J., Klejch, O., Loweimi, E., Bell, P. & Renals, S., 20 Feb 2020, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Institute of Electrical and Electronics Engineers (IEEE), p. 897-904 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  4. Bootstrapping Non-Parallel Voice Conversion From Speaker-Adaptive Text-to-Speech

    Luong, H-T. & Yamagishi, J., 20 Feb 2020, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Institute of Electrical and Electronics Engineers (IEEE), p. 200-207 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  5. Embeddings for DNN speaker adaptive training

    Równicka, J., Bell, P. & Renals, S., 20 Feb 2020, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Institute of Electrical and Electronics Engineers (IEEE), p. 479-486 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  6. Speaker adaptive training using model agnostic meta-learning

    Klejch, O., Fainberg, J., Bell, P. & Renals, S., 20 Feb 2020, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Institute of Electrical and Electronics Engineers (IEEE), p. 881-888 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  7. The MGB-5 Challenge: Recognition and Dialect Identification of Dialectal Arabic Speech

    Ali, A., Shon, S., Samih, Y., Mubarak, H., Abdelali, A., Glass, J., Renals, S. & Choukri, K., 20 Feb 2020, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Institute of Electrical and Electronics Engineers (IEEE), p. 1026-1033 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  8. European Language Grid: An Overview

    Rehm, G., Berger, M., Elsholz, E., Hegele, S., Kintzel, F., Marheinecke, K., Piperidis, S., Deligiannis, M., Galanis, D., Gkirtzou, K., Labropoulou, P., Bontcheva, K., Jones, D., Roberts, I., Hajic, J., Hamrlová, J., Kačena, L., Choukri, K., Arranz, V., Vasiļjevs, A. & 16 others, Anvari, O., Lagzdiņš, A., Meļņika, J., Backfried, G., Dikici, E., Janosik, M., Prinz, K., Prinz, C., Stampler, S., Thomas-Aniola, D., Manuel Gomez-Perez, J., Garcia Silva, A., Berrío, C., Germann, U., Renals, S. & Klejch, O., 11 Feb 2020, (Accepted/In press) Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). 15 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  9. Channel Adversarial Training for Speaker Verification and Diarization

    Luu, C., Bell, P. & Renals, S., 24 Jan 2020, (Accepted/In press) Proceedings of the 45th International Conference on Acoustics, Speech, and Signal Processing. 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  10. Cross Lingual Transfer Learning for Zero-Resource Domain Adaptation

    Abad Gareta, A., Bell, P., Carmantini, A. & Renals, S., 24 Jan 2020, (Accepted/In press) Proceedings of the 45th International Conference on Acoustics, Speech, and Signal Processing. 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  11. Learning Noise Invariant Features Through Transfer Learning for Robust End-to-End Speech Recognition

    Zhang, S., Do, C-T., Doddipatla, R. & Renals, S., 24 Jan 2020, (Accepted/In press) 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing. Institute of Electrical and Electronics Engineers (IEEE), 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  12. Multi-Scale Octave Convolutions for Robust Speech Recognition

    Równicka, J., Bell, P. & Renals, S., 24 Jan 2020, (Accepted/In press) Proceedings of the 45th International Conference on Acoustics, Speech, and Signal Processing. 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  13. A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis

    Wang, X., Takaki, S., Yamagishi, J., King, S. & Tokuda, K., 1 Jan 2020, In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. 28, p. 157-170 13 p.

    Research output: Contribution to journal › Article

  14. Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis

    Wang, X., Takaki, S. & Yamagishi, J., 2020, In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. 28, p. 402-415 14 p.

    Research output: Contribution to journal › Article

  15. 2019
  16. Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments

    Yasuda, Y., Wang, X. & Yamagishi, J., 22 Sep 2019, Proceedings of the 10th ISCA Speech Synthesis Workshop. International Speech Communication Association, p. 1-6 6 p. (Proc. 10th ISCA Speech Synthesis Workshop).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  17. Measuring the contribution to cognitive load of each predicted vocoder speech parameter in DNN-based speech synthesis

    Govender, A., Valentini-Botinhao, C. & King, S., 22 Sep 2019, Proceedings of the 10th ISCA Speech Synthesis Workshop. International Speech Communication Association, p. 121-126 6 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  18. Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis

    Wang, X. & Yamagishi, J., 22 Sep 2019, Proceedings of the 10th ISCA Speech Synthesis Workshop. International Speech Communication Association, p. 1-6 6 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  19. Where do the improvements come from in sequence-to-sequence neural TTS?

    Watts, O., Henter, G., Fong, J. & Valentini-Botinhao, C., 22 Sep 2019, 10th ISCA Speech Synthesis Workshop. International Speech Communication Association, p. 217-222 6 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  20. ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection

    Todisco, M., Wang, X., Vestman, V., Sahidullah, M., Delgado, H., Nautsch, A., Yamagishi, J., Evans, N., Kinnunen, T. & Aik Lee, K., 19 Sep 2019, Proceedings Interspeech 2019. International Speech Communication Association, p. 1008-1012 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  21. Detecting Topic-Oriented Speaker Stance in Conversational Speech

    Lai, C., Alex, B., Moore, J. D., Tian, L., Hori, T. & Francesca, G., 19 Sep 2019, Proceedings of Interspeech 2019. International Speech Communication Association, p. 46-50 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  22. Direct F0 Estimation with Neural-Network-based Regression

    Xu, S. & Shimodaira, H., 19 Sep 2019, Proc. Interspeech 2019. International Speech Communication Association, p. 1995-1999 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  23. Evaluating Near End Listening Enhancement Algorithms in Realistic Environments

    Chermaz, C., Valentini Botinhao, C., Schepker, H. & King, S., 19 Sep 2019, Proceedings Interspeech 2019. International Speech Communication Association, p. 1373-1377 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  24. GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-Spectrogram

    Juvela, L., Bollepalli, B., Yamagishi, J. & Alku, P., 19 Sep 2019, Proceedings Interspeech 2019. International Speech Communication Association, p. 694-698 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  25. Improving speech synthesis with discourse relations

    Aubin, A., Cervone, A., Watts, O. & King, S., 19 Sep 2019, Interspeech 2019. ISCA, Vol. 2019-September. p. 4470-4474 (Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  26. Lattice-based lightly-supervised acoustic model training

    Fainberg, J., Klejch, O., Renals, S. & Bell, P., 19 Sep 2019, Proceedings Interspeech 2019. International Speech Communication Association, p. 1596-1600 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  27. On Learning Interpretable CNNs with Parametric Modulated Kernel-based Filters

    Loweimi, E., Bell, P. & Renals, S., 19 Sep 2019, Proceedings Interspeech 2019. International Speech Communication Association, p. 3480-3484 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  28. Synchronising audio and ultrasound by learning cross-modal embeddings

    Eshky, A., Ribeiro, M., Richmond, K. & Renals, S., 19 Sep 2019, INTERSPEECH 2019: Proceedings of the 20th Annual Conference of the International Speech Communication Association (ISCA). Graz, Austria: International Speech Communication Association, p. 4100-4104 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  29. Trainable Dynamic Subsampling for End-to-End Speech Recognition

    Zhang, S., Loweimi, E., Xu, Y., Bell, P. & Renals, S., 19 Sep 2019, Proceedings Interspeech 2019. International Speech Communication Association, p. 1413-1417 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  30. Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora

    Luong, H-T., Wang, X., Yamagishi, J. & Nishizawa, N., 19 Sep 2019, Proceedings Interspeech 2019. International Speech Communication Association, p. 1303-1307 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  31. Ultrasound tongue imaging for diarization and alignment of child speech therapy sessions

    Ribeiro, M., Eshky, A., Richmond, K. & Renals, S., 19 Sep 2019, INTERSPEECH 2019: Proceedings of the 20th Annual Conference of the International Speech Communication Association (ISCA). Graz, Austria: International Speech Communication Association, p. 16-20 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  32. Untranscribed web audio for low resource speech recognition

    Carmantini, A., Bell, P. & Renals, S., 19 Sep 2019, Proceedings Interspeech 2019. International Speech Communication Association, p. 226-230 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  33. The prosody of presupposition projection in naturally-occurring utterances

    Mahler, T., de Marneffe, M-C. & Lai, C., 7 Sep 2019. 2 p.

    Research output: Contribution to conference › Poster

  34. Modern speech synthesis for phonetic sciences: a discussion and an evaluation

    Malisz, Z., Eje Henter, G., Valentini Botinhao, C., Watts, O., Beskow, J. & Gustafson, J., 31 Aug 2019, Proceedings of the 19th International Congress of Phonetic Sciences ICPhS 2019. Calhoun, S., Escudero, P., Tabain, M. & Warren, P. (eds.). Canberra, Australia: Australasian Speech Science and Technology Association Inc.: Australian Speech Science & Technology Association Inc, p. 487-491 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  35. Normal-to-Lombard Adaptation of Speech Synthesis Using Long Short-Term Memory Recurrent Neural Networks

    Bollepalli, B., Juvela, L., Airaksinen, M., Valentini Botinhao, C. & Alku, P., 1 Jul 2019, In: Speech Communication. 110, p. 64-75 21 p.

    Research output: Contribution to journal › Article

  36. Multi-task Learning For Detecting and Segmenting Manipulated Facial Images and Videos

    H. Nguyen, H., Fang, F., Yamagishi, J. & Echizen, I., 15 Jun 2019, (Accepted/In press) The Tenth IEEE International Conference on Biometrics: Theory, Applications, and Systems (BTAS 2019). 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  37. Spatio-temporal generative adversarial network for gait anonymization

    Tieu, N. D. T., Nguyen, H. H., Nguyen-Son, H. Q., Yamagishi, J. & Echizen, I., 1 Jun 2019, In: Journal of Information Security and Applications. 46, p. 307-319 13 p.

    Research output: Contribution to journal › Article

  38. Audiovisual Speaker Conversion: Jointly and Simultaneously Transforming Facial Expression and Acoustic Characteristics

    Fang, F., Wang, X., Yamagishi, J. & Echizen, I., 17 May 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 6795-6799 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  39. Capsule-forensics: Using Capsule Networks to Detect Forged Images and Videos

    Nguyen, H. H., Yamagishi, J. & Echizen, I., 17 May 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 2307-2311 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  40. Cycle-consistent Adversarial Networks for Non-parallel Vocal Effort Based Speaking Style Conversion

    Seshadri, S., Juvela, L., Yamagishi, J., Rasanen, O. & Alku, P., 17 May 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 6835-6839 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  41. Investigation of Enhanced Tacotron Text-to-speech Synthesis Systems with Self-attention for Pitch Accent Language

    Yasuda, Y., Wang, X., Takaki, S. & Yamagishi, J., 17 May 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 6905-6909 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  42. STFT Spectral Loss for Training a Neural Speech Waveform Model

    Takaki, S., Nakashika, T., Wang, X. & Yamagishi, J., 17 May 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 7065-7069 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  43. Waveform Generation for Text-to-speech Synthesis Using Pitch-synchronous Multi-scale Generative Adversarial Networks

    Juvela, L., Bollepalli, B., Yamagishi, J. & Alku, P., 17 May 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 6915-6919 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  44. "Why is the Doctor a Man?" Reactions of Older Adults to a Virtual Training Doctor

    Constantin, A., Lai, C., Farrow, E., Alex, B., Pel-Littel, R., Nap, H. H. & Jeuring, J., 2 May 2019, Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems. Glasgow, Scotland UK: ACM, 6 p. LBW1719. (CHI EA '19).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  45. Attentive filtering networks for audio replay attack detection

    Lai, C-I., Abad, A., Richmond, K., Yamagishi, J., Dehak, N. & King, S., 17 Apr 2019, 2019 IEEE International Conference on Acoustics, Speech and Signal Processing. p. 6316-6320

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  46. Dynamic Evaluation of Transformer Language Models

    Krause, B., Mbabazi, E., Murray, I. & Renals, S., 17 Apr 2019, 6 p.

    Research output: Working paper

  47. On the Usefulness of Statistical Normalisation of Bottleneck Features for Speech Recognition

    Loweimi, E., Bell, P. & Renals, S., 17 Apr 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 3862-3866 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  48. Speaker-Independent Classification of Phonetic Segments from Raw Ultrasound in Child Speech

    Ribeiro, M. S., Eshky, A., Richmond, K. & Renals, S., 17 Apr 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 1328-1332 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  49. Speech Waveform Reconstruction using Convolutional Neural Networks with Noise and Periodic Inputs

    Watts, O., Valentini Botinhao, C. & King, S., 17 Apr 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 7045-7049 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  50. Windowed Attention Mechanisms for Speech Recognition

    Zhang, S., Loweimi, E., Bell, P. & Renals, S., 17 Apr 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom: Institute of Electrical and Electronics Engineers (IEEE), p. 7100-7104 5 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  51. Unsupervised Speaker Adaptation for DNN-based Speech Synthesis using Input Codes

    Takaki, S., Nishimura, Y. & Yamagishi, J., 7 Mar 2019, Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2018. Honolulu, Hawaii, USA: Institute of Electrical and Electronics Engineers (IEEE), p. 649-658 10 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  52. Recognizing Induced Emotions of Movie Audiences From Multimodal Information

    Muszynski, M., Tian, L., Lai, C., Moore, J., Kostoulas, T., Lombardo, P., Pun, T. & Chanel, G., 27 Feb 2019, In: IEEE Transactions on Affective Computing. 17 p.

    Research output: Contribution to journal › Article
