Evaluation of a hierarchical reinforcement learning spoken dialogue system

Heriberto Cuayáhuitl, Steve Renals, Oliver Lemon, Hiroshi Shimodaira

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

We describe an evaluation of spoken dialogue strategies designed using hierarchical reinforcement learning agents. The dialogue strategies were learnt in a simulated environment and tested in a laboratory setting with 32 users. These dialogues were used to evaluate three types of machine dialogue behaviour: hand-coded, fully-learnt and semi-learnt. These experiments also served to evaluate the realism of simulated dialogues using two proposed metrics contrasted with 'Precision-Recall'. The learnt dialogue behaviours used the Semi-Markov Decision Process (SMDP) model, and we report the first evaluation of this model in a realistic conversational environment. Experimental results in the travel planning domain provide evidence to support the following claims: (a) hierarchical semi-learnt dialogue agents are a better alternative (with higher overall performance) than deterministic or fully-learnt behaviour; (b) spoken dialogue strategies learnt with highly coherent user behaviour and conservative recognition error rates (keyword error rate of 20%) can outperform a reasonable hand-coded strategy; and (c) hierarchical reinforcement learning dialogue agents are feasible and promising for the (semi) automatic design of optimized dialogue behaviours in larger-scale systems.
Original language: English
Pages (from-to): 395-429
Number of pages: 35
Journal: Computer Speech and Language
Volume: 24
Issue number: 2
Publication status: Published - Apr 2010

Keywords / Materials (for Non-textual outputs)

  • Spoken dialogue systems
  • Hierarchical reinforcement learning
  • Human–machine dialogue simulation
  • Dialogue strategies
  • System evaluation
