Hierarchical Dialogue Optimization Using Semi-Markov Decision Processes

Heriberto Cuayáhuitl, Steve Renals, Oliver Lemon, Hiroshi Shimodaira

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper addresses the problem of dialogue optimization in large search spaces. We propose to learn dialogue strategies using multiple Semi-Markov Decision Processes and hierarchical reinforcement learning, factorizing state variables and actions in order to learn a hierarchy of policies. Our experiments are based on a simulated flight booking dialogue system and compare flat with hierarchical reinforcement learning. Experimental results show that the proposed approach produced a dramatic search space reduction (99.36%) and converged four orders of magnitude faster than flat reinforcement learning, with a very small loss in optimality (on average 0.3 system turns). Results also show that the learnt policies outperformed a hand-crafted one under three different ASR confidence levels. This approach is appealing for dialogue optimization due to faster learning, reusable subsolutions, and scalability to larger problems.
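As a rough illustration of the SMDP-based hierarchical reinforcement learning the abstract describes, the Python sketch below runs SMDP Q-learning on a toy two-level slot-filling dialogue: a root policy chooses which slot subtask to invoke, and a child policy chooses primitive actions within each subtask. The slot names, transition probabilities, reward values, and two-level hierarchy here are illustrative assumptions, not the paper's actual state/action factorization.

```python
import random
from collections import defaultdict

# Hypothetical slots for a toy flight-booking dialogue (illustrative only;
# not the factorization used in the paper).
SLOTS = ["departure", "destination", "date"]
PRIMITIVE_ACTIONS = ["ask", "confirm"]

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1


def epsilon_greedy(q, state, actions):
    """Pick a random action with probability EPSILON, else the greedy one."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def run_subtask(q, slot):
    """Child policy: fill one slot with primitive actions.

    Toy dynamics (assumed): 'ask' fills the slot with probability 0.8;
    'confirm' ends the subtask successfully only if the slot is filled.
    Returns (accumulated reward, number of primitive turns taken).
    """
    filled, total_reward, steps = False, 0.0, 0
    state = (slot, filled)
    while True:
        action = epsilon_greedy(q, state, PRIMITIVE_ACTIONS)
        if action == "ask":
            filled = filled or random.random() < 0.8
            reward, done = -1.0, False          # each system turn costs 1
        else:  # confirm
            reward, done = (0.0, True) if filled else (-1.0, False)
        next_state = (slot, filled)
        target = reward if done else reward + GAMMA * max(
            q[(next_state, a)] for a in PRIMITIVE_ACTIONS)
        q[(state, action)] += ALPHA * (target - q[(state, action)])
        total_reward += reward
        steps += 1
        state = next_state
        if done or steps >= 20:
            return total_reward, steps


def run_episode(q_root, q_child):
    """Root policy: choose which slot subtask to invoke next.

    The SMDP update discounts the future by GAMMA ** steps, the number of
    primitive turns the subtask consumed -- the key difference from a flat
    Q-learning update over individual actions.
    """
    remaining = frozenset(SLOTS)
    while remaining:
        subtask = epsilon_greedy(q_root, remaining, sorted(remaining))
        reward, steps = run_subtask(q_child, subtask)
        nxt = remaining - {subtask}
        future = 0.0 if not nxt else max(q_root[(nxt, a)] for a in sorted(nxt))
        q_root[(remaining, subtask)] += ALPHA * (
            reward + (GAMMA ** steps) * future - q_root[(remaining, subtask)])
        remaining = nxt


if __name__ == "__main__":
    q_root = defaultdict(float)   # root Q-values over subtasks
    q_child = defaultdict(float)  # child Q-values over primitive actions
    for _ in range(2000):
        run_episode(q_root, q_child)
    print("learned root values:", dict(q_root))
```

Because the root learns over whole subtasks rather than individual turns, its state-action table stays small regardless of how long each subtask runs; this temporal abstraction is the intuition behind the search space reduction the paper reports, though the paper's own hierarchy and experiments are considerably richer than this sketch.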
Original language: English
Title of host publication: Proceedings of the 8th Annual Conference of the International Speech Communication Association
Subtitle of host publication: Interspeech 2007
Publisher: ISCA
Pages: 2693-2696
Number of pages: 4
Publication status: Published - 2007

Keywords

  • Spoken dialogue systems
  • Semi-Markov decision processes
  • Hierarchical reinforcement learning
