Learning Factored Markov Decision Processes with Unawareness

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Methods for learning and planning in sequential decision problems typically assume the learner is aware of all possible states and actions in advance. This assumption is sometimes untenable. In this paper, we give a method for learning factored Markov decision processes from both domain exploration and expert assistance, which guarantees convergence to near-optimal behaviour even when the agent begins unaware of factors critical to success. Our experiments show that the agent learns optimal behaviour on both small and large problems, and that conserving information when new possibilities are discovered results in faster convergence.
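The abstract's core idea — keep what you have already learned when a previously unknown state factor is discovered, rather than restarting — can be illustrated with a toy sketch. This is not the paper's algorithm; the class, its names, and the tabular Q-learning backbone are all illustrative assumptions.

```python
class UnawareAgent:
    """Toy sketch: a tabular Q-learner over a factored state whose set of
    known state variables can grow at runtime. Illustrates the idea of
    conserving learned values on discovering a new factor; it is NOT the
    method from the paper, just a minimal stand-in for the concept."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.known = []   # state factors the agent is currently aware of
        self.q = {}       # (projected state tuple, action) -> value
        self.alpha, self.gamma = alpha, gamma

    def _key(self, obs):
        # Project a full observation (dict) onto the known factors only.
        return tuple(obs.get(f) for f in self.known)

    def discover(self, factor, default):
        """A new factor becomes known (e.g. pointed out by an expert).
        Conserve existing Q-values by extending every stored state key
        with the factor's default value, instead of discarding them."""
        self.known.append(factor)
        self.q = {(s + (default,), a): v for (s, a), v in self.q.items()}

    def update(self, obs, action, reward, next_obs):
        # Standard one-step Q-learning update over the projected state.
        s, s2 = self._key(obs), self._key(next_obs)
        best = max((self.q.get((s2, a), 0.0) for a in self.actions),
                   default=0.0)
        old = self.q.get((s, action), 0.0)
        self.q[(s, action)] = old + self.alpha * (reward + self.gamma * best - old)
```

After `discover`, every previously learned value survives under an extended key, so the agent warm-starts rather than relearning from scratch — the behaviour the experiments credit with faster convergence.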
Original language: English
Title of host publication: Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, UAI 2019
Subtitle of host publication: Tel Aviv, Israel, July 22-25, 2019
Place of publication: Tel Aviv, Israel
Number of pages: 11
Publication status: Published - 22 Jul 2019
Event: 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019 - Tel Aviv, Israel
Duration: 22 Jul 2019 - 25 Jul 2019
http://auai.org/uai2019/

Conference

Conference: 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019
Abbreviated title: UAI 2019
Country/Territory: Israel
City: Tel Aviv
Period: 22/07/19 - 25/07/19
Internet address: http://auai.org/uai2019/
