Recursive Stochastic Games with Positive Rewards

Kousha Etessami, Dominik Wojtczak, Mihalis Yannakakis

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We study the complexity of a class of Markov decision processes and,
more generally, stochastic games, called 1-exit Recursive Markov Decision Processes (1-RMDPs) and Simple Stochastic Games (1-RSSGs) with strictly positive rewards. These are a class of finitely presented countable-state zero-sum stochastic games, with total expected reward objective. They subsume standard finite-state MDPs and Condon’s simple stochastic games and correspond to optimization and game versions of several classic stochastic models, with rewards. Such stochastic models arise naturally as models of probabilistic procedural programs with recursion, and the problems we address are motivated by the goal of analyzing the optimal/pessimal expected running time in such a setting. We give polynomial time algorithms for 1-exit Recursive Markov decision processes (1-RMDPs) with positive rewards. Specifically, we show that the exact optimal value of both maximizing and minimizing 1-RMDPs with positive rewards can be computed in polynomial time (this value may be 1). For twoplayer 1-RSSGs with positive rewards, we prove a “stackless and memoryless” determinacy result, and show that deciding whether the game value is at least a given value r is in NP \ coNP. We also prove that a simultaneous strategy improvement algorithm converges to the value and optimal strategies for these stochastic games. Whether this algorithm runs in P-time is open, just like its classic version for finite SSGs. We observe that 1-RSSG positive reward games are “harder” than finite-state SSGs in several senses.
Original languageEnglish
Title of host publicationProceedings of the 35th International Colloquium on Automata, Languages and Programming, Part I
Place of PublicationBerlin, Heidelberg
PublisherSpringer-Verlag GmbH
Pages711-723
Number of pages13
DOIs
Publication statusPublished - 2008

Publication series

NameICALP '08
PublisherSpringer-Verlag

Fingerprint Dive into the research topics of 'Recursive Stochastic Games with Positive Rewards'. Together they form a unique fingerprint.

Cite this