Exploiting Action Categories in Learning Complex Games

Mihai Dobre, Alex Lascarides

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


This paper presents a model for planning in a highly complex game, where certain action types are more common than others and cyclic behaviour can easily arise. These issues are addressed by exploiting the inherent structure among the possible options to enhance the online learning algorithm: sampling during Monte Carlo Tree Search becomes a two-step process, first sampling from a distribution over the types of legal actions and then sampling an individual action of the chosen type. This policy drastically reduces both the breadth and the depth of the rollout by avoiding redundant sampling behaviour. The result is a large increase in both the performance and the efficiency of the model. Another contribution of this paper is an assessment of the benefits of a parallel implementation and of afterstates in complex games. Evaluation is done via agent simulations in the board game Settlers of Catan. The resulting agent is the first based on purely online learning strategies that can handle the full set of legal actions of the game. The evaluation shows that our model outperforms previous state-of-the-art agents while taking decisions within a time threshold tolerated by human opponents.
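The two-step sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_action_two_step`, the `(action_type, action)` pair representation, the example action names, and the hand-set `type_weights` are all assumptions made for illustration, and the abstract does not say how the distribution over action types is obtained.

```python
import random
from collections import defaultdict


def sample_action_two_step(legal_actions, type_weights, rng=random):
    """Sample one action in two steps: first a type, then an action of that type.

    legal_actions: list of (action_type, action) pairs (hypothetical format).
    type_weights:  dict mapping action_type -> non-negative weight (assumed given);
                   types missing from the dict default to weight 1.0.
    rng:           the random module or a random.Random instance.
    """
    # Group the legal actions by their type.
    by_type = defaultdict(list)
    for action_type, action in legal_actions:
        by_type[action_type].append(action)

    # Step 1: sample a type from the distribution over the types present.
    types = list(by_type)
    weights = [type_weights.get(t, 1.0) for t in types]
    chosen_type = rng.choices(types, weights=weights, k=1)[0]

    # Step 2: sample uniformly among the actions of the chosen type.
    return chosen_type, rng.choice(by_type[chosen_type])


# Hypothetical legal moves: many trade offers, few build or end-turn options.
legal = [("trade", f"offer_{i}") for i in range(50)]
legal += [("build", "road_a"), ("build", "road_b"), ("end_turn", "pass")]

weights = {"trade": 1.0, "build": 1.0, "end_turn": 1.0}
print(sample_action_two_step(legal, weights, random.Random(0)))
```

With uniform sampling over all 53 actions in this example, a trade would be drawn about 94% of the time. Sampling the type first gives each type an equal chance under the weights shown, which is one way to keep a very common action type from dominating rollouts.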
Original language: English
Title of host publication: IEEE Technically Sponsored Intelligent Systems Conference (IntelliSys 2017)
Place of Publication: London, UK
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 9
ISBN (Electronic): 978-1-5090-6435-9
ISBN (Print): 978-1-5090-6436-6
Publication status: Published - 26 Mar 2018
Event: SAI Intelligent Systems Conference 2017 - London, United Kingdom
Duration: 7 Sep 2017 to 8 Sep 2017


Conference: SAI Intelligent Systems Conference 2017
Abbreviated title: IntelliSys 2017
Country: United Kingdom