Counterfactual explanations as plans

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

There has been considerable recent interest in explainability in AI, especially for black-box machine learning models. As correctly observed by the planning community, when the application at hand is not a single-shot decision or prediction but a sequence of actions that depend on observations, a richer notion of explanation is desirable.

In this paper, we provide a formal account of "counterfactual explanations" in terms of action sequences. We then show that this naturally leads to an account of model reconciliation, which might take the form of the user correcting the agent's model or suggesting actions for the agent's plan. For this, we need to articulate what is true versus what is known, and we appeal to a modal fragment of the situation calculus to formalise these intuitions. We consider various settings - the agent knowing partial truths, knowing weakened truths, and having false beliefs - and show that our definitions generalize easily to these different settings.
Original language: English
Title of host publication: Proceedings 39th International Conference on Logic Programming
Publisher: Open Publishing Association
Publication status: Accepted/In press - 24 Mar 2023
Event: The 39th International Conference on Logic Programming - Imperial College London, London, United Kingdom
Duration: 9 Jul 2023 - 15 Jul 2023
Conference number: 39
https://iclp2023.imperial.ac.uk/home

Publication series

Name: Electronic Proceedings in Theoretical Computer Science (EPTCS)
Publisher: Open Publishing Association
ISSN (electronic): 2075-2180

Conference

Conference: The 39th International Conference on Logic Programming
Abbreviated title: ICLP 2023
Country/Territory: United Kingdom
City: London
Period: 9/07/23 - 15/07/23
Internet address: https://iclp2023.imperial.ac.uk/home

