Interpretable Neural Predictions with Differentiable Binary Variables

Joost Bastings, Wilker Aziz, Ivan Titov

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The success of neural networks comes hand in hand with a desire for more interpretability. We focus on text classifiers and make them more interpretable by having them provide a justification—a rationale—for their predictions.

We approach this problem by jointly training two neural network models: a latent model that selects a rationale (i.e. a short and informative part of the input text), and a classifier that learns from the words in the rationale alone. Previous work proposed to assign binary latent masks to input positions and to promote short selections via sparsity-inducing penalties such as L0 regularisation.

We propose a latent model that mixes discrete and continuous behaviour, allowing for both binary selections and gradient-based training without REINFORCE. In our formulation, we can tractably compute the expected value of penalties such as L0, which allows us to directly optimise the model towards a prespecified text selection rate. We show that our approach is competitive with previous work on rationale extraction, and explore further uses in attention mechanisms.
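
The abstract does not spell out how such a variable is built, so the following is only a minimal sketch under stated assumptions: each gate is taken to be a Kumaraswamy(a, b) sample that is stretched to an interval slightly wider than [0, 1] and then clipped back, which places non-zero probability on exactly 0 and exactly 1. The bounds `LOWER` and `UPPER`, the function names, and the parameter values are hypothetical and chosen for illustration. Under this assumption the probability that a gate is non-zero has a closed form, so the expected L0 penalty is a plain sum rather than a sampled estimate.

```python
import math
import random

# Hypothetical stretch bounds: any LOWER < 0 and UPPER > 1 would do.
LOWER, UPPER = -0.1, 1.1


def kuma_cdf(x, a, b):
    """CDF of a Kumaraswamy(a, b) variable, with x clamped to [0, 1]."""
    x = min(1.0, max(0.0, x))
    return 1.0 - (1.0 - x ** a) ** b


def sample_gate(a, b, rng):
    """Draw one gate in [0, 1]; exact 0 and exact 1 occur with non-zero probability."""
    u = rng.random()
    # Inverse-CDF (reparameterised) Kumaraswamy sample in [0, 1).
    k = (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)
    # Stretch to (LOWER, UPPER), then clip back to [0, 1].
    t = LOWER + (UPPER - LOWER) * k
    return min(1.0, max(0.0, t))


def prob_nonzero(a, b):
    """P(gate != 0): mass of the stretched variable above 0, in closed form."""
    zero_point = (0.0 - LOWER) / (UPPER - LOWER)
    return 1.0 - kuma_cdf(zero_point, a, b)


def expected_l0(params):
    """Expected number of selected positions for a list of (a, b) pairs."""
    return sum(prob_nonzero(a, b) for a, b in params)


if __name__ == "__main__":
    rng = random.Random(0)
    params = [(0.5, 0.5), (2.0, 1.0), (1.0, 3.0)]  # one (a, b) per input position
    gates = [sample_gate(a, b, rng) for a, b in params]
    print("sampled gates:", gates)
    print("expected L0:", expected_l0(params))
```

Because the sample is a deterministic function of the parameters and uniform noise, the same computation written in an automatic-differentiation framework would pass gradients to `a` and `b` wherever the clip is inactive. Dividing `expected_l0` by the number of positions gives an expected selection rate, which could then be compared against a target rate.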
Original language: English
Title of host publication: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (long papers)
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Place of Publication: Florence, Italy
Publisher: ACL Anthology
Pages: 2963–2977
Number of pages: 15
Volume: 1
ISBN (Print): 978-1-950737-48-2
Publication status: E-pub ahead of print - 2 Aug 2019
Event: 57th Annual Meeting of the Association for Computational Linguistics - Fortezza da Basso, Florence, Italy
Duration: 28 Jul 2019 → 2 Aug 2019
Conference number: 57
http://www.acl2019.org/EN/index.xhtml

Conference

Conference: 57th Annual Meeting of the Association for Computational Linguistics
Abbreviated title: ACL 2019
Country: Italy
City: Florence
Period: 28/07/19 → 2/08/19
