Explainability and hate speech: Structured explanations make social media moderators faster

Agostina Calabrese*, Leonardo Neves, Neil Shah, Maarten W. Bos, Björn Ross, Mirella Lapata, Francesco Barbieri

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Content moderators play a key role in keeping the conversation on social media healthy. While the high volume of content they need to judge represents a bottleneck in the moderation pipeline, no studies have explored how models could support them in making faster decisions. There is, by now, a vast body of research into detecting hate speech, sometimes explicitly motivated by a desire to help improve content moderation, but published research using real content moderators is scarce. In this work, we investigate the effect of explanations on the speed of real-world moderators. Our experiments show that while generic explanations do not affect their speed and are often ignored, structured explanations lower moderators' decision-making time by 7.4%.
Original language: English
Title of host publication: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics
Publisher: ACL Anthology
Publication status: Accepted/In press - 16 May 2024
Event: The 62nd Annual Meeting of the Association for Computational Linguistics - Centara Grand and Bangkok Convention Centre at CentralWorld, Bangkok, Thailand
Duration: 11 Aug 2024 - 16 Aug 2024
Conference number: 62
https://2024.aclweb.org/

Conference

Conference: The 62nd Annual Meeting of the Association for Computational Linguistics
Abbreviated title: ACL 2024
Country/Territory: Thailand
City: Bangkok
Period: 11/08/24 - 16/08/24

