Mitigating covertly unsafe text within natural language systems

Alex Mei, Anisha Kabir, Sharon Levy, Melanie Subbiah, Emily Allaway, John Judge, Desmond Patton, Bruce Bimber, Kathleen McKeown, William Yang Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

An increasingly prevalent problem for intelligent technologies is text safety, as uncontrolled systems may generate recommendations to their users that lead to injury or life-threatening consequences. However, the degree of explicitness of a generated statement that can cause physical harm varies. In this paper, we distinguish types of text that can lead to physical harm and establish one particularly underexplored category: covertly unsafe text. Then, we further break down this category with respect to the system’s information and discuss solutions to mitigate the generation of text in each of these subcategories. Ultimately, our work defines the problem of covertly unsafe language that causes physical harm and argues that this subtle yet dangerous issue needs to be prioritized by stakeholders and regulators. We highlight mitigation strategies to inspire future researchers to tackle this challenging problem and help improve safety within smart systems.
Original language: English
Title of host publication: Findings of the Association for Computational Linguistics: EMNLP 2022
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Place of Publication: Abu Dhabi, United Arab Emirates
Publisher: Association for Computational Linguistics
Pages: 2914–2926
Number of pages: 13
Edition: 3
ISBN (Electronic): 9781959429432
DOIs
Publication status: Published - 11 Dec 2022
Event: The 2022 Conference on Empirical Methods in Natural Language Processing - Abu Dhabi National Exhibition Centre, Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 – 11 Dec 2022
Conference number: 27
https://2022.emnlp.org/

Publication series

Name: Findings of the Association for Computational Linguistics
Publisher: ACL
ISSN (Print): 0891-2017
ISSN (Electronic): 1530-9312

Conference

Conference: The 2022 Conference on Empirical Methods in Natural Language Processing
Abbreviated title: EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 7/12/22 – 11/12/22
Internet address
