Uncertainty and inclusivity in gender bias annotation: An annotation taxonomy and annotated datasets of British English text

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Mitigating harms from gender biased language in Natural Language Processing (NLP) systems remains a challenge, and the situated nature of language means bias is inescapable in NLP data. Though efforts to mitigate gender bias in NLP are numerous, they often vaguely define gender and bias, only consider two genders, and do not incorporate uncertainty into models. To address these limitations, in this paper we present a taxonomy of gender biased language and apply it to create annotated datasets. We created the taxonomy and annotated data with the aim of making gender bias in language transparent. If biases are communicated clearly, varieties of biased language can be better identified and measured. Our taxonomy contains eleven types of gender biases inclusive of people whose gender expressions do not fit into the binary conceptions of woman and man, and whose gender differs from that they were assigned at birth, while also allowing annotators to document unknown gender information. The taxonomy and annotated data will, in future work, underpin analysis and more equitable language model development.
Original language: English
Title of host publication: Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Editors: Christian Hardmeier, Christine Basta, Marta R. Costa-jussà, Gabriel Stanovsky, Hila Gonen
Publisher: ACL Anthology
Pages: 30-57
Number of pages: 28
ISBN (Print): 9781955917681
Publication status: Published - 15 Jul 2022
Event: 4th Workshop on Gender Bias in Natural Language Processing at NAACL - Seattle, United States
Duration: 15 Jul 2022 to 15 Jul 2022
https://genderbiasnlp.talp.cat

Workshop

Workshop: 4th Workshop on Gender Bias in Natural Language Processing at NAACL
Country/Territory: United States
City: Seattle
Period: 15/07/22 to 15/07/22