A hunt for the Snark: Annotator Diversity in Data Practices

Shivani Kapania, Alex S. Taylor, Ding Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Diversity in datasets is a key component to building responsible AI/ML. Despite this recognition, we know little about the diversity among the annotators involved in data production. We investigated the approaches to annotator diversity through 16 semi-structured interviews and a survey with 44 AI/ML practitioners. While practitioners described nuanced understandings of annotator diversity, they rarely designed dataset production to account for diversity in the annotation process. The lack of action was explained through operational barriers: from the lack of visibility in the annotator hiring process, to the conceptual difficulty in incorporating worker diversity. We argue that such operational barriers and the widespread resistance to accommodating annotator diversity surface a prevailing logic in data practices - where neutrality, objectivity and 'representationalist thinking' dominate. By understanding this logic to be part of a regime of existence, we explore alternative ways of accounting for annotator subjectivity and diversity in data practices.

Original language: English
Title of host publication: CHI '23
Subtitle of host publication: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
Publisher: Association for Computing Machinery
Pages: 1-15
Number of pages: 15
ISBN (Electronic): 9781450394215
Publication status: Published - 19 Apr 2023
Event: 2023 CHI Conference on Human Factors in Computing Systems - Hamburg, Germany
Duration: 23 Apr 2023 to 28 Apr 2023
https://chi2023.acm.org/

Publication series

Name: Conference on Human Factors in Computing Systems - Proceedings

Conference

Conference: 2023 CHI Conference on Human Factors in Computing Systems
Abbreviated title: CHI 2023
Country/Territory: Germany
City: Hamburg
Period: 23/04/23 to 28/04/23

Keywords / Materials (for Non-textual outputs)

  • annotator diversity
  • data annotation
  • data production
  • data work
  • diversity
  • machine learning
  • ML datasets
