Documenting gender identities: Challenges and approaches to records of gender in archival metadata descriptions

Research output: Contribution to conference › Abstract › peer-review

Abstract / Description of output

Gender bias has been built into algorithms through data collection practices that privilege a particular perspective, misrepresenting or excluding perspectives of many gender groups. Assumptions these algorithms make about data’s representation of universal truths shape the way people find and interpret information for learning and research, rendering so-called unauthorized, minoritized, or perverse perspectives invisible. As digitization of heritage collections progresses, and the online discoverability of heritage data grows, there is a risk that historical perspectives within cultural heritage collections will amplify gender stereotyping and discrimination already well-documented as sources of oppression. Scholars and practitioners have published approaches for the removal of gender bias, attempting to create objective technologies. However, little attention has been given to understanding the origins of gender bias, and how it manifests in the descriptive language of heritage collections’ metadata. As records of culture and history, heritage institutions’ metadata descriptions provide repositories of text well-suited to serve as data sources for diachronic, intersectional analyses of gender-biased language. Only once we understand gender bias – where it comes from, how it is communicated, how it varies from one culture to another – can we begin to effectively mitigate its harmful consequences and design technological systems that can navigate it. Through a case study with English-language archives at the University of Edinburgh’s Centre for Research Collections, this work outlines the challenges of respecting gender identities in a heritage context and lays out a path to addressing these challenges through natural language processing and participatory research methods.
Original language: English
Publication status: Published - 1 Jun 2021
Event: Association for Computers and the Humanities Conference 2021 - Online, United States
Duration: 21 Jul 2021 to 22 Jul 2021


Conference: Association for Computers and the Humanities Conference 2021
Abbreviated title: ACH2021
Country/Territory: United States

Keywords / Materials (for Non-textual outputs)

  • gender bias
  • archives
  • metadata descriptions
  • text mining
  • participatory research

