(Commercial) Automatic Speech Recognition as a Tool in Sociolinguistic Research

Nina Markl

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

As speech datasets used in sociolinguistic research increase in size, laborious and time-intensive manual orthographic transcription is a challenge, limiting the amount of (transcribed) data which can be analysed. In this paper, I discuss the use of (commercial) automatic speech recognition (ASR) as a tool in sociolinguistic research in the context of a case study: the Lothian Diary Project. I describe the kinds of errors produced by two commercial ASR systems for British English within the broader context of algorithmic bias in ASR, and suggest some best practices when working with ASR in sociolinguistic work.
Original language: English
Title of host publication: University of Pennsylvania Working Papers in Linguistics
Subtitle of host publication: Selected Papers from NWAV 49
Publisher: Penn Graduate Linguistics Society
Number of pages: 12
Volume: 28
Edition: 2
Publication status: Published - 19 Sept 2022
Event: 49th meeting of New Ways of Analyzing Variation - The University of Texas, Austin, United States
Duration: 19 Oct 2021 - 24 Oct 2021
Conference number: 49
https://www.nwav49.org/

Conference

Conference: 49th meeting of New Ways of Analyzing Variation
Abbreviated title: NWAV 49
Country/Territory: United States
City: Austin
Period: 19/10/21 - 24/10/21
