Names, Nicknames, and Spelling Errors: Protecting Participant Identity in Learning Analytics of Online Discussions

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Messages exchanged between participants in online discussion forums often contain personal names and other details that need to be redacted before the data is used for research purposes in learning analytics. However, removing the names entirely makes it harder to track the exchange of ideas between individuals within a message thread and across threads, and thereby reduces the value of this type of conversational data. In contrast, the consistent use of pseudonyms allows contributions from individuals to be tracked across messages, while also hiding the real identities of the contributors. Several factors can make it difficult to identify all instances of personal names that refer to the same individual, including spelling errors and the use of shortened forms. We developed a semi-automated approach for replacing personal names with consistent pseudonyms. We evaluated our approach on a data set of over 1,700 messages exchanged during a distance-learning course, and compared it to a general-purpose pseudonymisation tool that used deep neural networks to identify names to be redacted. We found that our tailored approach outperformed the general-purpose tool in both precision and recall, correctly identifying all but 31 substitutions out of 2,888.
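
To make the idea of consistent pseudonyms concrete, here is a minimal Python sketch. It is not the authors' implementation, which the abstract does not describe in detail. It assumes a hypothetical course roster that lists each participant's first name with known shortened forms, and it uses `difflib` from the Python standard library to catch likely misspellings. The class name `PseudonymMapper`, the `PERSON_nnn` labels, the 0.8 similarity cutoff, and the rule of only checking capitalised words are all illustrative assumptions. Fuzzy matches are returned separately so that a person can check them, which is one possible reading of "semi-automated".

```python
import difflib
import re


class PseudonymMapper:
    """Hypothetical helper: one stable pseudonym per person, shared by all name variants."""

    def __init__(self, roster, cutoff=0.8):
        # roster (assumed input): canonical first name -> known shortened forms.
        self.cutoff = cutoff  # assumed similarity threshold for misspellings
        self.variant_to_canonical = {}
        self.pseudonyms = {}
        for index, (canonical, short_forms) in enumerate(roster.items(), start=1):
            self.pseudonyms[canonical] = f"PERSON_{index:03d}"
            for form in [canonical, *short_forms]:
                self.variant_to_canonical[form.lower()] = canonical

    def resolve(self, token):
        """Return (canonical name or None, whether the match was fuzzy)."""
        key = token.lower()
        if key in self.variant_to_canonical:
            return self.variant_to_canonical[key], False
        close = difflib.get_close_matches(
            key, list(self.variant_to_canonical), n=1, cutoff=self.cutoff
        )
        if close:
            return self.variant_to_canonical[close[0]], True
        return None, False

    def pseudonymise(self, text):
        """Replace capitalised tokens that resolve to a roster name.

        Returns the redacted text and the fuzzy matches to check by hand.
        """
        needs_review = []

        def substitute(match):
            token = match.group(0)
            canonical, fuzzy = self.resolve(token)
            if canonical is None:
                return token  # not a known name: leave the word untouched
            if fuzzy:
                needs_review.append((token, canonical))
            return self.pseudonyms[canonical]

        redacted = re.sub(r"\b[A-Z][a-z]+\b", substitute, text)
        return redacted, needs_review


# Illustrative data only; these names are invented for the example.
mapper = PseudonymMapper({"Katherine": ["Kate", "Kathy"], "Michael": ["Mike"]})
redacted, review = mapper.pseudonymise(
    "Thanks Kate, I agree with Micheal. Katherine made the same point."
)
print(redacted)  # Thanks PERSON_001, I agree with PERSON_002. PERSON_001 made the same point.
print(review)    # [('Micheal', 'Michael')]
```

In this sketch the shortened form "Kate" and the full name "Katherine" map to the same pseudonym, so the thread of conversation stays traceable, while the misspelling "Micheal" is substituted and also queued for manual review. A lower cutoff would catch more misspellings at the cost of more false matches against ordinary words.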
Original language: English
Title of host publication: Proceedings of the 13th International Conference on Learning Analytics and Knowledge (LAK23)
Publisher: Association for Computing Machinery (ACM)
Pages: 145-155
Number of pages: 11
Volume: LAK23
ISBN (Electronic): 9781450398657
Publication status: Published - 13 Mar 2023
Event: The 13th International Learning Analytics and Knowledge Conference, 2023 - Arlington, United States
Duration: 13 Mar 2023 to 17 Mar 2023
Conference number: 13
https://www.solaresearch.org/events/lak/lak23/

Conference

Conference: The 13th International Learning Analytics and Knowledge Conference, 2023
Abbreviated title: LAK 2023
Country/Territory: United States
City: Arlington
Period: 13/03/23 to 17/03/23

Keywords / Materials (for Non-textual outputs)

  • anonymisation
  • pseudonymisation
  • redaction
  • personal name
  • de-identification
  • learning analytics
  • ethical issues
  • privacy
