LIWC-UD: Classifying Online Slang Terms into LIWC Categories

Mohamed Bahgat, Steve R. Wilson, Walid Magdy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Linguistic Inquiry and Word Count (LIWC), a popular tool for automated text analysis, relies on an expert-crafted internal dictionary of psychologically relevant words and their corresponding categories. While LIWC’s dictionary covers a significant portion of commonly used words, the continuous evolution of language and the use of slang in settings such as social media require fixed resources to be updated frequently in order to stay relevant. In this work we present LIWC-UD, an automatically generated extension to LIWC’s dictionary that includes terms defined in Urban Dictionary. While the original LIWC contains 6,547 unique entries, LIWC-UD consists of 141K unique terms automatically categorized into LIWC categories with high confidence using a BERT classifier. LIWC-UD covers many additional terms that are commonly used on social media platforms like Twitter. We release LIWC-UD publicly to the community as a supplement to the original LIWC lexicon.
Original language: English
Title of host publication: Proceedings of the 14th ACM Web Science Conference
Publisher: Association for Computing Machinery (ACM)
Pages: 422–432
Number of pages: 11
ISBN (Electronic): 978-1-4503-9191-7
Publication status: Published - 26 Jun 2022
Event: 14th ACM Web Science Conference 2022, Barcelona, Spain
Duration: 26 Jun 2022 to 29 Jun 2022
Conference number: 14
https://websci22.webscience.org/

Conference

Conference: 14th ACM Web Science Conference 2022
Abbreviated title: WebSci 2022
Country/Territory: Spain
City: Barcelona
Period: 26/06/22 to 29/06/22

Keywords

  • LIWC
  • Urban Dictionary
  • Lexicons
  • Expansion
