LIWC-UD: Classifying Online Slang Terms into LIWC Categories

Mohamed Bahgat, Steve R. Wilson, Walid Magdy

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

Linguistic Inquiry and Word Count (LIWC), a popular tool for automated text analysis, relies on an expert-crafted internal dictionary of psychologically relevant words and their corresponding categories. While LIWC’s dictionary covers a significant portion of commonly used words, the continuous evolution of language and the use of slang in settings such as social media require fixed resources to be updated frequently in order to stay relevant. In this work we present LIWC-UD, an automatically generated extension to LIWC’s dictionary that includes terms defined in Urban Dictionary. While the original LIWC dictionary contains 6,547 unique entries, LIWC-UD consists of 141K unique terms automatically categorized into LIWC categories with high confidence using a BERT classifier. LIWC-UD covers many additional terms that are commonly used on social media platforms like Twitter. We release LIWC-UD publicly to the community as a supplement to the original LIWC lexicon.
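To illustrate how a dictionary-based analysis could draw on a supplementary lexicon of this kind, the following is a minimal sketch and not the authors' implementation. The entries, the category labels, and the function names `merge_lexicons` and `category_counts` are all hypothetical; the actual LIWC and LIWC-UD file formats are not described in this record, and the sketch uses exact-match lookup only.

```python
import re
from collections import Counter

# Hypothetical toy entries for illustration only; these are NOT taken
# from the LIWC dictionary or from the released LIWC-UD resource.
BASE_LEXICON = {"happy": {"posemo"}, "sad": {"negemo"}}
SLANG_EXTENSION = {"lit": {"posemo"}, "salty": {"negemo"}}


def merge_lexicons(base, extension):
    """Return a new term -> categories mapping combining both lexicons."""
    merged = {term: set(cats) for term, cats in base.items()}
    for term, cats in extension.items():
        merged.setdefault(term, set()).update(cats)
    return merged


def category_counts(text, lexicon):
    """Count category hits by exact token match; also return the token total."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for category in lexicon.get(token, ()):
            counts[category] += 1
    return counts, len(tokens)


if __name__ == "__main__":
    text = "That party was lit but he got salty and sad"
    merged = merge_lexicons(BASE_LEXICON, SLANG_EXTENSION)
    print(category_counts(text, BASE_LEXICON))
    print(category_counts(text, merged))
```

In this toy example the base lexicon matches only "sad", while the merged lexicon also matches the slang terms "lit" and "salty", which is the kind of coverage gain the abstract describes for social media text.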
Original language: English
Title of host publication: Proceedings of the 14th ACM Web Science Conference
Publisher: Association for Computing Machinery (ACM)
Number of pages: 11
ISBN (Electronic): 978-1-4503-9191-7
Publication status: Published - 26 Jun 2022
Event: 14th ACM Web Science Conference 2022, Barcelona, Spain
Duration: 26 Jun 2022 to 29 Jun 2022
Conference number: 14


Conference: 14th ACM Web Science Conference 2022
Abbreviated title: WebSci 2022

Keywords / Materials (for Non-textual outputs)

  • LIWC
  • Urban Dictionary
  • Lexicons
  • Expansion

