Abstract
Linguistic Inquiry and Word Count (LIWC), a popular tool for automated text analysis, relies on an expert-crafted internal dictionary of psychologically relevant words and their corresponding categories. While LIWC’s dictionary covers a significant portion of commonly used words, the continuous evolution of language and the use of slang in settings such as social media require fixed resources to be updated frequently in order to stay relevant. In this work we present LIWC-UD, an automatically generated extension to LIWC’s dictionary that includes terms defined in Urban Dictionary. While the original LIWC contains 6,547 unique entries, LIWC-UD consists of 141K unique terms automatically categorized into LIWC categories with high confidence using a BERT classifier. LIWC-UD covers many additional terms that are commonly used on social media platforms like Twitter. We release LIWC-UD publicly to the community as a supplement to the original LIWC lexicon.
Original language | English |
---|---|
Title of host publication | Proceedings of the 14th ACM Web Science Conference |
Publisher | Association for Computing Machinery (ACM) |
Pages | 422–432 |
Number of pages | 11 |
ISBN (Electronic) | 978-1-4503-9191-7 |
Publication status | Published - 26 Jun 2022 |
Event | 14th ACM Web Science Conference 2022, Barcelona, Spain. Duration: 26 Jun 2022 → 29 Jun 2022. Conference number: 14. https://websci22.webscience.org/ |
Conference
Conference | 14th ACM Web Science Conference 2022 |
---|---|
Abbreviated title | WebSci 2022 |
Country/Territory | Spain |
City | Barcelona |
Period | 26/06/22 → 29/06/22 |
Keywords
- LIWC
- Urban Dictionary
- Lexicons
- Expansion