Abstract
We investigate the use of machine learning classifiers for detecting online abuse in empirical research. We show that uncalibrated classifiers (i.e. where the 'raw' scores are used) align poorly with human evaluations. This limits their use for understanding the dynamics, patterns and prevalence of online abuse. We examine two widely used classifiers (created by Perspective and Davidson et al.) on a dataset of tweets directed against candidates in the UK's 2017 general election. A Bayesian approach is presented to recalibrate the raw scores from the classifiers, using probabilistic programming and newly annotated data. We argue that interpretability evaluation and recalibration are integral to the application of abusive content classifiers.
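The recalibration idea in the abstract can be illustrated with a minimal sketch. The paper uses probabilistic programming to infer a full posterior; the toy version below instead fits a MAP estimate of a logistic recalibration curve, p(abusive) = sigmoid(a·logit(score) + b), with Gaussian priors on a and b. All names, data and parameter values here are hypothetical illustrations, not the authors' actual model or dataset.

```python
import math

def logit(p):
    # Clamp to avoid infinities at 0 or 1.
    p = min(max(p, 1e-6), 1 - 1e-6)
    return math.log(p / (1 - p))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def recalibrate_map(raw_scores, labels, steps=2000, lr=0.1, prior_sd=2.0):
    """MAP fit of p(abusive) = sigmoid(a * logit(score) + b), with
    Gaussian(0, prior_sd) priors on a and b. A crude stand-in for the
    full posterior a probabilistic-programming tool would provide."""
    a, b = 1.0, 0.0
    n = len(raw_scores)
    xs = [logit(s) for s in raw_scores]
    for _ in range(steps):
        # Gradient of the log posterior: prior term plus data term.
        ga = -a / prior_sd ** 2
        gb = -b / prior_sd ** 2
        for x, y in zip(xs, labels):
            err = y - sigmoid(a * x + b)
            ga += err * x
            gb += err
        a += lr * ga / n
        b += lr * gb / n
    return a, b

# Toy data: raw classifier scores with human 'abusive' labels,
# where mid-range raw scores are systematically too high.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 0, 0, 0]
a, b = recalibrate_map(scores, labels)
calibrated = [sigmoid(a * logit(s) + b) for s in scores]
```

With newly annotated data in place of the toy labels, the calibrated scores can then be interpreted as probabilities that align better with human judgements than the raw outputs.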
Original language | English |
---|---|
Title of host publication | Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science |
Editors | David Bamman, Dirk Hovy, David Jurgens, Brendan O'Connor, Svitlana Volkova |
Publisher | ACL Anthology |
Pages | 132-138 |
Number of pages | 7 |
ISBN (Print) | 978-1-952148-80-4 |
DOIs | |
Publication status | Published - 20 Nov 2020 |
Event | Fourth Workshop on Natural Language Processing and Computational Social Science 2020, Online. Duration: 20 Nov 2020 → 20 Nov 2020. Conference number: 4. https://sites.google.com/site/nlpandcss/previous-editions/nlp-css-at-emnlp-2020 |
Workshop
Workshop | Fourth Workshop on Natural Language Processing and Computational Social Science 2020 |
---|---|
Abbreviated title | NLP+CSS 2020 |
Period | 20/11/20 → 20/11/20 |
Internet address | https://sites.google.com/site/nlpandcss/previous-editions/nlp-css-at-emnlp-2020 |