From Arabic Sentiment Analysis to Sarcasm Detection: The ArSarcasm Dataset

Ibrahim Abu Farha, Walid Magdy

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Sarcasm is one of the main challenges for sentiment analysis systems. Its complexity comes from the expression of opinion using implicit, indirect phrasing. In this paper, we present ArSarcasm, an Arabic sarcasm detection dataset, which was created through the reannotation of available Arabic sentiment analysis datasets. The dataset contains 10,547 tweets, 16% of which are sarcastic. In addition to sarcasm, the data was annotated for sentiment and dialect. Our analysis shows the highly subjective nature of these tasks, which is demonstrated by the shift in sentiment labels based on annotators’ biases. Experiments show the degradation of state-of-the-art sentiment analysers when faced with sarcastic content. Finally, we train a BiLSTM-based deep learning model for sarcasm detection. The model achieves an F1-score of 0.46, which shows the challenging nature of the task, and should act as a basic baseline for future research on our dataset.
Original language: English
Title of host publication: Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools
Publisher: European Language Resources Association (ELRA)
Pages: 32-39
Number of pages: 8
ISBN (Electronic): 979-10-95546-51-1
Publication status: Published - 12 May 2020
Event: The 4th Workshop on Open-Source Arabic Corpora and Processing Tools - Marseille, France
Duration: 12 May 2020 to 12 May 2020
http://edinburghnlp.inf.ed.ac.uk/workshops/OSACT4/

Workshop

Workshop: The 4th Workshop on Open-Source Arabic Corpora and Processing Tools
Abbreviated title: OSACT4
Country/Territory: France
City: Marseille
Period: 12/05/20 to 12/05/20