Abstract
Speech acts are the types of communicative acts performed within a conversation. Speech-act recognition (also called speech-act classification) has been an active research area in recent years. However, much less attention has been directed towards this task in Arabic, due to the lack of resources for training an Arabic speech-act classifier. In this paper we present ArSAS, an Arabic corpus of tweets annotated for the tasks of speech-act recognition and sentiment analysis. A large set of 21k Arabic tweets covering multiple topics was collected, prepared and annotated for six different classes of speech-act labels, such as expression, assertion, and question. In addition, the same set of tweets was annotated with four classes of sentiment. We aim for this corpus to promote research in both speech-act recognition and sentiment analysis for the Arabic language.
Original language | English |
---|---|
Title of host publication | Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) |
Place of Publication | Paris, France |
Publisher | European Language Resources Association (ELRA) |
Number of pages | 6 |
ISBN (Electronic) | 979-10-95546-25-2 |
Publication status | E-pub ahead of print - 12 May 2018 |
Event | The 3rd Workshop on Open-Source Arabic Corpora and Processing Tools, Miyazaki, Japan. Duration: 8 May 2018 → … (http://edinburghnlp.inf.ed.ac.uk/workshops/OSACT3/) |
Conference
Conference | The 3rd Workshop on Open-Source Arabic Corpora and Processing Tools |
---|---|
Abbreviated title | OSACT3 |
Country/Territory | Japan |
City | Miyazaki |
Period | 8/05/18 → … |
Internet address | http://edinburghnlp.inf.ed.ac.uk/workshops/OSACT3/ |