Identification of Answer-Seeking Questions in Arabic Microblogs

Maram Hasanain, Tamer Elsayed, Walid Magdy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Over the past years, Twitter has earned a growing reputation as a hub for communication and for advertising and tracking events. However, several recent studies have shown that users of Twitter (and of microblogging platforms in general) increasingly post microblogs containing questions that seek answers from their readers. To help answer or route those users' questions, the problem of question identification has been studied over English tweets; to the best of our knowledge, no study has attempted it over Arabic (let alone dialectal Arabic) tweets.

In this paper, we tackle the problem of identifying answer-seeking questions in different dialects over a large collection of Arabic tweets. Our approach has two stages. We first use a rule-based filter to extract tweets with interrogative questions. We then apply a binary classifier (trained using a carefully developed set of features) to detect tweets with answer-seeking questions. To evaluate the classifier, we used a set of randomly sampled dialectal Arabic tweets that were labeled using crowdsourcing. Our approach achieved relatively good performance for a first study of this problem on Arabic, exhibiting 64% recall with 80% precision in identifying tweets with answer-seeking questions.
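
The two-stage design described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the cue-word list, the function names, and the stand-in `toy_classifier` are hypothetical, and the abstract does not specify the actual filter rules, features, or classifier. The last helper simply combines the reported precision and recall into an F1 score, a figure the abstract itself does not state.

```python
from typing import Callable, Iterable, List

# Hypothetical cues; the actual filter rules are not given in the abstract.
QUESTION_MARKS = ("?", "؟")
INTERROGATIVES = ("هل", "ماذا", "كيف", "متى", "أين", "وين", "شو", "ليش")


def has_interrogative(tweet: str) -> bool:
    """Stage 1 (rule-based filter): keep tweets that look like questions."""
    if any(mark in tweet for mark in QUESTION_MARKS):
        return True
    return any(token in INTERROGATIVES for token in tweet.split())


def identify_answer_seeking(
    tweets: Iterable[str], classifier: Callable[[str], bool]
) -> List[str]:
    """Run stage 1, then pass the surviving tweets to a binary classifier (stage 2)."""
    candidates = [t for t in tweets if has_interrogative(t)]
    return [t for t in candidates if classifier(t)]


def toy_classifier(tweet: str) -> bool:
    """Placeholder for the trained classifier: rejects tweets containing a link.

    This rule is purely illustrative and is not a feature taken from the paper.
    """
    return "http" not in tweet


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sample = ["كيف أحدث الجهاز؟", "صباح الخير", "هل تعلم؟ http://example.com"]
    print(identify_answer_seeking(sample, toy_classifier))
    print(round(f1_score(0.80, 0.64), 3))
```

With the reported 80% precision and 64% recall, `f1_score` gives roughly 0.71. In practice the placeholder would be replaced by a classifier trained on the crowdsourced labels.
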
Original language: English
Title of host publication: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management
Place of Publication: New York, NY, USA
Publisher: ACM
Pages: 1839-1842
Number of pages: 4
ISBN (Print): 978-1-4503-2598-1
Publication status: Published - Nov 2014

Publication series

Name: CIKM '14
Publisher: ACM
