Abstract / Description of output
Over the past years, Twitter has earned a growing reputation as a hub for communication, and events advertisement and tracking. However, several recent research studies have shown that Twitter users (and microblogging platforms' users in general) are increasingly posting microblogs containing questions seeking answers from their readers. To help those users answer or route their questions, the problem of question identification in tweets has been studied over English tweets; up to our knowledge, no study has attempted it over Arabic (not to mention dialectal Arabic) tweets.
In this paper, we tackle the problem of identifying answer-seeking questions in different dialects over a large collection of Arabic tweets. Our approach is 2-stage. We first used a rule-based filter to extract tweets with interrogative questions. We then leverage a binary classifier (trained using a carefully-developed set of features) to detect tweets with answer-seeking questions. In evaluating the classifier, we used a set of randomly-sampled dialectal Arabic tweets that were labeled using crowdsourcing. Our approach achieved a relatively-good performance as a first study of that problem on the Arabic domain, exhibiting 64% recall with 80% precision in identifying tweets with answer-seeking questions.
In this paper, we tackle the problem of identifying answer-seeking questions in different dialects over a large collection of Arabic tweets. Our approach is 2-stage. We first used a rule-based filter to extract tweets with interrogative questions. We then leverage a binary classifier (trained using a carefully-developed set of features) to detect tweets with answer-seeking questions. In evaluating the classifier, we used a set of randomly-sampled dialectal Arabic tweets that were labeled using crowdsourcing. Our approach achieved a relatively-good performance as a first study of that problem on the Arabic domain, exhibiting 64% recall with 80% precision in identifying tweets with answer-seeking questions.
Original language | English |
---|---|
Title of host publication | Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management |
Place of Publication | New York, NY, USA |
Publisher | ACM |
Pages | 1839-1842 |
Number of pages | 4 |
ISBN (Print) | 978-1-4503-2598-1 |
DOIs | |
Publication status | Published - Nov 2014 |
Publication series
Name | CIKM '14 |
---|---|
Publisher | ACM |