ClassStrength: A Multilingual Tool for Tweets Classification

Walid Magdy, Mohamed Eldesouky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

In this paper we present our multilingual tweet classification tool. ClassStrength provides a set of classification models in different languages that classify tweets into 14 general-purpose categories, such as sports, politics, entertainment, and comedy. Our classifier uses a distant-supervision approach for creating training data in any available language on Twitter. The classifier uses a soft-classification scheme, where it generates a likelihood score for a tweet to match each of the 14 categories. The initial version of our tool covers five languages: English, Arabic, French, German, and Russian. More languages are to be covered in future releases. The classification model created for each language is generated from hundreds of thousands of training tweets. Our evaluation of the classifier shows superior accuracy compared to standard manual methods. Our reported accuracy is 84%, based on crowd preferences over a balanced test set of English tweets covering all 14 classes.
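
The soft-classification idea in the abstract, one likelihood score per category rather than a single hard label, can be illustrated with a minimal sketch. This record does not describe the paper's actual model, features, or distant-supervision signal, so everything below is an assumption: a small multinomial Naive Bayes classifier, invented toy tweets, and only the four category names that the abstract mentions. In a distant-supervision setting the labels would come from some automatic signal rather than manual annotation; here they are simply hard-coded.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: only four of the 14 categories are named in the
# abstract, and these example tweets are invented for illustration.
TRAIN = [
    ("sports", "great goal in the match tonight"),
    ("sports", "the team won the final match"),
    ("politics", "parliament votes on the new election law"),
    ("politics", "the minister won the election"),
    ("entertainment", "new movie trailer released tonight"),
    ("comedy", "this joke made me laugh so hard"),
]


def train(examples):
    """Collect per-category word counts for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for label, text in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def soft_classify(text, model):
    """Return one score per category; the scores sum to 1."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for token in text.lower().split():
            # Laplace smoothing so an unseen word does not zero out a category.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise in a numerically stable way (subtract the maximum first).
    peak = max(log_scores.values())
    exp_scores = {k: math.exp(v - peak) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


model = train(TRAIN)
scores = soft_classify("who won the match", model)
for label, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {value:.3f}")
```

A caller that needs a single label can take the highest-scoring category, while keeping the full score dictionary allows thresholding or assigning a tweet to more than one category.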
Original language: English
Title of host publication: ASONAM 2017 Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017
Publisher: ACM
Pages: 593-596
Number of pages: 4
ISBN (Electronic): 978-1-4503-4993-2
Publication status: Published - 31 Jul 2017
Event: 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017 - Sydney, Australia
Duration: 31 Jul 2017 to 3 Aug 2017
http://asonam.cpsc.ucalgary.ca/2017/

Conference

Conference: 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017
Abbreviated title: ASONAM 2017
Country/Territory: Australia
City: Sydney
Period: 31/07/17 to 3/08/17
