"How Old Do You Think I Am?" A Study of Language and Age in Twitter

Dong Nguyen, Rilana Gravel, Dolf Trieschnigg, Theo Meder

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Multilingual speakers switch between languages in online and spoken communication. Analyses of large-scale multilingual data require automatic language identification at the word level. For our experiments with multilingual online discussions, we first tag the language of individual words using language models and dictionaries. We then incorporate context to improve performance, achieving an accuracy of 98%. Besides word-level accuracy, we use two new metrics to evaluate this task.
Original language: English
Title of host publication: Proceedings of the Seventh International Conference on Weblogs and Social Media, ICWSM 2013, Cambridge, Massachusetts, USA, July 8-11, 2013.
Publisher: The AAAI Press
Pages: 857–862
Number of pages: 6
Publication status: Published - 28 Jun 2013
