The Edinburgh Twitter Corpus

Sasa Petrovic, Miles Osborne, Victor Lavrenko

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

We describe the first release of our corpus of 97 million Twitter posts. We believe that this data will prove valuable to researchers working in social media, natural language processing, large-scale data processing, and similar areas.
Original language: English
Title of host publication: Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics in a World of Social Media
Place of publication: Stroudsburg, PA, USA
Publisher: Association for Computational Linguistics
Pages: 25-26
Number of pages: 2
Publication status: Published - Jun 2010
