Abstract / Description of output
Existing corpora for intrinsic evaluation are not targeted towards tasks in informal domains such as Twitter or news comment forums. We want to test whether a representation of informal words fulfills the promise of eliding explicit text normalization as a preprocessing step. One possible evaluation metric for such domains is the proximity of spelling variants. We propose how such a metric might be computed and how a spelling variant dataset can be collected using UrbanDictionary.
Original language | English |
---|---|
Title of host publication | Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP |
Publisher | Association for Computational Linguistics |
Pages | 94-98 |
Number of pages | 5 |
ISBN (Electronic) | 978-1-945626-14-2 |
Publication status | Published - 12 Aug 2016 |
Event | 1st Workshop on Evaluating Vector Space Representations for NLP - Berlin, Germany. Duration: 12 Aug 2016 → 12 Aug 2016. https://sites.google.com/site/repevalacl16/home |
Conference
Conference | 1st Workshop on Evaluating Vector Space Representations for NLP |
---|---|
Abbreviated title | RepEval 2016 |
Country/Territory | Germany |
City | Berlin |
Period | 12 Aug 2016 → 12 Aug 2016 |
Internet address | https://sites.google.com/site/repevalacl16/home |