Abstract
We introduce a crowd-powered approach for the creation of a lexicon for any theme given a set of seed words that cover a variety of concepts within the theme. Terms are initially sorted by automatically clustering their embeddings and subsequently rearranged by crowd workers in order to create a tree structure. This type of organization captures hierarchical relationships between concepts and allows for a tunable level of specificity when using the lexicon to collect measurements from a piece of text. We use a lexicon expansion method to increase the overall coverage of the produced resource. Using our proposed approach, we create a hierarchical lexicon of personal values and evaluate its internal and external consistency. We release this novel resource to the community as a tool for measuring value content within text corpora.
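The "tunable level of specificity" mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation or their released lexicon: the category names, the terms, and the helper functions `category_at_depth` and `measure` are all hypothetical, and whitespace tokenization is assumed for simplicity. Each lexicon term is attached to a node in a concept tree, and a match is counted under that node's ancestor at the chosen depth.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent); the root has parent None.
PARENT = {
    "values": None,
    "family": "values",
    "work": "values",
    "parents": "family",
    "siblings": "family",
    "career": "work",
}

# Hypothetical lexicon terms, each attached to one node of the tree.
TERMS = {
    "mother": "parents",
    "father": "parents",
    "brother": "siblings",
    "home": "family",
    "job": "career",
    "promotion": "career",
}


def path_from_root(node):
    """Return the list of nodes from the root down to `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def category_at_depth(node, depth):
    """Return the ancestor of `node` at `depth` (root is depth 0).

    If `node` sits above the requested depth, the node itself is returned.
    """
    path = path_from_root(node)
    return path[min(depth, len(path) - 1)]


def measure(text, depth):
    """Count lexicon matches in `text`, aggregated at the given tree depth."""
    tokens = text.lower().split()
    return Counter(
        category_at_depth(TERMS[tok], depth) for tok in tokens if tok in TERMS
    )


sentence = "My mother and father helped my brother find a job"
coarse = measure(sentence, depth=1)  # broad categories
fine = measure(sentence, depth=2)    # more specific categories
```

In this toy setup, a lower depth merges matches into broader categories, while a higher depth keeps the finer distinctions that the tree encodes.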
Original language | English |
---|---|
Title of host publication | Social Informatics |
Editors | Steffen Staab, Olessia Koltsova, Dmitry I. Ignatov |
Place of Publication | Cham |
Publisher | Springer |
Pages | 455-470 |
Number of pages | 16 |
ISBN (Electronic) | 978-3-030-01129-1 |
ISBN (Print) | 978-3-030-01128-4 |
Publication status | Published - 20 Sept 2018 |
Event | 10th International Conference on Social Informatics 2018, Saint Petersburg, Russian Federation (25 Sept 2018 → 28 Sept 2018), https://socinfo2018.hse.ru/ |
Publication series
Name | Lecture Notes in Computer Science (LNCS) |
---|---|
Publisher | Springer, Cham |
Volume | 11185 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 10th International Conference on Social Informatics 2018 |
---|---|
Abbreviated title | SocInfo 2018 |
Country/Territory | Russian Federation |
City | Saint Petersburg |
Period | 25/09/18 → 28/09/18 |
Keywords
- Lexicon induction
- Crowd sourcing
- Personal values