Building and Validating Hierarchical Lexicons with a Case Study on Personal Values

Steven Wilson, Yiting Shen, Rada Mihalcea

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


We introduce a crowd-powered approach for the creation of a lexicon for any theme given a set of seed words that cover a variety of concepts within the theme. Terms are initially sorted by automatically clustering their embeddings and subsequently rearranged by crowd workers in order to create a tree structure. This type of organization captures hierarchical relationships between concepts and allows for a tunable level of specificity when using the lexicon to collect measurements from a piece of text. We use a lexicon expansion method to increase the overall coverage of the produced resource. Using our proposed approach, we create a hierarchical lexicon of personal values and evaluate its internal and external consistency. We release this novel resource to the community as a tool for measuring value content within text corpora.
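The "tunable level of specificity" mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the lexicon is stored as a tree in which word lists are attached to nodes, and that a match is counted under the ancestor category at a chosen depth. The category names, word lists, and the function names `path_from_root` and `score` are hypothetical and are not taken from the released resource; the scoring used in the paper may differ.

```python
from collections import Counter

# Hypothetical toy hierarchy: child -> parent (None marks the root).
PARENT = {
    "values": None,
    "relationships": "values",
    "family": "relationships",
    "friends": "relationships",
    "achievement": "values",
    "career": "achievement",
}

# Hypothetical word lists attached to nodes of the tree.
WORDS = {
    "family": {"mother", "father", "kids"},
    "friends": {"friend", "buddy"},
    "career": {"job", "promotion"},
}


def path_from_root(node):
    """Return the list of nodes from the root down to `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def score(tokens, depth):
    """Count lexicon hits, rolling each hit up to its ancestor at `depth`.

    depth=0 is the root; larger values give more specific categories.
    Hits on nodes shallower than `depth` are counted at the node itself.
    """
    word_to_node = {w: n for n, ws in WORDS.items() for w in ws}
    counts = Counter()
    for tok in tokens:
        node = word_to_node.get(tok)
        if node is None:
            continue  # token is not in the lexicon
        path = path_from_root(node)
        counts[path[min(depth, len(path) - 1)]] += 1
    return counts


tokens = "my mother got a promotion and my friend got a job".split()
coarse = score(tokens, depth=1)  # broad categories
fine = score(tokens, depth=2)    # more specific categories
```

In this toy setup, the same text yields counts for broad categories at `depth=1` and for narrower ones at `depth=2`, which is one way a tree-structured lexicon could support measurements at different granularities.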
Original language: English
Title of host publication: Social Informatics
Editors: Steffen Staab, Olessia Koltsova, Dmitry I. Ignatov
Place of Publication: Cham
Publisher: Springer International Publishing AG
Number of pages: 16
ISBN (Electronic): 978-3-030-01129-1
ISBN (Print): 978-3-030-01128-4
Publication status: Published - 20 Sep 2018
Event: 10th International Conference on Social Informatics 2018 - Saint Petersburg, Russian Federation
Duration: 25 Sep 2018 → 28 Sep 2018

Publication series

Name: Lecture Notes in Computer Science (LNCS)
Publisher: Springer, Cham
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 10th International Conference on Social Informatics 2018
Abbreviated title: SocInfo 2018
Country/Territory: Russian Federation
City: Saint Petersburg


  • Lexicon induction
  • Crowd sourcing
  • Personal values

