A Framework for Evaluating Snippet Generation for Dataset Search

Xiaxia Wang, Jinchi Chen, Shuxin Li, Gong Cheng, Jeff Z. Pan, Evgeny Kharlamov, Yuzhong Qu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Reusing existing datasets is of considerable significance to researchers and developers. Dataset search engines help a user find relevant datasets for reuse. They can present a snippet for each retrieved dataset to explain its relevance to the user’s data needs. This emerging problem of snippet generation for dataset search has not received much research attention. To provide a basis for future research, we introduce a framework for quantitatively evaluating the quality of a dataset snippet. The proposed metrics assess the extent to which a snippet matches the query intent and covers the main content of the dataset. To establish a baseline, we adapt four state-of-the-art methods from related fields to our problem, and perform an empirical evaluation based on real-world datasets and queries. We also conduct a user study to verify our findings. The results demonstrate the effectiveness of our evaluation framework, and suggest directions for future research.
Original language: English
Title of host publication: The Semantic Web -- ISWC 2019
Subtitle of host publication: 18th International Semantic Web Conference, Auckland, New Zealand, October 26–30, 2019, Proceedings, Part I
Editors: Chiara Ghidini, Olaf Hartig, Maria Maleshkova, Vojtech Svátek, Isabel Cruz, Aidan Hogan, Jie Song, Maxime Lefrançois, Fabien Gandon
Place of Publication: Cham
Publisher: Springer
Pages: 680-697
Number of pages: 18
ISBN (Electronic): 978-3-030-30793-6
ISBN (Print): 978-3-030-30792-9
Publication status: Published - 17 Oct 2019
Event: 18th International Semantic Web Conference - Auckland, New Zealand
Duration: 26 Oct 2019 → 30 Oct 2019

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer, Cham
Volume: 11778
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Semantic Web Conference
Abbreviated title: ISWC 2019
Country/Territory: New Zealand
City: Auckland
Period: 26/10/19 → 30/10/19

Keywords / Materials (for Non-textual outputs)

  • Snippet generation
  • Dataset search
  • Evaluation metric
