Reference statistics in Wikidata topical subsets

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Wikidata is the only general-purpose open knowledge graph that allows references to be specified for every single statement. Currently, about 68% of Wikidata statements have at least one reference, but the quality of these references is rarely covered in data quality studies. A comprehensive framework for evaluating references is also lacking. In this paper, we investigate the statistics of Wikidata references in six topical subsets of Wikidata. We compare these statistics across two Wikidata dumps, one from 2016 and one from 2021.
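
As an illustration of the "at least one reference" statistic mentioned above, the following is a minimal sketch, not the authors' method. It assumes entities in the Wikidata JSON layout, where `claims` maps each property ID to a list of statements and a referenced statement carries a non-empty `references` list. The function name `reference_coverage` and the toy `sample` data are hypothetical.

```python
from typing import Iterable


def reference_coverage(entities: Iterable[dict]) -> float:
    """Return the fraction of statements that carry at least one reference."""
    total = 0
    referenced = 0
    for entity in entities:
        # "claims" maps a property ID (e.g. "P31") to a list of statements.
        for statements in entity.get("claims", {}).values():
            for statement in statements:
                total += 1
                # An unreferenced statement has no "references" key or an empty list.
                if statement.get("references"):
                    referenced += 1
    return referenced / total if total else 0.0


# Hand-made toy entity (hypothetical data, not taken from a real dump):
# three statements, of which only the first has a reference.
sample = [
    {
        "id": "Q1",
        "claims": {
            "P31": [{"references": [{"snaks": {}}]}],
            "P279": [{}, {"references": []}],
        },
    }
]

coverage = reference_coverage(sample)
print(f"{coverage:.2%} of statements have at least one reference")
```

Running the same function over the entities of a topical subset in two different dumps would give two comparable coverage figures, under the assumption that both dumps use this JSON layout.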

Original language: English
Title of host publication: Proceedings of the 2nd Wikidata Workshop (Wikidata 2021)
Editors: Lucie-Aimée Kaffee, Simon Razniewski, Aidan Hogan
Number of pages: 14
Publication status: Published - 24 Oct 2021
Event: 2nd Wikidata Workshop - Virtual, Online
Duration: 24 Oct 2021 → 24 Oct 2021


Conference: 2nd Wikidata Workshop
Abbreviated title: Wikidata 2021
City: Virtual, Online


  • Data quality
  • Gene Wiki
  • Reference quality
  • Topical subset
  • Wikidata
  • WikiProject


