Abstract
Wikidata is the only general-purpose open knowledge graph that allows references to be specified for every single statement. Currently, about 68% of Wikidata statements have at least one reference, but the quality of these references is rarely covered in data quality studies, and a comprehensive framework for evaluating references is lacking. In this paper, we investigate the statistics of Wikidata references in six topical subsets of Wikidata and compare these statistics across two Wikidata dumps, one from 2016 and one from 2021.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd Wikidata Workshop (Wikidata 2021) |
Editors | Lucie-Aimée Kaffee, Simon Razniewski, Aidan Hogan |
Publisher | CEUR-WS.org |
Number of pages | 14 |
Publication status | Published - 24 Oct 2021 |
Event | 2nd Wikidata Workshop, Virtual, Online. Duration: 24 Oct 2021 → 24 Oct 2021. https://wikidataworkshop.github.io/2021/ |
Conference
Conference | 2nd Wikidata Workshop |
---|---|
Abbreviated title | Wikidata 2021 |
City | Virtual, Online |
Period | 24/10/21 → 24/10/21 |
Internet address | https://wikidataworkshop.github.io/2021/ |
Keywords
- Data quality
- Gene Wiki
- Reference quality
- Topical subset
- Wikidata
- WikiProject