Requirements for Provenance on the Web

Paul T. Groth, Yolanda Gil, James Cheney, Simon Miles

Research output: Contribution to journal, Article, peer-review


From where did this tweet originate? Was this quote from the New York Times modified? Daily, we rely on data from the Web, but often it is difficult or impossible to determine where it came from or how it was produced. This lack of provenance is particularly evident when people and systems deal with Web information or with any environment where information comes from sources of varying quality. Provenance is not captured pervasively in information systems. There are major technical, social, and economic impediments that stand in the way of using provenance effectively. This paper synthesizes requirements for provenance on the Web for a number of dimensions, focusing on three key aspects of provenance: the content of provenance, the management of provenance records, and the uses of provenance information. To illustrate these requirements, we use three synthesized scenarios that encompass provenance problems faced by Web users today.
Original language: English
Pages (from-to): 39-56
Number of pages: 18
Journal: International Journal of Digital Curation
Issue number: 1
Publication status: Published - 2012

