WikiCatSum is a domain-specific Multi-Document Summarisation (MDS) dataset. The task is to generate the Wikipedia lead section for an entity in a given domain (e.g. Companies) from a set of input documents: those cited in the entity's Wikipedia article or returned by Google when the article title is used as the query. The dataset covers three domains: Companies, Films, and Animals.

Data Citation

Perez-Beltrachini, Laura; Liu, Yang; Lapata, Mirella. (2019). WikiCatSum, [text]. https://doi.org/10.7488/ds/2582.
Date made available: 28 Jun 2019
Publisher: Edinburgh DataShare
