Heterogeneous and Permanent data

Project Details

Key findings

Data currency: Much real-world data is out of date. The challenge here is to identify the current values of real-world entities, and to answer queries using those current values. Fan has developed a data currency model that does not depend on the availability of reliable timestamps, as well as practical techniques to deduce the true (consistent and current) values of entities. This is a new topic that is of interest to data fusion, data integration and data quality.
Graph queries for social network analysis: This important area is plagued by the computational complexity of answering queries on very large graphs. Fan and Libkin have developed new graph pattern languages for querying social data, as well as effective incremental and distributed evaluation algorithms for coping with the sheer size of social graphs.
In addition, Fan has worked on relative information completeness, extended his work on conditional dependencies for data cleaning, and investigated the complexity of recommendation systems.
Libkin has developed theories of incomplete information for XML and for graph databases.
Viglas has developed new algorithms for indexing in flash memory. The asymmetry between read and write costs in flash I/O requires conventional data structures to be revisited.
Buneman has studied the issues of provenance and annotation in linked data. Initially we believed that models developed for relational databases would extend immediately to RDF, but this turns out not to be the case, and entirely new models need to be developed. In addition, he has studied the interaction between workflow and data provenance. He has also started to study methods for generating citations for data.
Effective start/end date: 1/03/08 to 29/02/12


  • EPSRC: £1,515,335.00

