Abstract / Description of output
Data in real-life databases become obsolete rapidly. One often finds that multiple values of the same entity reside in a database. While all of these values were once correct, most of them may have become stale and inaccurate. Worse still, the values often do not carry reliable timestamps. With this comes the need for studying data currency, to identify the current value of an entity in a database and to answer queries with the current values, in the absence of timestamps.
This paper investigates the currency of data. (1) We propose a model that specifies partial currency orders in terms of simple constraints. The model also allows us to express what values are copied from other data sources, bearing currency orders in those sources, in terms of copy functions defined on correlated attributes. (2) We study fundamental problems for data currency: determining whether a specification is consistent, whether a value is more current than another, and whether a query answer is certain no matter how partial currency orders are completed. (3) Moreover, we identify several problems associated with copy functions: deciding whether a copy function imports sufficient current data to answer a query, whether such a function copies redundant data, whether a copy function can be extended to import necessary current data for a query while respecting the constraints, and whether it suffices to copy data of a bounded size. (4) We establish upper and lower bounds for these problems, all matching, for combined complexity and data complexity, and for a variety of query languages. We also identify special cases that warrant lower complexity.
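As a rough illustration of the questions in (2), the sketch below treats a partial currency order as a set of pairs over the tuples of one entity and enumerates every total order that extends it. It is a brute-force toy under stated assumptions, not the paper's model or algorithms: it ignores currency constraints beyond the given pairs and ignores copy functions, and the names `completions`, `is_consistent`, `certainly_more_current`, and `certain_current_value`, as well as the sample tuples, are hypothetical.

```python
from itertools import permutations

def completions(tuples, order):
    """Yield each total order of `tuples` (least current first) extending `order`.

    `order` is a set of pairs (a, b) meaning "a is less current than b".
    Brute force: exponential in len(tuples), for illustration only.
    """
    for perm in permutations(tuples):
        pos = {t: i for i, t in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in order):
            yield perm

def is_consistent(tuples, order):
    """A specification is consistent here if at least one completion exists."""
    return next(completions(tuples, order), None) is not None

def certainly_more_current(tuples, order, older, newer):
    """True if `newer` follows `older` in every completion (and one exists)."""
    comps = list(completions(tuples, order))
    return bool(comps) and all(c.index(older) < c.index(newer) for c in comps)

def certain_current_value(tuples, order, value_of):
    """Return the value shared by the most current tuple of every completion,
    or None if completions disagree or none exists."""
    values = {value_of[c[-1]] for c in completions(tuples, order)}
    return values.pop() if len(values) == 1 else None

# Hypothetical data: three undated records of one person's marital status.
records = ["t1", "t2", "t3"]
status = {"t1": "single", "t2": "married", "t3": "married"}
known_order = {("t1", "t2")}  # assumed known: t1 is older than t2

print(is_consistent(records, known_order))
print(certainly_more_current(records, known_order, "t1", "t3"))
print(certain_current_value(records, known_order, status))
```

In this toy instance the position of `t3` is unknown, so `t1` is not certainly older than `t3`, yet every completion ends in a tuple whose status is "married", so the current value is certain even without timestamps.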
Original language | English |
---|---|
Title of host publication | PODS '11 Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems |
Place of Publication | New York, NY, USA |
Publisher | ACM |
Pages | 71-82 |
Number of pages | 12 |
ISBN (Print) | 978-1-4503-0660-7 |
DOIs | |
Publication status | Published - 2011 |
Event | 2011 ACM SIGMOD/PODS Conference - Athens, Greece. Duration: 12 Jun 2011 → 16 Jun 2011 |
Conference
Conference | 2011 ACM SIGMOD/PODS Conference |
---|---|
Country/Territory | Greece |
City | Athens |
Period | 12/06/11 → 16/06/11 |
Keywords / Materials (for Non-textual outputs)
- consistency, currency, data quality
Fingerprint
Dive into the research topics of 'Determining the currency of data'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Heterogeneous and Permanent data
  Buneman, P., Fan, W., Libkin, L. & Viglas, S.
  1/03/08 → 29/02/12
  Project: Research