Abstract / Description of output
A fundamental concern of information integration in an XML context is the ability to embed one or more source documents in a target document so that (a) the target document conforms to a target schema and (b) the information in the source document(s) is preserved. In this paper, information preservation for XML is formally studied, and the results of this study guide the definition of a novel notion of schema embedding between two XML DTD schemas represented as graphs. Schema embedding generalizes the conventional notion of graph similarity by allowing an edge in a source DTD schema to be mapped to a path in the target DTD. Instance-level embeddings can be defined from the schema embedding in a straightforward manner, such that conformance to a target schema and information preservation are guaranteed. We show that it is NP-complete to find an embedding between two DTD schemas. We also provide efficient heuristic algorithms to find candidate embeddings, along with experimental results to evaluate and compare the algorithms. These yield the first systematic and effective approach to finding information preserving XML mappings.
Original language | English |
---|---|
Title of host publication | Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30 - September 2, 2005 |
Pages | 85-96 |
Number of pages | 12 |
Publication status | Published - 2005 |
Title of output | Information Preserving XML Schema Embedding |