Normalization Theory for XML

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Specifications of XML documents typically consist of typing information (e.g., a DTD) and integrity constraints. Just like relational schema specifications, not all of them are good: some are prone to redundancies and update anomalies. In the relational world we have a well-developed theory of data design (also known as normalization). A few definitions of XML normal forms have been proposed, but the main question is what makes a particular design good. In the XML world we still lack universally accepted analogs of relational algebra or of update languages that would let us reason about storage redundancies, lossless decompositions, and update anomalies. A better approach, therefore, is to define notions of good design based on the intrinsic properties of the data model itself. We present such an approach, based on Shannon's information theory, and show how it applies to relational normal forms as well as to XML design, for both native and relational storage.
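The abstract only sketches the information-theoretic idea, so the following is a minimal, hypothetical Python illustration rather than the paper's formal measure: it treats a stored value as redundant when, given the rest of the instance and its constraints, the Shannon entropy about that value is zero bits. The Course relation, the lecturer domain, and the helper lecturer_entropy are invented purely for illustration.

    # Illustrative sketch only: a simplified, hypothetical rendering of the
    # information-theoretic idea mentioned in the abstract, not the paper's
    # formal measure. We ask: given the rest of the instance and a functional
    # dependency, how much uncertainty (Shannon entropy, in bits) remains about
    # the value stored in one position? Zero bits signals a redundant position,
    # i.e., a symptom of a poor design.

    from math import log2

    # A BCNF-violating relation Course(student, course, lecturer) with the
    # functional dependency course -> lecturer; lecturer values are repeated.
    rows = [
        ("Alice", "DB", "Prof. X"),
        ("Bob",   "DB", "Prof. X"),   # 'Prof. X' is repeated here
        ("Carol", "AI", "Prof. Y"),
    ]

    DOMAIN = {"Prof. X", "Prof. Y", "Prof. Z"}   # assumed finite domain of lecturers

    def lecturer_entropy(rows, hidden_row):
        """Entropy (bits) of the lecturer value in `hidden_row`, given the other
        rows and the constraint course -> lecturer, taking all consistent values
        as equally likely."""
        course = rows[hidden_row][1]
        others = [r for i, r in enumerate(rows) if i != hidden_row]
        forced = {r[2] for r in others if r[1] == course}   # values forced by the FD
        consistent = forced if forced else DOMAIN           # otherwise any value fits
        p = 1 / len(consistent)
        return sum(p * log2(1 / p) for _ in consistent)

    print(lecturer_entropy(rows, hidden_row=1))  # 0.0 bits: fully determined, redundant
    print(lecturer_entropy(rows, hidden_row=2))  # ~1.58 bits (log2 3): carries information

In the paper's setting the analogous question is asked for positions in an XML tree constrained by a DTD and XML functional dependencies, and a design is considered good when every position retains full information content.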
Original language: English
Title of host publication: Database and XML Technologies
Subtitle of host publication: 5th International XML Database Symposium, XSym 2007, Vienna, Austria, September 23-24, 2007, Proceedings
Publisher: Springer
Pages: 1-13
Number of pages: 13
Volume: 4704
ISBN (Electronic): 978-3-540-75288-2
ISBN (Print): 978-3-540-75287-5
DOIs
Publication status: Published - 2007
