Union Types for Semistructured Data

Peter Buneman, Benjamin Pierce

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

Semistructured databases are treated as dynamically typed: they come equipped with no independent schema or type system to constrain the data. Query languages that are designed for semistructured data, even when used with structured data, typically ignore any type information that may be present. The consequences of this are what one would expect from using a dynamic type system with complex data: fewer guarantees on the correctness of applications. For example, a query that would cause a type error in a statically typed query language will return the empty set when applied to a semistructured representation of the same data. Much semistructured data originates in structured data. A semistructured representation is useful when one wants to add data that does not conform to the original type or when one wants to combine sources of different types. However, the deviations from the prescribed types are often minor, and we believe that a better strategy than throwing away all type information is to preserve as much of it as possible. We describe a system of untagged union types that can accommodate variations in structure while still allowing a degree of static type checking. A novelty of this system is that it involves non-trivial equivalences among types, arising from a law of distributivity for records and unions: a value may be introduced with one type (e.g., a record containing a union) and used at another type (a union of records). We describe programming and query language constructs for dealing with such types, prove the soundness of the type system, and develop algorithms for subtyping and typechecking.
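
The distributivity law mentioned in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: the function name `distribute`, the use of strings as type names, and the `person` example are all assumptions made for illustration. A record whose fields are unions is expanded into the equivalent union of plain records, one for each combination of alternatives.

```python
from itertools import product


def distribute(record_of_unions):
    """Expand a record whose fields are unions into a union of plain records.

    `record_of_unions` maps each field name to a list of alternative type
    names (hypothetical encoding). The result is a list of records, one per
    combination of alternatives, read as a union of record types.
    """
    fields = list(record_of_unions)
    alternatives = [record_of_unions[field] for field in fields]
    # The Cartesian product enumerates every way of picking one
    # alternative per field.
    return [dict(zip(fields, combo)) for combo in product(*alternatives)]


# Hypothetical record type: name is a string, age is int-or-string.
person = {"name": ["string"], "age": ["int", "string"]}

for variant in distribute(person):
    print(variant)
```

Under this encoding, a record with one two-way union yields two record variants, and the number of variants grows multiplicatively with each additional union-typed field.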
Original language: English
Title of host publication: Union Types for Semistructured Data
Subtitle of host publication: 7th International Workshop on Database Programming Languages, DBPL’99, Kinloch Rannoch, UK, September 1–3, 1999, Revised Papers
Publisher: Springer-Verlag GmbH
Number of pages: 24
ISBN (Electronic): 978-3-540-44543-2
ISBN (Print): 978-3-540-41481-0
Publication status: Published - 1 Sept 1999

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Berlin / Heidelberg
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

