TY - JOUR
T1 - Interaction between path and type constraints in semistructured data
AU - Buneman, Peter
AU - Fan, Wenfei
AU - Weinstein, Scott
PY - 2003/10/1
Y1 - 2003/10/1
N2 - Path constraints are capable of expressing inclusion and inverse relationships and have proved useful in modeling and querying semistructured data [Abiteboul and Vianu 1999; Buneman et al. 2000]. Types also constrain the structure of data and are commonly found in traditional databases. There has also been work on imposing structure or a type system on semistructured data for storing and querying semistructured data in a traditional database system [Alon et al. 2001; Deutsch et al. 1999a; Florescu and Kossmann 1999; Shanmugasundaram et al. 1999]. One wants to know whether complexity results for reasoning about path constraints established in the untyped (semistructured) context could carry over to traditional databases, and vice versa. It is therefore appropriate to understand the interaction between types and path constraints. In addition, XML [Bray et al. 1998], which may involve both an optional schema (e.g., DTDs or XML Schema [Thompson et al. 2001]) and integrity constraints, highlights the importance of the study of the interaction.This article investigates that interaction. In particular it studies constraint implication problems, which are important both in understanding the semantics of type/constraint systems and in query optimization. It shows that path constraints interact with types in a highly intricate way. For that purpose a number of results on path constraint implication are established in the presence and absence of type systems. These results demonstrate that adding a type system may in some cases simplify reasoning about path constraints and in other cases make it harder. For example, it is shown that there is a path constraint implication problem that is decidable in PTIME in the untyped context, but that becomes undecidable when a type system is added. On the other hand, there is an implication problem that is undecidable in the untyped context, but becomes not only decidable in cubic time but also finitely axiomatizable when a type system is imposed.
AB - Path constraints are capable of expressing inclusion and inverse relationships and have proved useful in modeling and querying semistructured data [Abiteboul and Vianu 1999; Buneman et al. 2000]. Types also constrain the structure of data and are commonly found in traditional databases. There has also been work on imposing structure or a type system on semistructured data for storing and querying semistructured data in a traditional database system [Alon et al. 2001; Deutsch et al. 1999a; Florescu and Kossmann 1999; Shanmugasundaram et al. 1999]. One wants to know whether complexity results for reasoning about path constraints established in the untyped (semistructured) context could carry over to traditional databases, and vice versa. It is therefore appropriate to understand the interaction between types and path constraints. In addition, XML [Bray et al. 1998], which may involve both an optional schema (e.g., DTDs or XML Schema [Thompson et al. 2001]) and integrity constraints, highlights the importance of the study of the interaction.This article investigates that interaction. In particular it studies constraint implication problems, which are important both in understanding the semantics of type/constraint systems and in query optimization. It shows that path constraints interact with types in a highly intricate way. For that purpose a number of results on path constraint implication are established in the presence and absence of type systems. These results demonstrate that adding a type system may in some cases simplify reasoning about path constraints and in other cases make it harder. For example, it is shown that there is a path constraint implication problem that is decidable in PTIME in the untyped context, but that becomes undecidable when a type system is added. On the other hand, there is an implication problem that is undecidable in the untyped context, but becomes not only decidable in cubic time but also finitely axiomatizable when a type system is imposed.
U2 - 10.1145/937555.937560
DO - 10.1145/937555.937560
M3 - Article
VL - 4
SP - 530
EP - 577
JO - ACM Transactions on Computational Logic
JF - ACM Transactions on Computational Logic
SN - 1529-3785
IS - 4
ER -