Composable XML integration grammars

Wenfei Fan, Minos N. Garofalakis, Ming Xiong, Xibei Jia

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


The proliferation of XML as a standard for data representation and exchange in diverse, next-generation Web applications has created an emphatic need for effective XML data-integration tools. In several real-life scenarios, such XML data integration needs to be DTD-directed; in other words, the target, integrated XML database must conform to a prespecified, user- or application-defined DTD. In this paper, we propose a novel formalism, XML Integration Grammars (XIGs), for specifying DTD-directed integration of XML data. Abstractly, an XIG maps data from multiple XML sources to a target XML document that conforms to a predefined DTD. An XIG extracts source XML data via queries expressed in a fragment of XQuery, and controls target document generation with tree-valued attributes and the target DTD. The novelty of XIGs lies not only in their automatic support for DTD conformance but also in their composability: an XIG may embed local and remote XIGs in its definition, and invoke these XIGs during its evaluation. This yields an important modularity property for XIGs that allows one to divide a complex integration task into manageable sub-tasks and conquer each of them separately. To evaluate XIGs efficiently, we provide algorithms for merging XML queries in an XIG and for scheduling queries and embedded XIGs. These lead to an effective framework, as well as a design tool for XQuery, for specifying and computing complex, DTD-directed XML integration.
Original language: English
Title of host publication: Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management, Washington, DC, USA, November 8-13, 2004
Number of pages: 10
ISBN (Electronic): 1-58113-874-1
Publication status: Published - 2004
