Abstract
Nested relational query languages have long been seen as an attractive tool for scenarios involving large hierarchical datasets. In recent years, there has been a resurgence of interest in nested relational languages. One driver has been the affinity of these languages for large-scale processing platforms such as Spark and Flink.
This demonstration gives a tour of TraNCE, a new system for processing nested data on top of distributed processing systems. The core innovation of the system is a compiler that processes nested relational queries in a series of transformations; these include variants of two prior techniques, shredding and unnesting, as well as a materialization transformation that customizes the way levels of the nested output are generated. The TraNCE platform builds on these techniques by adding components for users to create and visualize queries, as well as data exploration and notebook execution targets to facilitate the construction of large-scale data science applications. The demonstration will both showcase the system from the viewpoint of usability by data scientists and illustrate the data management techniques employed.
Original language | English |
---|---|
Pages (from-to) | 2727-2730 |
Number of pages | 4 |
Journal | Proceedings of the VLDB Endowment (PVLDB) |
Volume | 14 |
Issue number | 12 |
Early online date | 16 Aug 2021 |
Publication status | Published - 28 Oct 2021 |
Event | 47th International Conference on Very Large Data Bases, Copenhagen, Denmark. Duration: 16 Aug 2021 → 20 Aug 2021. https://vldb.org/2021/ |