TraNCE: Transforming Nested Collections Efficiently

Jaclyn Smith, Michael Benedikt, Brandon Moore, Milos Nikolic

Research output: Contribution to journal › Article › peer-review

Abstract

Nested relational query languages have long been seen as an attractive tool for scenarios involving large hierarchical datasets. In recent years, there has been a resurgence of interest in nested relational languages. One driver has been the affinity of these languages for large-scale processing platforms such as Spark and Flink.

This demonstration gives a tour of TraNCE, a new system for processing nested data on top of distributed processing systems. The core innovation of the system is a compiler that processes nested relational queries in a series of transformations; these include variants of two prior techniques, shredding and unnesting, as well as a materialization transformation that customizes the way levels of the nested output are generated. The TraNCE platform builds on these techniques by adding components for users to create and visualize queries, as well as data exploration and notebook execution targets to facilitate the construction of large-scale data science applications. The demonstration will both showcase the system from the viewpoint of usability by data scientists and illustrate the data management techniques employed.
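As a rough illustration of the two prior techniques named above, the sketch below applies a simplified form of shredding and unnesting to a tiny in-memory nested collection. It is not TraNCE's compiler or API: the function names `shred` and `unnest`, the `customers` data, and the integer labels are all assumptions made for this example, and the abstract does not specify how TraNCE's variants differ from this textbook-style version.

```python
from itertools import count


def shred(customers):
    """Shredding (simplified): replace each inner collection with a label and
    store the inner collections in a separate dictionary keyed by that label."""
    labels = count()  # hypothetical label generator: 0, 1, 2, ...
    top, orders_dict = [], {}
    for c in customers:
        label = next(labels)
        top.append({"name": c["name"], "orders": label})
        orders_dict[label] = list(c["orders"])
    return top, orders_dict


def unnest(customers):
    """Unnesting (simplified): emit one flat row per (customer, order) pair.
    A customer with no orders keeps a single row with item=None (outer unnest)."""
    rows = []
    for c in customers:
        if not c["orders"]:
            rows.append({"name": c["name"], "item": None})
        for o in c["orders"]:
            rows.append({"name": c["name"], "item": o["item"]})
    return rows


# Illustrative nested input: each customer carries a nested bag of orders.
customers = [
    {"name": "ann", "orders": [{"item": "pen"}, {"item": "ink"}]},
    {"name": "bob", "orders": []},
]

top, orders_dict = shred(customers)  # flat top level + label-keyed dictionary
flat = unnest(customers)             # single flat relation
```

In the shredded form, each level of nesting becomes its own flat collection, which is the shape that a platform such as Spark can partition and process independently; the unnested form trades that separation for a single flat relation.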
Original language: English
Pages (from-to): 2727-2730
Number of pages: 4
Journal: Proceedings of the VLDB Endowment (PVLDB)
Volume: 14
Issue number: 12
Early online date: 16 Aug 2021
Publication status: E-pub ahead of print - 16 Aug 2021
Event: 47th International Conference on Very Large Data Bases - Copenhagen, Denmark
Duration: 16 Aug 2021 to 20 Aug 2021
https://vldb.org/2021/
