Building a compiled query engine in Python

Hesam Shahrokhi, Amir Shaikhha

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The simplicity of Python and its rich set of libraries have made it the most popular language for data science. Moreover, the interpreted nature of Python offers developers an easy debugging experience. However, this comes at the price of poor performance compared to compiled code. In this paper, we adopt and extend state-of-the-art research in query compilers to propose an efficient query engine embedded in Python. Our open-source framework lets developers debug in Python while easily building a compiled version of the code for deployment. Our benchmark results on the entire set of TPC-H queries show that our approach covers different types of relational workloads and is competitive with state-of-the-art in-memory engines in both single- and multi-threaded settings.
Original language: English
Title of host publication: Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction
Editors: Clark Verbrugge, Ondrej Lhotak, Xipeng Shen
Publisher: ACM
Pages: 180-190
Number of pages: 11
ISBN (Print): 9798400700880
Publication status: Published - 17 Feb 2023
Event: 32nd ACM SIGPLAN International Conference on Compiler Construction - Montreal, Canada
Duration: 25 Feb 2023 to 26 Feb 2023

Conference

Conference: 32nd ACM SIGPLAN International Conference on Compiler Construction
Abbreviated title: CC 2023
Country/Territory: Canada
City: Montreal
Period: 25/02/23 to 26/02/23

Keywords / Materials (for Non-textual outputs)

  • data science
  • Python
  • query compilation
