Dispel4Py: A Python Framework for Data-intensive eScience

Amrey Krause, Rosa Filgueira, Malcolm Atkinson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present dispel4py, a novel data-intensive and high-performance computing middleware provided as a standard Python library for describing stream-based workflows. It allows users to develop scientific applications locally and then run them on a wide range of HPC infrastructures without any changes to the code. Moreover, it provides automated and efficient parallel mappings to the MPI, multiprocessing, Storm and Spark frameworks, which are commonly used in big data applications. It builds on the wide availability of Python in many environments and requires only familiarity with basic Python syntax. We show the advantages of dispel4py by walking through an example, and conclude by demonstrating, using real-world scenarios, how dispel4py can be employed as an easy-to-use tool for designing scientific applications.
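
The idea of describing a stream-based workflow once and then choosing how it is executed can be illustrated with a minimal sketch in plain Python. This is not the dispel4py API: the names `read_numbers`, `square`, `keep_even`, `run_sequential` and `run_parallel` are hypothetical, and a thread pool stands in for a real parallel backend such as MPI or multiprocessing. The sketch assumes the processing steps are stateless, so the stream can be partitioned across workers.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical processing elements: each one consumes a stream of items
# and yields a new stream. None of these names come from dispel4py.
def read_numbers(limit):
    """Source: emit the integers 0 .. limit-1 as a stream."""
    for n in range(limit):
        yield n


def square(stream):
    """Stateless step: square every item in the stream."""
    for n in stream:
        yield n * n


def keep_even(stream):
    """Stateless step: pass through only the even items."""
    for n in stream:
        if n % 2 == 0:
            yield n


def run_sequential(source, stages):
    """Mapping 1: chain the stages lazily and drain them in one thread."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return list(stream)


def run_parallel(source, stages, workers=2):
    """Mapping 2: partition the input and run the same stages per partition.

    A thread pool is used here only as a stand-in for a parallel backend.
    Results are sorted so the output can be compared with the sequential run.
    """
    items = list(source)
    chunks = [items[i::workers] for i in range(workers)]

    def run_chunk(chunk):
        return run_sequential(iter(chunk), stages)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(run_chunk, chunks))
    return sorted(item for part in parts for item in part)


# The workflow description is the same for both mappings.
workflow = [square, keep_even]
sequential_result = run_sequential(read_numbers(6), workflow)
parallel_result = run_parallel(read_numbers(6), workflow)
```

In this sketch the list `workflow` plays the role of the abstract workflow description, and the two `run_*` functions play the role of mappings: the workflow itself is unchanged when the execution strategy is swapped.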
Original language: English
Title of host publication: Proceedings of the 5th Workshop on Python for High-Performance and Scientific Computing
Place of publication: New York, NY, USA
Publisher: ACM
Number of pages: 10
ISBN (Print): 978-1-4503-4010-6
Publication status: Published, 2015
