Observing and Preventing Leakage in MapReduce

Olga Ohrimenko, Manuel Costa, Cédric Fournet, Christos Gkantsidis, Markulf Kohlweiss, Divya Sharma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

The use of public cloud infrastructure for storing and processing large datasets raises new security concerns. Current solutions propose encrypting all data and accessing it in plaintext only within secure hardware. Nonetheless, the distributed processing of large amounts of data still involves intensive encrypted communication between different processing and network storage units, and those communication patterns may leak sensitive information. We consider the secure implementation of MapReduce jobs and analyze their intermediate traffic between mappers and reducers. Using datasets that include personal and geographical data, we show how an adversary that observes the runs of typical jobs can infer precise information about their input. We give a new definition of data privacy for MapReduce and describe two provably secure, practical solutions. We implement our solutions on top of VC3, a secure implementation of Hadoop, and evaluate their performance.
Original language: English
Title of host publication: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, October 12-16, 2015
Number of pages: 12
ISBN (Electronic): 978-1-4503-3832-5
Publication status: Published - 2015
Event: 22nd ACM SIGSAC Conference on Computer and Communications Security - The Denver Marriott City Center, Denver, CO, United States
Duration: 12 Oct 2015 → 16 Oct 2015


Conference: 22nd ACM SIGSAC Conference on Computer and Communications Security
Abbreviated title: ACM CCS 2015
Country/Territory: United States
City: Denver, CO
