Abstract
The use of public cloud infrastructure for storing and processing large datasets raises new security concerns. Current solutions propose encrypting all data, and accessing it in plaintext only within secure hardware. Nonetheless, the distributed processing of large amounts of data still involves intensive encrypted communications between different processing and network storage units, and those communication patterns may leak sensitive information. We consider secure implementations of MapReduce jobs, and analyze their intermediate traffic between mappers and reducers. Using datasets that include personal and geographical data, we show how an adversary that observes the runs of typical jobs can infer precise information about their input. We give a new definition of data privacy for MapReduce, and describe two provably-secure, practical solutions. We implement our solutions on top of VC3, a secure implementation of Hadoop, and evaluate their performance.
Original language | English |
---|---|
Title of host publication | Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, October 12-16, 2015 |
Publisher | ACM |
Pages | 1570-1581 |
Number of pages | 12 |
ISBN (Electronic) | 978-1-4503-3832-5 |
DOIs | |
Publication status | Published - 2015 |
Event | 22nd ACM SIGSAC Conference on Computer and Communications Security - The Denver Marriott City Center, Denver, CO, United States. Duration: 12 Oct 2015 → 16 Oct 2015. https://www.sigsac.org/ccs/CCS2015/ |
Conference
Conference | 22nd ACM SIGSAC Conference on Computer and Communications Security |
---|---|
Abbreviated title | ACM CCS 2015 |
Country/Territory | United States |
City | Denver, CO |
Period | 12/10/15 → 16/10/15 |
Internet address |