Correlated Crash Vulnerabilities

Ramnatthan Alagappan, Aishwarya Ganesan, Yuvraj Patel, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

Modern distributed storage systems employ complex protocols to update replicated data. In this paper, we study whether such update protocols work correctly in the presence of correlated crashes. We find that the correctness of such protocols hinges on how local filesystem state is updated by each replica in the system. We build PACE, a framework that systematically generates and explores persistent states that can occur in a distributed execution. PACE uses a set of generic rules to effectively prune the state space, reducing checking time from days to hours in some cases. We apply PACE to eight widely used distributed storage systems to find correlated crash vulnerabilities, i.e., problems in the update protocol that lead to user-level guarantee violations. PACE finds a total of 26 vulnerabilities across eight systems, many of which lead to severe consequences such as data loss, corrupted data, or unavailable clusters.

Original language: English
Title of host publication: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)
Place of Publication: Savannah, GA
Publisher: USENIX Association
Pages: 151-167
Number of pages: 17
ISBN (Print): 978-1-931971-33-1
Publication status: Published - 4 Nov 2016
Event: 12th USENIX Symposium on Operating Systems Design and Implementation - Savannah, United States
Duration: 2 Nov 2016 to 4 Nov 2016
https://www.usenix.org/conference/osdi16

Conference

Conference: 12th USENIX Symposium on Operating Systems Design and Implementation
Abbreviated title: OSDI'16
Country/Territory: United States
City: Savannah
Period: 2/11/16 to 4/11/16
