Towards Certifiable Adversarial Sample Detection

Ilia Shumailov, Yiren Zhao, Robert Mullins, Ross Anderson

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract / Description of output

Convolutional Neural Networks (CNNs) are deployed in more and more classification systems, but adversarial samples can be maliciously crafted to trick them, and are becoming a real threat. There have been various proposals to improve CNNs' adversarial robustness but these all suffer performance penalties or have other limitations. In this paper, we offer a new approach in the form of a certifiable adversarial detection scheme, the Certifiable Taboo Trap (CTT). This system, in theory, can provide certifiable guarantees of detectability of a range of adversarial inputs for certain ℓ∞ sizes. We develop and evaluate several versions of CTT with different defense capabilities, training overheads and certifiability on adversarial samples. In practice, against adversaries with various ℓp norms, CTT outperforms existing defense methods that focus purely on improving network robustness. We show that CTT has small false positive rates on clean test data, minimal compute overheads when deployed, and can support complex security policies.
Original language: English
Title of host publication: Proceedings of the 13th ACM Workshop on Artificial Intelligence and Security
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery, Inc
Number of pages: 12
ISBN (Print): 978-1-4503-8094-2
Publication status: Published - 9 Nov 2020
Event: 13th ACM Workshop on Artificial Intelligence and Security - Orlando, United States
Duration: 13 Nov 2020 - 13 Nov 2020
Conference number: 13


Workshop: 13th ACM Workshop on Artificial Intelligence and Security
Abbreviated title: AISEC 2020
Country/Territory: United States

