Abstract
Convolutional Neural Networks (CNNs) are deployed in a growing number of classification systems, but adversarial samples, maliciously crafted to trick them, are becoming a real threat. Various proposals have been made to improve CNNs' adversarial robustness, but they all incur performance penalties or have other limitations. In this paper, we offer a new approach: a certifiable adversarial detection scheme, the Certifiable Taboo Trap (CTT). In theory, this system can provide certifiable guarantees that a range of adversarial inputs below certain l∞ perturbation sizes will be detected. We develop and evaluate several versions of CTT that trade off defense capability, training overhead, and certifiability on adversarial samples. In practice, against adversaries using various lp norms, CTT outperforms existing defense methods that focus purely on improving network robustness. We show that CTT has a small false positive rate on clean test data, imposes minimal compute overhead when deployed, and can support complex security policies.
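As a concrete illustration of the detection idea described above, the sketch below shows how a taboo-trap style detector might be wired around a trained CNN: selected hidden activations are monitored against per-channel thresholds calibrated on clean data, and any input that pushes a monitored activation past its threshold is flagged as adversarial. This is a minimal sketch assuming a PyTorch model; all names (`TabooDetector`, `calibrate_thresholds`) are hypothetical and this is not the authors' released implementation.

```python
# Minimal sketch of a taboo-trap style detector (hypothetical, not the
# paper's code): monitor hidden activations against per-channel thresholds
# and flag any input that trips a "taboo" activation as adversarial.
import torch
import torch.nn as nn


class TabooDetector(nn.Module):
    """Wraps a trained CNN and flags inputs whose monitored hidden
    activations exceed per-channel taboo thresholds."""

    def __init__(self, features: nn.Module, head: nn.Module,
                 thresholds: torch.Tensor):
        super().__init__()
        self.features = features      # convolutional feature extractor
        self.head = head              # pooling/flatten + classifier layers
        self.thresholds = thresholds  # one taboo threshold per channel

    def forward(self, x: torch.Tensor):
        h = self.features(x)                           # (B, C, H, W) activations
        peak = h.flatten(2).amax(dim=2)                # per-channel max, (B, C)
        flagged = (peak > self.thresholds).any(dim=1)  # any taboo fired?
        return self.head(h), flagged


@torch.no_grad()
def calibrate_thresholds(features: nn.Module, clean_loader, margin=1.05):
    """Set each channel's threshold just above its maximum clean-data
    activation, so clean inputs (almost) never trip the trap."""
    maxima = None
    for x, _ in clean_loader:
        peak = features(x).flatten(2).amax(dim=2).amax(dim=0)  # (C,)
        maxima = peak if maxima is None else torch.maximum(maxima, peak)
    return margin * maxima
```

As we read the abstract, the certifiable variants additionally constrain at training time how far a bounded l∞ input perturbation can move the monitored activations, which is what turns a heuristic threshold check like the one above into a detectability guarantee; that training-time machinery is omitted from this sketch.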
| Original language | English |
| --- | --- |
| Title of host publication | Proceedings of the 13th ACM Workshop on Artificial Intelligence and Security |
| Place of Publication | New York, NY, USA |
| Publisher | Association for Computing Machinery, Inc |
| Pages | 13–24 |
| Number of pages | 12 |
| ISBN (Print) | 978-1-4503-8094-2 |
| DOIs | |
| Publication status | Published - 9 Nov 2020 |
| Event | 13th ACM Workshop on Artificial Intelligence and Security, Orlando, United States. Duration: 13 Nov 2020 → 13 Nov 2020. Conference number: 13 |
| Workshop | 13th ACM Workshop on Artificial Intelligence and Security |
| --- | --- |
| Abbreviated title | AISec 2020 |
| Country/Territory | United States |
| City | Orlando |
| Period | 13/11/20 → 13/11/20 |