CLEAR: Clean-Up Sample-Targeted Backdoor in Neural Networks

Document Type

Conference Paper

Publication Date

2021

DOI

10.1109/ICCV48922.2021.01614

Publication Title

Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)

Pages

16453-16462

Conference Name

2021 IEEE/CVF International Conference on Computer Vision (ICCV), 10-17 October 2021, Montreal, QC, Canada

Abstract

The data poisoning attack has raised serious security concerns about the safety of deep neural networks, since it can lead to a neural backdoor that misclassifies certain inputs crafted by an attacker. In particular, the sample-targeted backdoor attack is a new challenge. It targets one or a few specific samples, called target samples, to misclassify them into a target class. Because no trigger is planted in the backdoored model, existing backdoor detection schemes fail to detect the sample-targeted backdoor, as they depend on reverse-engineering the trigger or on strong features of the trigger. In this paper, we propose a novel scheme to detect and mitigate sample-targeted backdoor attacks. We discover and demonstrate a unique property of the sample-targeted backdoor: it forces a boundary change such that small "pockets" are formed around the target sample. Based on this observation, we propose a novel defense mechanism that pinpoints malicious pockets by "wrapping" them into a tight convex hull in the feature space. We design an effective algorithm to search for such a convex hull and remove the backdoor by fine-tuning the model on the identified malicious samples with labels corrected according to the convex hull. The experiments show that the proposed approach is highly efficient in detecting and mitigating a wide range of sample-targeted backdoor attacks.
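The sketch below illustrates, in very rough terms, the "convex hull in feature space" idea described in the abstract. It is not the authors' CLEAR algorithm: all function names, the use of PCA for dimensionality reduction, the k-nearest-neighbor pocket search, and the majority-vote label correction are assumptions made purely for illustration, applied to hypothetical penultimate-layer features.

```python
# Minimal, illustrative sketch of "wrapping" a suspicious pocket in a convex
# hull (an assumption-laden approximation, NOT the paper's CLEAR algorithm).
import numpy as np
from scipy.spatial import Delaunay
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors


def wrap_suspicious_pocket(features, labels, target_idx, k=20, dim=3):
    """Wrap the k nearest neighbors of a suspected target sample in a
    low-dimensional convex hull and flag points whose labels disagree with
    the local majority class (a crude proxy for a malicious 'pocket')."""
    # Project penultimate-layer features to a low dimension so that an
    # explicit convex hull is tractable (hypothetical design choice).
    z = PCA(n_components=dim).fit_transform(features)

    # Neighborhood of the suspected target sample in the reduced space.
    nn = NearestNeighbors(n_neighbors=k).fit(z)
    _, idx = nn.kneighbors(z[target_idx:target_idx + 1])
    pocket = idx[0]

    # Convex hull over the neighborhood; any sample falling inside it is a
    # candidate member of the poisoned pocket.
    hull = Delaunay(z[pocket])
    inside = np.where(hull.find_simplex(z) >= 0)[0]

    # Correct flagged labels to the local majority class; fine-tuning the
    # model on these corrected samples would happen outside this sketch.
    majority = np.bincount(labels[pocket]).argmax()
    flagged = inside[labels[inside] != majority]
    corrected = labels.copy()
    corrected[flagged] = majority
    return flagged, corrected
```

In this toy version, the "tight convex hull" is approximated by the hull of a fixed-size neighborhood after PCA; the paper instead searches for the hull directly in the model's feature space, so the snippet should be read only as a conceptual aid.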

Comments

These ICCV 2021 papers are the Open Access versions, provided by the Computer Vision Foundation. The final published version of the proceedings is available on IEEE Xplore at: http://dx.doi.org/10.1109/ICCV48922.2021.01614

Original Publication Citation

Zhu, L., Ning, R., Xin, C., Wang, C., & Wu, H. (2021). CLEAR: Clean-up sample-targeted backdoor in neural networks. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 10-17 October 2021, Montreal, QC, Canada, 16453-16462.
