Aliasing is a Driver of Adversarial Attacks
Abstract
Aliasing is a fundamental concept in signal processing: careful handling of resolution changes is essential to preserving quality when transmitting and processing audio, images, and video. Despite this, until recently aliasing received very little consideration in deep learning, with common architectures carelessly sub-sampling without regard for aliasing effects. In this work, we investigate the hypothesis that the existence of adversarial perturbations is due in part to aliasing in neural networks. Our ultimate goal is to increase robustness against adversarial attacks using only explainable, non-trained, structural changes derived from aliasing first principles.
Our contributions are the following. First, we establish a sufficient condition for the absence of aliasing under general image transformations. Next, we study the sources of aliasing in common neural network layers and derive simple modifications from first principles to eliminate or reduce it. Finally, our experimental results show a solid link between anti-aliasing and adversarial attacks: simply reducing aliasing already yields more robust classifiers, and combining anti-aliasing with robust training outperforms robust training alone on $L_2$ attacks with no or minimal loss in performance on $L_{\infty}$ attacks.
What is aliasing?
The concept of aliasing is intrinsically related to discrete sampling. In layman's terms, the more "complex" a continuous-domain signal, the finer the sampling needed to properly represent it. Using an insufficiently fine sampling results in visual artifacts that perceptually destroy the original signal; we call this phenomenon "aliasing". Consider the example shown in the above figure: the main "feature" of the signal, the right-to-left diagonals, is inverted by aliasing when sampling at an insufficient rate.
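To make this quantitative: the classical sampling condition (a standard special case stated here only for intuition; the paper derives a more general sufficient condition for arbitrary image transformations) says that subsampling a discrete 1-D signal $x[n]$ by a factor $s$ is alias-free when its spectrum stays below the new Nyquist limit,

$$X(e^{j\omega}) = 0 \quad \text{for} \quad \frac{\pi}{s} \le |\omega| \le \pi.$$

A minimal NumPy sketch of the phenomenon itself (an illustrative example, not code from the paper): a sinusoid above the Nyquist rate of a sampling grid produces exactly the same samples as a lower-frequency alias, so no downstream processing can tell the two apart.

```python
import numpy as np

# A 9 Hz cosine sampled at only 12 samples/second (Nyquist limit: 6 Hz).
# On this grid its samples are identical to those of a 3 Hz cosine, its alias.
fs = 12.0                       # sampling rate in Hz, too low for a 9 Hz signal
t = np.arange(0, 1, 1 / fs)     # the 12 sample instants within one second

high_freq = np.cos(2 * np.pi * 9 * t)   # the "complex" signal, above Nyquist
alias = np.cos(2 * np.pi * 3 * t)       # the low-frequency impostor

print(np.allclose(high_freq, alias))    # True: sampling cannot tell them apart
```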
Linking aliasing to adversarial examples
Our hypothesis is that adversarial attacks work in part by exploiting the phenomenon of aliasing. See the above figure for a simple but enlightening toy example. The dirty image is indistinguishable from the original clean image to a human, and yet the network's outputs for the two are completely different. The culprit for this bizarre effect is the convolution stride (green box), which carelessly down-samples the input.
An attacker with knowledge of this down-sampling is able to construct a perturbation focused on manipulating the surviving samples (pixels at even rows and columns). The discarded samples (pixels at an odd row or column) serve as extra degrees of freedom that can be used to make the attack less noticeable and more powerful.
The behavior of these analytically constructed attacks is remarkably similar to low-amplitude, gradient-driven attacks: they are imperceptible by humans and drastically change the feature maps of a network. It is thus plausible that attacks may be exploiting aliasing.
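The mechanism can be reconstructed in a few lines (a hedged PyTorch sketch, not the paper's exact toy example; a $1 \times 1$ kernel is used so the effect of the stride is isolated): a convolution with stride 2 and a $1 \times 1$ kernel only ever reads pixels at even rows and columns, so a perturbation confined to the discarded pixels is invisible to the layer, while the same amplitude placed on the surviving pixels passes straight into the feature map.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.rand(1, 1, 8, 8)      # clean input
w = torch.ones(1, 1, 1, 1)      # 1x1 kernel; stride 2 below does the down-sampling

y_clean = F.conv2d(x, w, stride=2)

# Perturbation confined to discarded pixels (any odd row or column): invisible.
delta_hidden = torch.zeros_like(x)
delta_hidden[..., 1::2, :] = 0.5
delta_hidden[..., :, 1::2] = 0.5
y_hidden = F.conv2d(x + delta_hidden, w, stride=2)
print(torch.allclose(y_clean, y_hidden))    # True: the layer never sees these pixels

# The same amplitude on the surviving pixels (even rows and columns): direct hit.
delta_attack = torch.zeros_like(x)
delta_attack[..., 0::2, 0::2] = 0.5
y_attacked = F.conv2d(x + delta_attack, w, stride=2)
print((y_attacked - y_clean).abs().max())   # 0.5: every feature value changes
```

Real strided convolutions use larger kernels, so discarded pixels are not literally invisible, but aliasing from the stride still hands an attacker similar extra degrees of freedom.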
Neural Networks without aliasing
To suppress aliasing in neural networks, we expand on existing blurring-based approaches such as [1, 2, 3] by using theory to derive the exact blurring strength necessary, which coincides with the strength derived experimentally in [1]. We also introduce the Quantile ReLU anti-aliasing modification, a new way of anti-aliasing that is independent of and synergistic with blurring-based approaches. The above figure summarizes the adaptations used in our experiments.
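The blurring-based part can be illustrated with a minimal sketch in the spirit of [1, 3] (an assumed implementation using the [1 4 6 4 1] binomial kernel referenced below; the theoretically derived per-layer strengths and the Quantile ReLU modification are described in the paper and not reproduced here):

```python
import torch
import torch.nn.functional as F

def blur_downsample(x, stride=2):
    """Low-pass filter with a separable [1 4 6 4 1] binomial kernel, then subsample.

    Blurring before striding removes (most of) the frequency content that the
    coarser output grid cannot represent, which is what suppresses aliasing.
    """
    k1d = torch.tensor([1.0, 4.0, 6.0, 4.0, 1.0])
    k2d = torch.outer(k1d, k1d)
    k2d = k2d / k2d.sum()                                     # preserve brightness
    channels = x.shape[1]
    kernel = k2d.view(1, 1, 5, 5).repeat(channels, 1, 1, 1)   # depthwise filter
    return F.conv2d(x, kernel, stride=stride, padding=2, groups=channels)

x = torch.rand(1, 3, 32, 32)
print(blur_downsample(x).shape)                               # torch.Size([1, 3, 16, 16])
```

Following [3], a strided layer is typically split into its dense (stride-1) operation followed by a blur-then-subsample step like this one.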
How effective is anti-aliasing as a defense?
The above figure plots adversarial strength vs accuracy curves for the five defenses against various attacks on each architecture and dataset.
- Vanilla: No defense.
- Initial Blur: A naive blur of the input with a fixed [1 4 6 4 1] kernel.
- AA(5): Anti-aliasing all five blocks of the network.
- AT: Adversarial training with PGD.
- AT+AA(2): Adversarial training combined with anti-aliasing the first two blocks of the network.
Simple anti-aliasing measures are sufficient by themselves to increase the robustness of networks significantly for low-amplitude and single-step attacks, especially for the $L_2$ variant.
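For concreteness, the single-step attacks referred to above take the standard FGSM form under either norm (a generic PyTorch sketch, not the paper's evaluation code; PGD iterates this step with a projection back into the $\epsilon$-ball):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps, norm="linf"):
    """One gradient step of size eps, shaped by the chosen norm constraint."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    if norm == "linf":
        delta = eps * grad.sign()                    # FGSM-Linf: move every pixel by eps
    else:
        flat = grad.flatten(1)
        flat = flat / flat.norm(dim=1, keepdim=True).clamp_min(1e-12)
        delta = eps * flat.view_as(x)                # FGSM-L2: step of L2 length eps
    # Clamping back to the valid input range is omitted for brevity.
    return (x + delta).detach()
```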
Furthermore, we observe that the AT+AA(2) defense, which combines anti-aliasing with robust training, consistently outperforms the robust-training-only defense AT on $L_2$ attacks, sometimes by a very wide margin, such as on ResNet-50 + Tiny ImageNet where both $L_2$ attacks appear completely beaten, while maintaining statistically equal robustness on $L_\infty$ attacks.
Is aliasing the only reason for adversarial attacks?
No, anti-aliasing is not the full picture. Anti-aliasing by itself is not sufficient to reach the level of robustness of adversarial training: there is still a significant performance gap for high-amplitude attacks. Combining anti-aliasing with adversarial training, while better than adversarial training alone, is also not $100\%$ robust, as accuracy still decays significantly as a function of epsilon for $L_\infty$ attacks.
Instead, the results show that aliasing plays a significant role in the vulnerability of vanilla networks to adversarial attacks. Simply adding anti-aliasing is enough to obtain a significant measure of robustness over the undefended network, and combining anti-aliasing with adversarial training yields a more globally robust model over base AT. Indeed, the combined model seems almost completely impervious to FGSM-$L_2$ and PGD-$L_2$.
In this work, we prove that aliasing is a driver of adversarial attacks: one vulnerability that attack algorithms leverage to confound neural networks. Our conclusion is that anti-aliasing is a component that a successful defense must have.
Related work
[1] Tero Karras, Miika Aittala, Samuli Laine, Erik Härkönen, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. Alias-Free Generative Adversarial Networks. In Proc. NeurIPS, 2021.
[2] Cristina Vasconcelos, Hugo Larochelle, Vincent Dumoulin, Rob Romijnders, Nicolas Le Roux, and Ross Goroshin. Impact of Aliasing on Generalization in Deep Convolutional Networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 10529–10538, Oct. 2021.
[3] Richard Zhang. Making Convolutional Networks Shift-Invariant Again. In ICML, 2019.
[4] Anadi Chaman and Ivan Dokmanić. Truly shift-invariant convolutional neural networks. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 3772–3782, 2021.
[5] Alvin Chan, Y. Ong, and Clement Tan. How does frequency bias affect the robustness of neural image classifiers against common corruption and adversarial perturbations? In IJCAI, 2022.
[6] Julia Grabinski, Janis Keuper, and Margret Keuper. Aliasing coincides with CNNs vulnerability towards adversarial attacks. In The AAAI-22 Workshop on Adversarial Machine Learning and Beyond, 2022.
[7] Md Tahmid Hossain, Shyh Wei Teng, Ferdous Sohel, and Guojun Lu. Anti-aliasing deep image classifiers using novel depth adaptive blurring and activation function. arXiv preprint arXiv:2110.00899, 2021.
[8] Antônio H. Ribeiro and Thomas Schön. How convolutional neural networks deal with aliasing. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2755–2759, 2021.
[9] Yusuke Tsuzuku and Issei Sato. On the structural sensitivity of deep convolutional networks to the directions of Fourier basis functions. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 51–60, 2019.
[10] Dong Yin, Raphael Gontijo Lopes, Jon Shlens, Ekin Dogus Cubuk, and Justin Gilmer. A Fourier perspective on model robustness in computer vision. In Advances in Neural Information Processing Systems, 32, 2019.
[11] X. Zou, F. Xiao, Z. Yu, and Y. J. Lee. Delving deeper into anti-aliasing in ConvNets. In Proceedings of the British Machine Vision Conference (BMVC), 2020.
Bibtex citation
@misc{rodriguezmunoz2022driver,
  title = {Aliasing is a Driver of Adversarial Attacks},
  author = {Rodríguez-Muñoz, Adrián and Torralba, Antonio},
  year = {2022},
  eprint = {2212.11760},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV},
  url = {https://arxiv.org/abs/2212.11760},
}
Acknowledgements
Experiments were conducted using computational resources from the MIT SuperCloud cluster.