We train representation models on procedural data alone and apply them, without further training, to visual similarity, classification, and semantic segmentation tasks by using a visual memory: an explicit database of reference image embeddings. Unlike prior work on visual memory, our approach achieves full compartmentalization from all real-world images while retaining strong performance. Compared to a model trained on Places, our procedural model performs within 1% on NIGHTS visual similarity, outperforms it by 8% and 15% on CUB200 and Flowers102 fine-grained classification, and is within 10% on ImageNet-1K classification. It also demonstrates strong zero-shot segmentation, achieving an \(R^2\) on COCO within 10% of models trained on real data. Finally, we analyze procedural versus real-data models, showing that parts of the same object have dissimilar representations in procedural models, which leads to incorrect retrievals from memory and explains the remaining performance gap.
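The inference mechanism the abstract describes — classifying a query by retrieving its nearest neighbors from an explicit database of reference embeddings — can be sketched as a cosine-similarity k-NN vote. This is a minimal illustration, not the paper's implementation; the function names and the choice of k are ours.

```python
import numpy as np

def build_memory(embeddings, labels):
    """Store L2-normalized reference embeddings with their labels.
    This explicit database plays the role of the visual memory."""
    emb = np.asarray(embeddings, dtype=np.float64)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return emb, np.asarray(labels)

def classify(query, memory, k=5):
    """Label a query embedding by majority vote over its k most
    cosine-similar references in the memory; no training involved."""
    mem_emb, mem_labels = memory
    q = np.asarray(query, dtype=np.float64)
    q = q / np.linalg.norm(q)
    sims = mem_emb @ q                    # cosine similarity to every reference
    topk = np.argsort(-sims)[:k]          # indices of the k nearest neighbors
    classes, counts = np.unique(mem_labels[topk], return_counts=True)
    return classes[np.argmax(counts)]
```

Because the memory is an explicit database rather than weights, removing (or compartmentalizing) a set of reference images is just deleting their rows — the property the abstract highlights.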
@inproceedings{rodriguezmunoz2024characterizing,
title={Characterizing model robustness via natural input gradients},
author={Adrián Rodríguez-Muñoz and Tongzhou Wang and Antonio Torralba},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2024}
}