Abstract
We present a strategy for training convolutional neural networks to resolve interference arising from competing hypotheses about inter-categorical information throughout the network. For the task of dense image labelling, this is accomplished by blending images based on (i) categorical clustering or (ii) the co-occurrence likelihood of categories. We then train a source separation network that simultaneously segments and separates the blended images. Subsequent feature denoising to suppress noisy activations reveals additional desirable properties and a high rate of successful predictions. Through this process, we reveal a general mechanism, distinct from any prior method, for boosting the performance of the base segmentation and salient object detection network while simultaneously increasing robustness to adversarial attacks.
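The image blending described above follows the general mixup recipe of convexly combining input pairs. The following is a minimal illustrative sketch of that blending step only, under assumed conventions (a mixing coefficient `lam` drawn per pair); the paper's full method additionally selects pairs by category co-occurrence and trains a source separation network to recover the original images, neither of which is shown here:

```python
import numpy as np

def blend_images(img_a, img_b, lam=0.5):
    """Mixup-style convex combination of two images.

    In SegMix, the pair (img_a, img_b) would be chosen based on
    categorical clustering or category co-occurrence likelihood,
    and the resulting blend would be fed to a network trained to
    both segment and separate the two sources.
    """
    return lam * img_a + (1.0 - lam) * img_b

# Toy usage on two 4x4 single-channel "images".
a = np.full((4, 4), 1.0)
b = np.full((4, 4), 0.0)
mixed = blend_images(a, b, lam=0.7)
print(mixed[0, 0])  # 0.7
```

The convex combination keeps pixel values within the range spanned by the two inputs, which is what lets the separation network treat the blend as an additive mixture of two valid images.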
Additional information
Communicated by Oisin Mac Aodha.
About this article
Cite this article
Islam, M.A., Kowal, M., Derpanis, K.G. et al. SegMix: Co-occurrence Driven Mixup for Semantic Segmentation and Adversarial Robustness. Int J Comput Vis 131, 701–716 (2023). https://doi.org/10.1007/s11263-022-01720-7