Abstract
Training convolutional neural networks (ConvNets) that perform well in one-shot learning settings can have a wide range of impacts on real-world datasets. In this paper, we explore an adversarial training method that learns a Siamese neural network end to end as two models: a ConvNet model that learns image embeddings from an input image pair, and a head model that further learns the distance between those embeddings. We further present an adversarial mining approach that efficiently identifies and selects harder examples during training, which significantly boosts model performance on difficult cases. We carried out a comprehensive evaluation on a public whale identification dataset hosted on the Kaggle platform and report state-of-the-art performance, benchmarked against the results of 2000 other participants.
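The two-model structure and the idea of mining harder examples can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the linear-plus-ReLU branch, the pair features fed to the head, and the function names `embed`, `head_score`, and `mine_hard_negatives` are all assumptions chosen for illustration, and every anchor image is assumed to have at least one image of a different identity in the batch.

```python
import numpy as np


def embed(images, branch_weights):
    """Hypothetical branch model: a linear map plus ReLU stands in for the ConvNet."""
    return np.maximum(images @ branch_weights, 0.0)


def head_score(emb_a, emb_b, head_weights):
    """Hypothetical head model: logistic similarity score over symmetric pair features."""
    features = np.concatenate([np.abs(emb_a - emb_b), emb_a * emb_b], axis=-1)
    logits = features @ head_weights
    return 1.0 / (1.0 + np.exp(-logits))


def mine_hard_negatives(embeddings, labels, head_weights):
    """For each anchor, return the index of the different-identity image
    that the current head scores as most similar (the hardest negative)."""
    n = embeddings.shape[0]
    left = np.repeat(embeddings, n, axis=0)   # anchor i repeated n times
    right = np.tile(embeddings, (n, 1))       # all candidates, once per anchor
    scores = head_score(left, right, head_weights).reshape(n, n)
    same_identity = labels[:, None] == labels[None, :]
    # Exclude same-identity pairs (including the anchor itself) from the search.
    scores = np.where(same_identity, -np.inf, scores)
    return scores.argmax(axis=1)


# Toy data: 6 flattened "images" of 8 features each, 3 identities (assumed values).
rng = np.random.default_rng(0)
images = rng.normal(size=(6, 8))
labels = np.array([0, 0, 1, 1, 2, 2])
branch_weights = rng.normal(size=(8, 4))
head_weights = rng.normal(size=(8,))   # 2 * embedding size pair features

embeddings = embed(images, branch_weights)
hard = mine_hard_negatives(embeddings, labels, head_weights)
print(hard)
```

In a training loop, pairs selected this way would be fed back to the network in the next epoch so that it keeps seeing the non-matching pairs it currently confuses most. The pair features are symmetric in their two inputs, so the score does not depend on the order of the images.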
Ethics declarations
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Wang, W., Solovyev, R.A., Stempkovsky, A.L. et al. Method for Whale Re-identification Based on Siamese Nets and Adversarial Training. Opt. Mem. Neural Networks 29, 118–132 (2020). https://doi.org/10.3103/S1060992X20020058
DOI: https://doi.org/10.3103/S1060992X20020058