
Detection of Toy Soldiers Taken from a Bird’s Perspective Using Convolutional Neural Networks

  • Conference paper
ICT Innovations 2019. Big Data Processing and Mining (ICT Innovations 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1110)


Abstract

This paper describes the use of two different deep-learning approaches to object detection for recognizing a toy soldier. We use recordings of toy soldiers in different poses under different scenarios to simulate the appearance of persons in footage taken by drones. Recordings from a bird's-eye view are today widely used in the search for missing persons in non-urban areas, border control, animal movement control, and the like. We compared the single-shot multi-box detector (SSD) with MobileNet or Inception V2 as the backbone, SSDLite with MobileNet, and Faster R-CNN combined with Inception V2 and ResNet50. The results show that Faster R-CNN detects small objects such as toy soldiers more successfully than SSD, and the training time of Faster R-CNN is much shorter than that of SSD.
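
The paper itself does not include code, so the following is only a minimal inference sketch of the kind of detector comparison described above, assuming models exported with the TensorFlow Object Detection API; the model directory, image file name, and 0.5 score threshold are hypothetical placeholders, not the authors' actual setup.

    # Minimal sketch (assumptions: TensorFlow 2 Object Detection API export,
    # hypothetical paths and threshold; not the authors' pipeline).
    import numpy as np
    import tensorflow as tf
    from PIL import Image

    # A detector exported from the TF model zoo (e.g. Faster R-CNN + Inception V2)
    # and fine-tuned on the toy-soldier images; the path is a placeholder.
    detect_fn = tf.saved_model.load("exported_model/saved_model")

    # Load one aerial frame and add a batch dimension (uint8, shape [1, H, W, 3]).
    image = np.array(Image.open("aerial_frame.jpg"))
    input_tensor = tf.convert_to_tensor(image)[tf.newaxis, ...]

    detections = detect_fn(input_tensor)
    scores = detections["detection_scores"][0].numpy()
    boxes = detections["detection_boxes"][0].numpy()  # normalized [ymin, xmin, ymax, xmax]

    # Report detections above an arbitrarily chosen confidence threshold.
    for box, score in zip(boxes, scores):
        if score >= 0.5:
            print(f"toy soldier at {box.round(3)} with confidence {score:.2f}")

The same script can be pointed at different exported models (SSD with MobileNet or Inception V2, SSDLite with MobileNet, Faster R-CNN with Inception V2 or ResNet50) to compare their detections on identical frames.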



Acknowledgment

This research was supported by the Croatian Science Foundation under the project IP-2016-06-8345 "Automatic recognition of actions and activities in multimedia content from the sports domain" (RAASS) and by the University of Rijeka under project number 18-222-1385.

Author information


Corresponding author

Correspondence to Marina Ivašić-Kos.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Sambolek, S., Ivašić-Kos, M. (2019). Detection of Toy Soldiers Taken from a Bird’s Perspective Using Convolutional Neural Networks. In: Gievska, S., Madjarov, G. (eds) ICT Innovations 2019. Big Data Processing and Mining. ICT Innovations 2019. Communications in Computer and Information Science, vol 1110. Springer, Cham. https://doi.org/10.1007/978-3-030-33110-8_2


  • DOI: https://doi.org/10.1007/978-3-030-33110-8_2


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33109-2

  • Online ISBN: 978-3-030-33110-8

  • eBook Packages: Computer Science, Computer Science (R0)
