Abstract
This paper describes the use of two deep-learning approaches to object detection, applied to recognizing toy soldiers. We use recordings of toy soldiers in different poses and scenarios to simulate the appearance of people in footage taken by drones. Recordings from a bird’s-eye view are now widely used in searches for missing persons in non-urban areas, border control, animal movement control, and similar tasks. We compared the single-shot multi-box detector (SSD) with MobileNet or Inception V2 as a backbone, SSDLite with MobileNet, and Faster R-CNN combined with Inception V2 and ResNet50. The results show that Faster R-CNN detects small objects such as toy soldiers more successfully than SSD, and that the training time of Faster R-CNN is much shorter than that of SSD.
Acknowledgment
This research was supported by the Croatian Science Foundation under the project IP-2016-06-8345 “Automatic recognition of actions and activities in multimedia content from the sports domain” (RAASS) and by the University of Rijeka under the project number 18-222-1385.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Sambolek, S., Ivašić-Kos, M. (2019). Detection of Toy Soldiers Taken from a Bird’s Perspective Using Convolutional Neural Networks. In: Gievska, S., Madjarov, G. (eds) ICT Innovations 2019. Big Data Processing and Mining. ICT Innovations 2019. Communications in Computer and Information Science, vol 1110. Springer, Cham. https://doi.org/10.1007/978-3-030-33110-8_2
DOI: https://doi.org/10.1007/978-3-030-33110-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33109-2
Online ISBN: 978-3-030-33110-8
eBook Packages: Computer Science, Computer Science (R0)