Abstract
To accurately count the animals grazing on grassland, we present a livestock detection algorithm built on modified versions of U-Net and Google's Inception-v4 network. The method works well on dense and touching instances. We also introduce a dataset for livestock detection in aerial images, consisting of 89 aerial images collected by quadcopter. Each image has a resolution of about 3000×4000 pixels and contains livestock of varying shapes, scales, and orientations.
We evaluate our method against the Faster R-CNN and YOLOv3 algorithms on our aerial livestock dataset. Our method achieves a higher average precision than YOLOv3 and an average precision comparable to that of Faster R-CNN.
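For readers unfamiliar with the metric, the following is a minimal sketch of how average precision can be computed for a single class on a single image. The abstract does not state which variant or IoU threshold was used, so the details here are assumptions: area under the precision envelope (the all-point form popularised by PASCAL VOC), an IoU threshold of 0.5, and greedy matching of each detection to the best still-unmatched ground-truth box. The names `box_iou` and `average_precision` are illustrative and do not come from the paper.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """detections: list of (score, box); ground_truth: list of boxes.

    Assumed variant: area under the monotone precision envelope.
    """
    if not ground_truth:
        return 0.0
    # Process detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = [False] * len(ground_truth)
    recalls, precisions = [], []
    true_positives = 0
    for rank, (_, box) in enumerate(ranked, start=1):
        # Find the best still-unmatched ground-truth box for this detection.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(ground_truth):
            if not matched[j]:
                iou = box_iou(box, gt)
                if iou > best_iou:
                    best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= iou_threshold:
            matched[best_j] = True
            true_positives += 1
        recalls.append(true_positives / len(ground_truth))
        precisions.append(true_positives / rank)
    # Replace each precision by the maximum precision at any later rank,
    # which makes the curve non-increasing in recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the rectangles under the envelope.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [(0.9, (0, 0, 10, 10)),     # correct
            (0.8, (50, 50, 60, 60)),   # false positive
            (0.7, (20, 20, 30, 30))]   # correct
    print(average_precision(dets, gt))
```

In practice the score is accumulated over every image in a test set rather than one image, and published comparisons depend on the chosen IoU threshold, so numbers from this sketch are not directly comparable with those reported in the article.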
Acknowledgements
This work was supported by the Scientific and Technological Achievements Transformation Project of Qinghai, China (Project No. 2018-SF-110), and the National Natural Science Foundation of China (Project Nos. 61866031 and 61862053).
Additional information
Liang Han is a lecturer in the Department of Computer Technology and Application, Qinghai University. He received his bachelor's and master's degrees from Lanzhou University in 2010 and 2012, respectively. His research interests include computer vision and machine learning.
Pin Tao is an associate professor in the Computer Science and Technology Department of Tsinghua University. He received his B.S. degree from the same department in 1997, and his M.S. and Ph.D. degrees in computer applications from Tsinghua University in 1999 and 2002, respectively. His research interests are in embedded media processing.
Ralph R. Martin is an emeritus professor of Cardiff University. He has served on the editorial boards of various journals including Computer-Aided Design, Computer Aided Geometric Design, and Geometric Models. In 2014, he was awarded the Friendship Award, China’s highest award for foreign nationals. In 2016, he was awarded the title of Solid Modeling Pioneer by the Solid Modeling Association.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
About this article
Cite this article
Han, L., Tao, P. & Martin, R.R. Livestock detection in aerial images using a fully convolutional network. Comp. Visual Media 5, 221–228 (2019). https://doi.org/10.1007/s41095-019-0132-5
DOI: https://doi.org/10.1007/s41095-019-0132-5