Small object detection in diverse application landscapes: a survey

Multimedia Tools and Applications

Abstract

Object detection has grown in importance within computer vision, particularly for small objects. This survey examines small object detection across a range of applications and consolidates the available methodologies. Earlier surveys on small object detection have focused on specific domains; this survey instead draws on insights from many domains to give a comprehensive view of the versatility and applicability of small object detection techniques. It identifies the key challenges and discusses viable solutions for improving small object detection performance, which sets it apart from the existing literature. The strategies identified are categorized as transformer-based, CNN-based, and traditional methods. The paper also collates prevalent datasets relevant to small object detection, simplifying access to these resources, and gives a succinct overview of the evaluation metrics used for performance assessment in this field. Beyond consolidating established knowledge, the survey highlights innovative viewpoints, contributing to the advancement of small object detection in computer vision.

Data Availability Statement

This article does not entail data sharing, as it did not involve the creation or analysis of any datasets during the study.

References

  1. Amit Y, Felzenszwalb P, Girshick R (2020) Object detection. A Reference Guide, Computer Vision, pp 1–9

    Google Scholar 

  2. Zaidi SSA, Ansari MS, Aslam A, Kanwal N, Asghar M, Lee B (2022) A survey of modern deep learning based object detection models. Digit Signal Process 126:103514

    Article  Google Scholar 

  3. Liu Y, Sun P, Wergeles N, Shang Y (2021) A survey and performance evaluation of deep learning methods for small object detection. Expert Syst Appl 172:114602

    Article  Google Scholar 

  4. Wang Q, Zhang L, Bertinetto L, Hu W, Torr PH (2019) Fast online object tracking and segmentation: a unifying approach. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1328–1338

  5. Hossain MZ, Sohel F, Shiratuddin MF, Laga H (2019) A comprehensive survey of deep learning for image captioning. ACM Comput Surv (CsUR) 51(6):1–36

    Article  Google Scholar 

  6. Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213–3223

  7. Gould S, Baumstarck P, Quigley M, Ng AY, Koller D (2008) Integrating visual and range data for robotic object detection. In: Workshop on multi-camera and multi-modal sensor fusion algorithms and applications-M2SFA2, 2008

  8. Zhu P, Wen L, Du D, Bian X, Ling H, Hu Q, Nie Q, Cheng H, Liu C, Liu X et al (2018) Visdrone-det2018: the vision meets drone object detection in image challenge results. In: Proceedings of the european conference on computer vision (ECCV) workshops, pp 0–0

  9. Yundong L, Han D, Hongguang L, Zhang X, Zhang B, Zhifeng X (2020) Multi-block ssd based on small object detection for uav railway scene surveillance. Chin J Aeronaut 33(6):1747–1755

    Article  Google Scholar 

  10. Tong K, Wu Y, Zhou F (2020) Recent advances in small object detection based on deep learning: a review. Image Vis Comput 97:103910

    Article  Google Scholar 

  11. Cheng G, Yuan X, Yao X, Yan K, Zeng Q, Xie X, Han J (2023) Towards large-scale small object detection: Survey and benchmarks. IEEE Trans Pattern Anal Mach Intell

  12. Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13. Springer, pp 740–755

  13. Lee J, Bang J, Yang S-I (2017) Object detection with sliding window in images including multiple similar objects. In: 2017 international conference on information and communication technology convergence (ICTC). IEEE, pp 803–806

  14. Hashemi NS, Aghdam RB, Ghiasi ASB, Fatemi P (2016) Template matching advances and applications in image analysis. arXiv:1610.07231

  15. Choi C, Christensen HI (2012) 3d textureless object detection and tracking: an edge-based approach. In: 2012 IEEE/RSJ International conference on intelligent robots and systems. IEEE, pp 3877–3884

  16. Jeelani Z, Qadir F (2022) Cellular automata-based approach for salt-and-pepper noise filtration. J King Saud University - Comp Inf Sci 34(2):365–374. https://doi.org/10.1016/j.jksuci.2018.12.006

  17. Jeelani Z, Gani G, Qadir F (2023) Linear cellular automata-based impulse noise identification and filtration of degraded images. SIViP 17(6):2679–2687. https://doi.org/10.1007/s11760-023-02484-4

  18. Papageorgiou CP, Oren M, Poggio T (1998) A general framework for object detection. In: Sixth international conference on computer vision (IEEE Cat. No. 98CH36271). IEEE, pp 555–562

  19. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), vol 1. Ieee, pp 886–893

  20. Piccinini P, Prati A, Cucchiara R (2012) Real-time object detection and localization with sift-based clustering. Image Vis Comput 30(8):573–587

    Article  Google Scholar 

  21. Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR. IEEE Comput Soc. https://doi.org/10.1109/cvpr.2001.990517

  22. Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-svms for object detection and beyond. In: 2011 International conference on computer vision. IEEE, pp 89–96

  23. Paisitkriangkrai S, Shen C, van den Hengel A (2015) Pedestrian detection with spatially pooled features and structured ensemble learning. IEEE Trans Pattern Anal Mach Intell 38(6):1243–1257

    Article  Google Scholar 

  24. Rashid Y, Bhat JI (2023) Topological to deep learning era for identifying influencers in online social networks: a systematic review. Multimed Tools Appl 1–44

  25. Rashid Y, Iqbal Bhat J (2023) Unlocking the power of social networks with community detection techniques for isolated and overlapped communities: a review. Indian J Sci Technol 16(25):1857–1871

    Article  Google Scholar 

  26. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR). IEEE. [Online]. Available: https://doi.org/10.1109/cvpr.2016.91

  27. Lou H, Duan X, Guo J, Liu H, Gu J, Bi L, Chen H (2023) Dc-yolov8: small-size object detection algorithm based on camera sensor. Electronics 12(10):2323

    Article  Google Scholar 

  28. Girshick R (2015) Fast r-CNN. In: 2015 IEEE international conference on computer vision (ICCV). IEEE. https://doi.org/10.1109/iccv.2015.169

  29. Meng J, Jiang P, Wang J, Wang K (2022) A mobilenet-ssd model with fpn for waste detection. J Electr Engineer Technol 17(2):1425–1431

    Article  Google Scholar 

  30. Bosquet B, Mucientes M, Brea VM (2021) Stdnet-st: spatio-temporal convnet for small object detection. Pattern Recog 116:107929

    Article  Google Scholar 

  31. Bai Y, Zhang Y, Ding M, Ghanem B (2018) Sod-mtgan: small object detection via multi-task generative adversarial network. In: Proceedings of the European conference on computer vision (ECCV), pp 206–221

  32. Xu X, Zhang H, Ma Y, Liu K, Bao H, Qian X (2023) Transdet: toward effective transfer learning for small-object detection. Remote Sens 15(14)3525

  33. Tang Y-P, Wei X-S, Zhao B, Huang S-J (2021) Qbox: partial transfer learning with active querying for object detection. IEEE transactions on neural networks and learning systems

  34. Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-end object detection with transformers. In: European conference on computer vision. Springer, pp 213–229

  35. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S et al (2020) An image is worth 16x16 words: transformers for image recognition at scale. arXiv:2010.11929

  36. Gong H, Mu T, Li Q, Dai H, Li C, He Z, Wang W, Han F, Tuniyazi A, Li H et al (2022) Swin-transformer-enabled yolov5 with attention mechanism for small object detection on satellite images. Remote Sens 14(12):2861

    Article  Google Scholar 

  37. Chen G, Wang H, Chen K, Li Z, Song Z, Liu Y, Chen W, Knoll A (2020) A survey of the four pillars for small object detection: multiscale representation, contextual information, super-resolution, and region proposal. IEEE Trans Syst Man Cybern Syst 52(2):936–953

    Article  Google Scholar 

  38. Mushtaq S, Singh O (2024) Convolution neural networks for disease prediction: applications and challenges. Scalable Comput: Pract Experience 25(1):615–636

    Google Scholar 

  39. Tan K, Ding S, Wu S, Tian K, Ren J et al (2023) A small object detection network based on multiple feature enhancement and feature fusion. Sci Program 2023

  40. Modegi T (2008) Small object recognition techniques based on structured template matching for high-resolution satellite images. In: 2008 SICE Annual Conference. IEEE, pp 2168–2173

  41. Nagaraj S, Muthiyan B, Ravi S, Menezes V, Kapoor K, Jeon H (2017) Edge-based street object detection. In: 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI). IEEE, pp 1–4

  42. Arunmozhi A, Park J (2018) Comparison of hog, lbp and haar-like features for on-road vehicle detection. In: 2018 IEEE international conference on Electro/Information Technology (EIT). IEEE, pp 0362–0367

  43. Ren Y, Zhu C, Xiao S (2018) Small object detection in optical remote sensing images via modified faster r-cnn. Appl Sci 8(5):813

    Article  Google Scholar 

  44. Kisantal M, Wojna Z, Murawski J, Naruniec J, Cho K (2019) Augmentation for small object detection. arXiv:1902.07296

  45. Lim J-S, Astrid M, Yoon H-J, Lee S-I (2021) Small object detection using context and attention. In: 2021 International conference on artificial intelligence in information and communication (ICAIIC). IEEE, pp 181–186

  46. Wu X, Hong D, Chanussot J (2022) Uiu-net: U-net in u-net for infrared small object detection. IEEE transactions on image processing 32:364–376

    Article  Google Scholar 

  47. Mahaur B, Mishra K (2023) Small-object detection based on yolov5 in autonomous driving systems. Pattern Recogn Lett 168:115–122

    Article  Google Scholar 

  48. Chen C, Gong W, Chen Y, Li W (2019) Object detection in remote sensing images based on a scene-contextual feature pyramid network. Remote Sens 11(3):339

    Article  Google Scholar 

  49. Leng J, Ren Y, Jiang W, Sun X, Wang Y (2021) Realize your surroundings: exploiting context information for small object detection. Neurocomputing 433:287–299

    Article  Google Scholar 

  50. Hamdi A, Chan YK, Koo VC (2021) A new image enhancement and super resolution technique for license plate recognition. Heliyon 7(11)

  51. Li J, Liang X, Wei Y, Xu T, Feng J, Yan S (2017) Perceptual generative adversarial networks for small object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1222–1230

  52. Krishna H, Jawahar C (2017) Improving small object detection. In: 2017 4th IAPR Asian conference on pattern recognition (ACPR). IEEE, pp 340–345

  53. Li C, Zhang Y, Gao G, Liu Z, Liao L (2022) Context-aware cross-level attention fusion network for infrared small target detection. J Appl Remote Sens 16(4):046 506–046 506

  54. Hoiem D, Chodpathumwan Y, Dai Q (2012) Diagnosing error in object detectors. In: European conference on computer vision. Springer, pp 340–353

  55. Huang J, Murphy K (2015) Efficient inference in occlusion-aware generative models of images. arXiv:1511.06362

  56. Chen Y-T, Liu X, Yang M-H (2015) Multi-instance object segmentation with occlusion handling. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3470–3478

  57. Zoph B, Cubuk ED, Ghiasi G, Lin T-Y, Shlens J, Le QV (2020) Learning data augmentation strategies for object detection. In: Computer Vision–ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, proceedings, Part XXVII 16. Springer, pp 566–583

  58. Deepak S, Ameer P (2023) Brain tumor categorization from imbalanced mri dataset using weighted loss and deep feature fusion. Neurocomputing 520:94–102

    Article  Google Scholar 

  59. Zhong Z, Sun L, Huo Q (2019) An anchor-free region proposal network for faster r-cnn-based text detection approaches. Int J Doc Anal Recognit (IJDAR) 22(3):315–327

    Article  Google Scholar 

  60. Zhang H, Li F, Liu S, Zhang L, Su H, Zhu J, Ni LM, Shum H-Y (2022) Dino: Detr with improved denoising anchor boxes for end-to-end object detection. arXiv:2203.03605

  61. Doon R, Rawat TK, Gautam S (2018) Cifar-10 classification using deep convolutional neural network. In: 2018 IEEE Punecon. IEEE. https://doi.org/10.1109/punecon.2018.8745428

  62. Truong T-D, Nguyen V-T, Tran M-T (2018) Lightweight deep convolutional network for tiny object recognition. In: ICPRAM, pp 675–682

  63. Mogelmose A, Liu D, Trivedi MM (2015) Detection of u.s. traffic signs. IEEE Trans Intell Transp Syst 16(6):3116–3125. https://doi.org/10.1109/tits.2015.2433019

  64. Kuznetsova A, Rom H, Alldrin N, Uijlings J, Krasin I, Pont-Tuset J, Kamali S, Popov S, Malloci M, Kolesnikov A, Duerig T, Ferrari V (2020) The open images dataset v4. Int J Comput Vis 128(7):1956–1981. https://doi.org/10.1007/s11263-020-01316-z

  65. Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: common objects in context. In: Computer vision – ECCV 2014. Springer International Publishing, pp 740–755. https://doi.org/10.1007/978-3-319-10602-1_48

  66. Loh YP, Chan CS (2019) Getting to know low-light images with the exclusively dark dataset. Comp Vision Image Underst 178:30–42. https://doi.org/10.1016/j.cviu.2018.10.010

  67. Wang X, Yang M, Zhu S, Lin Y (2013) Regionlets for generic object detection. In: 2013 IEEE international conference on computer vision. IEEE. https://doi.org/10.1109/iccv.2013.10

  68. Yu F, Chen H, Wang X, Xian W, Chen Y, Liu F, Madhavan V, Darrell T (2020) BDD100k: a diverse driving dataset for heterogeneous multitask learning. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR). IEEE. https://doi.org/10.1109/cvpr42600.2020.00271

  69. Xia G-S, Bai X, Ding J, Zhu Z, Belongie S, Luo J, Datcu M, Pelillo M, Zhang L (2018) DOTA: a large-scale dataset for object detection in aerial images. In: 2018 IEEE/CVF conference on computer vision and pattern recognition. IEEE. https://doi.org/10.1109/cvpr.2018.00418

  70. Krizhevsky A, Hinton G (2010) Convolutional deep belief networks on cifar-10. Unpublished manuscript 40(7):1–9

  71. Recht B, Roelofs R, Schmidt L, Shankar V (2018) Do cifar-10 classifiers generalize to cifar-10? arXiv:1806.00451

  72. Møgelmose A, Liu D, Trivedi MM (2014) Traffic sign detection for us roads: remaining challenges and a case for tracking. In: 17th International IEEE conference on intelligent transportation systems (ITSC). IEEE, pp 1394–1399

  73. Crowder J, Cornish NJ (2007) Solution to the galactic foreground problem for Lisa. Phys Rev D 75(4):043008

    Article  Google Scholar 

  74. Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: Computer vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, proceedings, Part V 13. Springer, pp 740–755

  75. Barbu A, Mayo D, Alverio J, Luo W, Wang C, Gutfreund D, Tenenbaum J, Katz B (2019) Objectnet: a large-scale bias-controlled dataset for pushing the limits of object recognition models. Adv Neural Inf Process Syst 32

  76. Dabov K, Foi A, Katkovnik V, Egiazarian K (2006) Image denoising with block-matching and 3d filtering. In: Image processing: algorithms and systems, neural networks, and machine learning, vol 6064. SPIE, pp 354–365

  77. Loh YP, Chan CS (2019) Getting to know low-light images with the exclusively dark dataset. Comp Vision Image Underst 178:30–42

    Article  Google Scholar 

  78. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. Ieee, pp 248–255

  79. Barbu A, Mayo D, Alverio J, Luo W, Wang C, Gutfreund D, Tenenbaum J, Katz B (2019) Objectnet: a large-scale bias-controlled dataset for pushing the limits of object recognition models. Adv Neural Inf Process Syst 32

  80. Yu F, Chen H, Wang X, Xian W, Chen Y, Liu F, Madhavan V, Darrell T (2020) Bdd100k: a diverse driving dataset for heterogeneous multitask learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2636–2645

  81. Haris M, Glowacz A (2021) Road object detection: a comparative study of deep learning-based algorithms. Electronics 10(16):1932

  82. Xia G-S, Bai X, Ding J, Zhu Z, Belongie S, Luo J, Datcu M, Pelillo M, Zhang L (2018) Dota: A large-scale dataset for object detection in aerial images. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3974–3983

  83. Xu C, Wang J, Yang W, Yu H, Yu L, Xia G-S (2022) Detecting tiny objects in aerial images: a normalized wasserstein distance and a new benchmark. ISPRS J Photogramm Remote Sens 190:79–93

    Article  Google Scholar 

  84. Xu C, Wang J, Yang W, Yu H, Yu L, Xia G-S (2022) Detecting tiny objects in aerial images: A normalized wasserstein distance and a new benchmark. ISPRS J Photogramm Remote Sens 190:79–93

    Article  Google Scholar 

  85. Yu X, Gong Y, Jiang N, Ye Q, Han Z (2020) Scale match for tiny person detection. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 1257–1265

  86. Yu X, Han Z, Gong Y, Jan N, Zhao J, Ye Q, Chen J, Feng Y, Zhang B, Wang X et al (2020) The 1st tiny object detection challenge: methods and results. In: Computer vision–ECCV 2020 workshops: Glasgow, UK, August 23–28, 2020, proceedings, Part V 16. Springer, 315–323

  87. Kuznetsova A, Rom H, Alldrin N, Uijlings J, Krasin I, Pont-Tuset J, Kamali S, Popov S, Malloci M, Kolesnikov A et al (2020) The open images dataset v4: unified image classification, object detection, and visual relationship detection at scale. Int J Comput Vis 128(7):1956–1981

    Article  Google Scholar 

  88. Du D, Qi Y, Yu H, Yang Y, Duan K, Li G, Zhang W, Huang Q, Tian Q (2018) The unmanned aerial vehicle benchmark: object detection and tracking. In: Proceedings of the European conference on computer vision (ECCV), pp 370–386

  89. Yu W, Yang T, Chen C (2021) Towards resolving the challenge of long-tail distribution in uav images for object detection. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 3258–3267

  90. Wang J, Yang W, Guo H, Zhang R, Xia G-S (2021) Tiny object detection in aerial images. In: 2020 25th international conference on pattern recognition (ICPR). IEEE, pp 3791–3798

  91. Du D, Qi Y, Yu H, Yang Y, Duan K, Li G, Zhang W, Huang Q, Tian Q (2018) The unmanned aerial vehicle benchmark: object detection and tracking. In: Proceedings of the European conference on computer vision (ECCV), pp 370–386

  92. Li K, Wan G, Cheng G, Meng L, Han J (2020) Object detection in optical remote sensing images: a survey and a new benchmark. ISPRS J Photogramm Remote Sens 159:296–307

    Article  Google Scholar 

  93. Yang S, Luo P, Loy C-C, Tang X (2016) Wider face: a face detection benchmark. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5525–5533

  94. Prasad S, Li Y, Lin D, Sheng D (2021) maskedFaceNet: a progressive semi-supervised masked face detector. In: 2021 IEEE Winter conference on applications of computer vision (WACV). IEEE. https://doi.org/10.1109/wacv48630.2021.00343

  95. Wang Q-J, Zhang S-Y, Dong S-F, Zhang G-C, Yang J, Li R, Wang H-Q (2020) Pest24: a large-scale very small object data set of agricultural pests for multi-target detection. Comput Electron Agric 175:105585

    Article  Google Scholar 

  96. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) SSD: single shot MultiBox detector. In: Computer vision – ECCV 2016. Springer International Publishing, pp 21–37. https://doi.org/10.1007/978-3-319-46448-0_2

  97. Cai Z, Vasconcelos N (2018) Cascade r-cnn: Delving into high quality object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6154–6162

  98. Farhadi A, Redmon J (2018) Yolov3: an incremental improvement. In: Computer vision and pattern recognition, pp 1804–02 767

  99. Donahue J, Jia Y, Vinyals O, Hoffman J, Zhang N, Tzeng E, Darrell T (2014) Decaf: a deep convolutional activation feature for generic visual recognition. In: International conference on machine learning. PMLR, pp 647–655

  100. Mathew MP, Mahesh TY (2022) Leaf-based disease detection in bell pepper plant using yolo v5. SIViP 1–7

  101. Yang R, Yu Y (2021) Artificial convolutional neural network in object detection and semantic segmentation for medical imaging analysis. Front Oncol 11:638182

    Article  Google Scholar 

  102. Shah SNA, Parveen R (2023) An extensive review on lung cancer diagnosis using machine learning techniques on radiological data: state-of-the-art and perspectives. Arch Comput Meth Engineer 1–14

  103. Mushtaq S, Singh O (2023) Implementing image processing and deep learning techniques to analyze skin cancer images. Int J Comput Digit Syst 14(1):1–xx

    Google Scholar 

  104. Sushanki S, Bhandari AK, Singh AK (2023) A review on computational methods for breast cancer detection in ultrasound images using multi-image modalities. Arch Comput Meth Engineer 1–20

  105. Sahoo PK, Mishra S, Panigrahi R, Bhoi AK, Barsocchi P (2022) An improvised deep-learning-based mask r-cnn model for laryngeal cancer detection using ct images. Sensors 22(22):8834

    Article  Google Scholar 

  106. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969

  107. Abhisheka B, Biswas SK, Purkayastha B (2023) A comprehensive review on breast cancer detection, classification and segmentation using deep learning. Arch Comput Meth Engineer 1–30

  108. Khosravan N, Bagci U (2018) S4nd: Single-shot single-scale lung nodule detection. In: Medical image computing and computer assisted intervention–MICCAI 2018: 21st International Conference, Granada, Spain, September 16-20, 2018, Proceedings, Part II 11. Springer, pp 794–802

  109. Van Etten A (2018) You only look twice: rapid multi-scale object detection in satellite imagery. arXiv:1805.09512

  110. Nina W, Condori W, Machaca V, Villegas J, Castro E (2020) Small ship detection on optical satellite imagery with yolo and yolt. In: Advances in information and communication: proceedings of the 2020 future of information and communication conference (FICC), vol 2. Springer, pp 664–677

  111. Wang J, Yang W, Guo H, Zhang R, Xia G-S (2021) Tiny object detection in aerial images. In: 2020 25th international conference on pattern recognition (ICPR). IEEE, pp 3791–3798

  112. Wang G, Chen Y, An P, Hong H, Hu J, Huang T (2023) Uav-yolov8: a small-object-detection model based on improved yolov8 for uav aerial photography scenarios. Sensors 23(16):7190

    Article  Google Scholar 

  113. Javid I, Ghazali R, Saeed W, Batool T, Al-Wajih E (2023) Cnn with new spatial pyramid pooling and advanced filter-based techniques: revolutionizing traffic monitoring via aerial images. Sustainability 16(1):117

    Article  Google Scholar 

  114. Zhai X, Huang Z, Li T, Liu H, Wang S (2023) Yolo-drone: an optimized yolov8 network for tiny uav object detection. Electronics 12(17):3664

    Article  Google Scholar 

  115. Sun W, Dai L, Zhang X, Chang P, He X (2021) Rsod: real-time small object detection algorithm in uav-based traffic monitoring. Appl Intell 1–16

  116. Gould S, Baumstarck P, Quigley M, Ng AY, Koller D (2008) Integrating visual and range data for robotic object detection. In: Workshop on multi-camera and multi-modal sensor fusion algorithms and applications-M2SFA2 2008

  117. Wang Y, Sun Q, Liu Z, Gu L (2022) Visual detection and tracking algorithms for minimally invasive surgical instruments: a comprehensive review of the state-of-the-art. Robot Auton Syst 149:103945

    Article  Google Scholar 

  118. Koskinopoulou M, Raptopoulos F, Papadopoulos G, Mavrakis N, Maniadakis M (2021) Robotic waste sorting technology: toward a vision-based categorization system for the industrial robotic separation of recyclable waste. IEEE Robot Autom Mag 28(2):50–60

    Article  Google Scholar 

  119. Farooq AS, Zhang P (2022) A comprehensive review on the prospects of next-generation wearable electronics for individualized health monitoring, assistive robotics, and communication. Sensors Actuators A Phys 113715

  120. Kulik S, Shtanko A (2020) Experiments with neural net object detection system yolo on small training datasets for intelligent robotics. In: Advanced technologies in robotics and intelligent systems: proceedings of ITR 2019. Springer, pp 57–162

  121. Liu Y, Li W, Tan L, Huang X, Zhang H, Jiang X (2023) Db-yolov5: a uav object detection model based on dual backbone network for security surveillance. Electronics 12(15):3296

    Article  Google Scholar 

  122. Lin K, Chen S-C, Chen C-S, Lin D-T, Hung Y-P (2015) Abandoned object detection via temporal consistency modeling and back-tracing verification for visual surveillance. IEEE Trans Inf Forensic Secur 10(7):1359–1370

    Article  Google Scholar 

  123. Xu S, Zhang M, Song W, Mei H, He Q, Liotta A (2023) A systematic review and analysis of deep learning-based underwater object detection. Neurocomputing

  124. Gunes A, Guldogan MB (2016) Joint underwater target detection and tracking with the bernoulli filter using an acoustic vector sensor. Digit Signal Process 48:246–258

    Article  MathSciNet  Google Scholar 

  125. Chen L, Zhou F, Wang S, Dong J, Li N, Ma H, Wang X, Zhou H (2022) Swipenet: object detection in noisy underwater scenes. Pattern Recog 132:108926

    Article  Google Scholar 

  126. Chen G, Mao Wang K, Shen J (2023) Htdet: a hybrid transformer-based approach for underwater small object detection. Remote Sens 15(4):1076

    Article  Google Scholar 

  127. Shorten C, Khoshgoftaar TM (2019) A survey on image data augmentation for deep learning. J Big Data 6(1):1–48

    Article  Google Scholar 

  128. Cai Y, Luan T, Gao H, Wang H, Chen L, Li Y, Sotelo MA, Li Z (2021) Yolov4-5d: an effective and efficient object detector for autonomous driving. IEEE Trans Instrum Meas 70:1–13

    Google Scholar 

  129. Dipu MTA, Hossain SS, Arafat Y, Rafiq FB (2021) Real-time driver drowsiness detection using deep learning. Int J Adv Comput Sci Appl 12(7)

  130. Malkoff DB, Oliver WR (2000) Hyperspectral imaging applied to forensic medicine. In: Spectral imaging: instrumentation, applications, and analysis 3920. SPIE, pp 108–116

  131. Wetzer E, Lohninger H (2018) Image processing using color space models for forensic fiber detection. IFAC-PapersOnLine 51(2):445–450

    Article  Google Scholar 

  132. Turtiainen H, Costin A, Hämäläinen T, Lahtinen T, Sintonen L (2022) Cctvcv: computer vision model/dataset supporting cctv forensics and privacy applications. In: 2022 IEEE international conference on trust, security and privacy in computing and communications (TrustCom). IEEE, pp 1219–1226

  133. Akyon FC, Altinuc SO, Temizel A (2022) Slicing aided hyper inference and fine-tuning for small object detection. In: 2022 IEEE international conference on image processing (ICIP). IEEE, pp 966–970

  134. Wang S (2011) A review of gradient-based and edge-based feature extraction methods for object detection. In: 2011 IEEE 11th international conference on computer and information technology. IEEE, pp 277–282

  135. Choi C, Christensen HI (2012) 3d textureless object detection and tracking: an edge-based approach. In: 2012 IEEE/RSJ international conference on intelligent robots and systems. IEEE, pp 3877–3884

  136. Wang Y-Q (2014) An analysis of the viola-jones face detection algorithm. Image Process Line 4:128–148

    Article  Google Scholar 

  137. Dabhi MK, Pancholi BK (2016) Face detection system based on viola-jones algorithm. Int J Sci Res (IJSR) 5(4):62–64

    Article  Google Scholar 

  138. Ebrahimzadeh R, Jampour M (2014) Efficient handwritten digit recognition based on histogram of oriented gradients and svm. Int J Comp Appl 104(9)

  139. Psyllos AP, Anagnostopoulos C-NE, Kayafas E (2010) Vehicle logo recognition using a sift-based enhanced matching scheme. IEEE Trans Intell Transp Syst 11(2):322–328

    Article  Google Scholar 

  140. Felzenszwalb P, McAllester D, Ramanan D (2008) A discriminatively trained, multiscale, deformable part model. In: 2008 IEEE conference on computer vision and pattern recognition. IEEE. https://doi.org/10.1109/cvpr.2008.4587597

  141. Uricár M, Franc V, Hlavác V (2015) Facial landmark tracking by tree-based deformable part model based detector. In: Proceedings of the IEEE international conference on computer vision workshops, pp 10–17

  142. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE conference on computer vision and pattern recognition. IEEE. https://doi.org/10.1109/cvpr.2014.81

  143. Zhang S, Wu R, Xu K, Wang J, Sun W (2019) R-cnn-based ship detection from high resolution remote sensing imagery. Remote Sens 11(6):631

    Article  Google Scholar 

  144. Li J, Liang X, Shen S, Xu T, Feng J, Yan S (2017) Scale-aware fast r-cnn for pedestrian detection. IEEE Trans Multimed Comput 20(4):985–996

    Google Scholar 

  145. Ren S, He K, Girshick R, Sun J (2017) Faster r-CNN: towards real-time object detection with region proposal networks. IEEE transactions on pattern analysis and machine intelligence 39(6):1137–1149. [Online]. Available: https://doi.org/10.1109/tpami.2016.2577031

  146. Su Y, Li D, Chen X (2021) Lung nodule detection based on faster r-cnn framework. Comput Methods Prog Biomed 200:105866

    Article  Google Scholar 

  147. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) Ssd: single shot multibox detector. In: Computer vision–ECCV 2016: 14th European conference, Amsterdam, The Netherlands, October 11–14, 2016, proceedings, Part I 14. Springer, pp 21–37

  148. Nagrath P, Jain R, Madan A, Arora R, Kataria P, Hemanth J (2021) Ssdmnv2: A real time dnn-based face mask detection system using single shot multibox detector and mobilenetv2. Sustain Cities Soc 66:102692

  149. Shinde S, Kothari A, Gupta V (2018) Yolo based human action recognition and localization. Procedia Comput Sci 133:831–838

  150. Redmon J, Farhadi A (2017) Yolo9000: better, faster, stronger. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). IEEE. https://doi.org/10.1109/cvpr.2017.690

  151. Wu X, Sun S, Chen N, Fu M, Hou X (2020) Real-time vehicle color recognition based on yolo9000. In: Communications, signal processing, and systems: proceedings of the 2018 CSPS Volume II: Signal Processing 7th. Springer, pp 82–89

  152. Xianbao C, Guihua Q, Yu J, Zhaomin Z (2021) An improved small object detection method based on yolo v3. Pattern Anal Applic 24:1347–1355

  153. Lawal MO (2021) Tomato detection based on modified yolov3 framework. Sci Rep 11(1):1447

  154. Liu H, Fan K, Ouyang Q, Li N (2021) Real-time small drones detection based on pruned yolov4. Sensors 21(10):3374

  155. Hu X, Liu Y, Zhao Z, Liu J, Yang X, Sun C, Chen S, Li B, Zhou C (2021) Real-time detection of uneaten feed pellets in underwater images for aquaculture using an improved yolo-v4 network. Comput Electron Agric 185:106135

  156. Wu W, Liu H, Li L, Long Y, Wang X, Wang Z, Li J, Chang Y (2021) Application of local fully convolutional neural network combined with yolo v5 algorithm in small target detection of remote sensing image. PloS one 16(10):e0259283

  157. Wu W, Liu H, Li L, Long Y, Wang X, Wang Z, Li J, Chang Y (2021) Application of local fully convolutional neural network combined with yolo v5 algorithm in small target detection of remote sensing image. PloS one 16(10):e0259283

  158. Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Ke Z, Li Q, Cheng M, Nie W et al (2022) Yolov6: a single-stage object detection framework for industrial applications. arXiv:2209.02976

  159. Norkobil Saydirasulovich S, Abdusalomov A, Jamil MK, Nasimov R, Kozhamzharova D, Cho Y-I (2023) A yolov6-based improved fire detection approach for smart city environments. Sensors 23(6):3161

  160. Zhao H, Zhang H, Zhao Y (2023) Yolov7-sea: object detection of maritime uav images based on improved yolov7. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 233–238

  161. Wei G, Wan F, Zhou W, Xu C, Ye Z, Liu W, Lei G, Xu L (2023) Bfd-yolo: a yolov7-based detection method for building façade defects. Electronics 12(17):3612

  162. Sohan M, Sai Ram T, Reddy R, Venkata C (2024) A review on yolov8 and its advancements. In: International conference on data intelligence and cognitive informatics. Springer, pp 529–545

  163. Huang Z, Li L, Krizek GC, Sun L (2023) Research on traffic sign detection based on improved yolov8. J Comput Commun 11(7):226–232

  164. Yi H, Liu B, Zhao B, Liu E (2023) Small object detection algorithm based on improved yolov8 for remote sensing. IEEE J Sel Top Appl Earth Obs Remote Sens

  165. Chaturvedi A, Rajpoot V (2020) An optimized deep vision framework. Solid State Technol 63(6):561–569

  166. Lin M, Li C, Bu X, Sun M, Lin C, Yan J, Ouyang W, Deng Z (2020) Detr for crowd pedestrian detection. arXiv:2012.06785

  167. Sivapriya M, Suresh S (2023) Vit-dexinet: a vision transformer-based edge detection operator for small object detection in sar images. Int J Remote Sens 44(22):7057–7084

  168. Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 10012–10022

  169. Gao L, Zhang J, Yang C, Zhou Y (2022) Cas-vswin transformer: a variant swin transformer for surface-defect detection. Comput Ind 140:103689

Acknowledgements

The authors would like to thank the Department of Science & Technology (DST), New Delhi, Govt. of India, for their support under the DST INSPIRE Fellowship Scheme and also extend sincere appreciation to the Department of Computer Science at the Islamic University of Science & Technology, India, for their invaluable assistance and provision of essential resources, which significantly contributed to the successful execution of this work.

Author information

Contributions

Iqra - Conceptualization, Writing - Original Draft, Writing - Review & Editing. Kaisar Javeed Giri - Methodology, Writing - Review & Editing, Supervision. Mohammed Javed - Data curation, Validation, Supervision.

Corresponding author

Correspondence to Kaisar J. Giri.

Ethics declarations

Conflicts of interest

There are no conflicts of interest to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Iqra, Giri, K.J. & Javed, M. Small object detection in diverse application landscapes: a survey. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18866-w

  • DOI: https://doi.org/10.1007/s11042-024-18866-w
