Abstract
Object detection has grown in importance within computer vision, particularly for small objects. This survey examines small object detection across a range of applications and consolidates the available methodologies. Earlier surveys of small object detection have focused on specific domains; this one draws on many domains to show how versatile and broadly applicable the techniques are. It identifies the key challenges and discusses viable solutions for improving small object detection performance, which sets it apart from the existing literature. The strategies surveyed fall into three categories: transformer-based, CNN-based, and traditional methods. The paper also collates the prevalent datasets for small object detection, simplifying access to these resources, and gives a succinct overview of the evaluation metrics used to assess performance in this field. Beyond consolidating established knowledge, the survey highlights new viewpoints that may contribute to the advancement of small object detection in computer vision.
Data Availability Statement
This article does not entail data sharing, as it did not involve the creation or analysis of any datasets during the study.
Acknowledgements
The authors thank the Department of Science & Technology (DST), New Delhi, Govt. of India, for support under the DST INSPIRE Fellowship Scheme. They also thank the Department of Computer Science at the Islamic University of Science & Technology, India, for its assistance and for providing essential resources, which contributed significantly to this work.
Author information
Contributions
Iqra - Conceptualization, Writing - Original Draft, Writing - Review & Editing. Kaisar Javeed Giri - Methodology, Writing - Review & Editing, Supervision. Mohammed Javed - Data curation, Validation, Supervision.
Ethics declarations
Conflicts of interest
There are no conflicts of interest to disclose.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Iqra, Giri, K.J. & Javed, M. Small object detection in diverse application landscapes: a survey. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18866-w
DOI: https://doi.org/10.1007/s11042-024-18866-w