DENS-YOLOv6: a small object detection model for garbage detection on water surface

Li, Ning; Wang, Mingliang; Yang, Gaochao; Li, Bo; Yuan, Baohua; Xu, Shoukun

doi:10.1007/s11042-023-17679-7

Published: 30 November 2023

Multimedia Tools and Applications, Volume 83, pages 55751–55771 (2024)

Ning Li^1,2, Mingliang Wang^1, Gaochao Yang^1, Bo Li^1,3, Baohua Yuan^1 & Shoukun Xu^1

Abstract

Detecting garbage on the water surface is important for water surface garbage monitoring and automated salvage. However, in such scenes the water background occupies most of the image, while the objects to be detected are small. Moreover, the objects are easily affected by noise such as lighting, water waves, and reflections, which makes it difficult to extract object features and reduces detection accuracy. In this paper, we propose a Detail Enhancement Noise Suppression YOLOv6 (DENS-YOLOv6) detection algorithm based on YOLOv6. First, to better capture the detailed feature information of small objects, we design a Detail Information Enhancement Module (DIEM) based on atrous convolution. Second, to suppress noise interference on small objects, we develop an Adaptive Noise Suppression Module (ANSM). Finally, to improve the stability and convergence speed of model training, we employ a regression loss function based on the Normalized Wasserstein Distance (NWD) metric. Experiments were conducted on the Flow+ dataset, which contains a large number of small objects, and on the publicly available Pascal VOC2007 dataset. The mAP\(_S\) scores reached 40.6% and 11.4%, respectively. Compared with other models, DENS-YOLOv6 achieved the highest small object detection accuracy.
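
To make the NWD-based regression loss concrete, the following is a minimal sketch of the general NWD formulation: each box is modelled as a 2D Gaussian, and the Wasserstein distance between the two Gaussians is mapped into (0, 1]. It is not the exact loss used in DENS-YOLOv6, whose details are not given in the abstract. The function names `nwd` and `nwd_loss`, the `(cx, cy, w, h)` box layout, and the normalizing constant `c=12.8` are assumptions made for illustration.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two boxes given as (cx, cy, w, h).

    Assumption: `c` is a dataset-dependent normalizing constant; 12.8 is only
    a placeholder here and is not taken from the DENS-YOLOv6 paper.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Each box is treated as a 2D Gaussian with mean (cx, cy) and covariance
    # diag(w^2 / 4, h^2 / 4). For two such Gaussians the squared 2-Wasserstein
    # distance reduces to this squared Euclidean distance.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance to a similarity in (0, 1].
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Regression loss: 0 for identical boxes, approaching 1 as they diverge."""
    return 1.0 - nwd(pred_box, target_box, c)


if __name__ == "__main__":
    target = (10.0, 10.0, 4.0, 4.0)
    shifted = (20.0, 10.0, 4.0, 4.0)  # same size, no overlap with target
    print(nwd_loss(target, target))
    print(nwd_loss(shifted, target))
```

Unlike IoU, which is zero for any pair of non-overlapping boxes, this similarity still varies smoothly with the distance between them, which is the usual motivation for using it on very small objects.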

Data availability and access

Data is available on request from the authors.

Acknowledgements

This work was supported by the Jiangsu Petrochemical Process Key Equipment Digital Twin Technology Engineering Research Center Open Project (DTEC202103).

Author information

Authors and Affiliations

School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China
Ning Li, Mingliang Wang, Gaochao Yang, Bo Li, Baohua Yuan & Shoukun Xu
School of Computer and Information Engineering, HoHai University, Nanjing, 210098, China
Ning Li
Jiangsu Petrochemical Process Key Equipment Digital Twin Technology Engineering Research Center, Changzhou University, Changzhou, 213164, China
Bo Li

Contributions

Ning Li and Mingliang Wang led the conception and design of the work and the acquisition and interpretation of data, and drafted and revised the manuscript. Gaochao Yang, Bo Li and Baohua Yuan reviewed the manuscript, and Shoukun Xu approved the final version.

Corresponding author

Correspondence to Shoukun Xu.

Ethics declarations

Conflict of interest

The authors declare that there are no competing interests related to this study.

Ethical and informed consent for data used

Ethical and informed consent for data used.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, N., Wang, M., Yang, G. et al. DENS-YOLOv6: a small object detection model for garbage detection on water surface. Multimed Tools Appl 83, 55751–55771 (2024). https://doi.org/10.1007/s11042-023-17679-7

Received: 11 September 2023
Revised: 01 November 2023
Accepted: 18 November 2023
Published: 30 November 2023
Issue Date: May 2024
DOI: https://doi.org/10.1007/s11042-023-17679-7
