Abstract
Intelligent detection and classification of kitchen waste can promote ecological sustainability by replacing inefficient manual processes. However, the non-degradable items mixed into kitchen waste typically follow a long-tailed class distribution: a few classes account for most instances, while many classes appear only rarely. This imbalance makes convolutional neural network-based object detectors difficult to train and leads to unsatisfactory detection of tail-class waste. To address this challenge, we propose a class-instance balanced detector (CIB-Det) for intelligent detection and classification of kitchen waste. CIB-Det applies two strategies to the loss function: a class-balanced strategy (CBS) and an instance-balanced strategy (IBS). The CBS places more emphasis on tail classes, while the IBS adaptively concentrates on hard-to-classify instances during training. Together, the two strategies address the long-tailed issue at both the class level and the instance level. Experiments on a real dataset of kitchen waste images support the effectiveness of CIB-Det for kitchen waste detection.
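The abstract does not give the CBS or IBS formulas, so the following is only an illustrative sketch of the general idea of combining a class-level weight with an instance-level weight in a classification loss. It is not the CIB-Det loss. As stand-ins, it assumes the effective-number class weighting of Cui et al. for the class-level term and a fixed focal-style factor in the manner of Lin et al. for the instance-level term. The function names, `beta`, and `gamma` are hypothetical choices made for this example.

```python
import math


def class_balanced_weights(class_counts, beta=0.999):
    """Per-class weights from the inverse effective number of samples.

    Assumption: this stands in for a class-balanced strategy; it is the
    weighting of Cui et al., not the CBS defined in the paper.
    Weights are rescaled so that they sum to the number of classes.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def balanced_loss(probs, target, class_weights, gamma=2.0):
    """Cross-entropy for one instance with class and instance weighting.

    Assumption: the focal-style factor (1 - p_t) ** gamma stands in for an
    instance-balanced strategy. gamma is fixed here, whereas the paper
    describes its IBS as adaptive during training.
    """
    p_t = probs[target]                  # predicted probability of the true class
    instance_weight = (1.0 - p_t) ** gamma  # larger for hard (low p_t) instances
    return -class_weights[target] * instance_weight * math.log(p_t)


# Hypothetical two-class setting: class 0 is a head class, class 1 a tail class.
weights = class_balanced_weights([1000, 10])
easy_head = balanced_loss([0.9, 0.1], 0, weights)
hard_head = balanced_loss([0.2, 0.8], 0, weights)
hard_tail = balanced_loss([0.8, 0.2], 1, weights)
```

In this sketch the tail class receives the larger class weight, and for a fixed class a poorly classified instance contributes more loss than a well classified one, which is the qualitative behavior the abstract attributes to the two strategies.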
Additional information
This work was supported by the National Key Research and Development Program of China (Grant No. 2021YFC1910402).
About this article
Cite this article
Fang, L., Tang, Q., Ouyang, L. et al. Long-tailed object detection of kitchen waste with class-instance balanced detector. Sci. China Technol. Sci. 66, 2361–2372 (2023). https://doi.org/10.1007/s11431-023-2400-1
DOI: https://doi.org/10.1007/s11431-023-2400-1