Abstract
In this paper, we propose a twofold methodology for the visual detection and recognition of different types of city dumpsters with minimal human labeling of the image data set. First, we carry out transfer learning using the Google Inception-v3 convolutional neural network, which is retrained with only a small subset of labeled images from the whole data set. This first classifier is then improved by semi-supervised learning: it is retrained for two more rounds, each one increasing the number of labeled images without human supervision. We use a data set of 27,624 labeled images of dumpsters provided by Ecoembes, a Spanish nonprofit organization that cares for the environment through recycling and the eco-design of packaging. Such a data set presents a number of challenges. As in other outdoor visual tasks, there are occluding objects such as vehicles, pedestrians and street furniture, as well as other dumpsters whenever they are placed in groups. In addition, dumpsters show different degrees of deterioration, which may affect their shape and color. Finally, 35% of the images are classified according to the capacity of the container, a feature that is hard to assess from a single snapshot. Since the data set is fully labeled, we can compare our approach both against a baseline case, which performs only the transfer learning on a minimal set of labeled images with no incremental retraining, and against the best case, which uses all the labels. The experiments show that the proposed system attains an accuracy of 88%, whereas the best case reaches 93%. In other words, the proposed method attains 94% of the best performance.
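The two-stage procedure described in the abstract (an initial classifier trained on a small labeled subset, followed by two retraining rounds that enlarge the labeled set without human supervision) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier over precomputed feature vectors stands in for retraining Inception-v3, and the confidence threshold for accepting machine-generated labels is an assumption, since the abstract does not say how new labels are selected. All function names, class names and data values are hypothetical.

```python
import math


def fit_centroids(features, labels):
    """Stand-in for retraining the classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return (label, confidence) from a softmax over negative distances."""
    classes = sorted(centroids)
    scores = [-math.dist(x, centroids[c]) for c in classes]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    best = max(range(len(classes)), key=lambda i: exps[i])
    return classes[best], exps[best] / sum(exps)


def self_train(labeled_x, labeled_y, unlabeled_x, rounds=2, threshold=0.8):
    """Fit on the labeled subset, then retrain `rounds` times on a growing set.

    `threshold` is an assumed acceptance rule for pseudo-labels.
    """
    train_x, train_y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    model = fit_centroids(train_x, train_y)  # first classifier
    for _ in range(rounds):
        remaining = []
        for x in pool:
            label, conf = predict(model, x)
            if conf >= threshold:
                # Accept the model's own prediction as a label.
                train_x.append(x)
                train_y.append(label)
            else:
                remaining.append(x)
        pool = remaining
        model = fit_centroids(train_x, train_y)  # retrain on the larger set
    return model, len(train_x)


# Toy 2-D "features"; the class names are made up for illustration.
labeled_x = [[0.0, 0.0], [0.2, -0.1], [5.0, 5.0], [4.9, 5.2]]
labeled_y = ["glass", "glass", "paper", "paper"]
unlabeled_x = [[0.3, 0.2], [-0.2, 0.1], [5.3, 4.8], [4.7, 5.1], [2.5, 2.5]]

model, n_train = self_train(labeled_x, labeled_y, unlabeled_x)
print(n_train, predict(model, [0.1, 0.1])[0])
```

In this toy run the ambiguous point midway between the two clusters never reaches the threshold, so it stays unlabeled. The baseline case mentioned in the abstract would correspond to `rounds=0`, and the best case to fitting on all images with their true labels.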
Notes
Please contact {ivan.ramirez, alfredo.cuesta, juanjose.pantrigo}@urjc.es
Acknowledgements
This research has been supported by the Spanish Government research funding TIN-2015-69542-C2-1-R (MINECO/FEDER) and the Banco de Santander funding grant for the Computer Vision and Image Processing (CVIP) Excellence research group.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Ramírez, I., Cuesta-Infante, A., Pantrigo, J.J. et al. Convolutional neural networks for computer vision-based detection and recognition of dumpsters. Neural Comput & Applic 32, 13203–13211 (2020). https://doi.org/10.1007/s00521-018-3390-8
DOI: https://doi.org/10.1007/s00521-018-3390-8