Framework Comparison of Neural Networks for Automated Counting of Vehicles and Pedestrians

  • Galo Lalangui
  • Jorge Cordero
  • Omar Ruiz-Vivanco
  • Luis Barba-Guamán
  • Jessica Guerrero
  • Fátima Farías
  • Wilmer Rivas
  • Nancy Loja
  • Andrés Heredia
  • Gabriel Barros-Gavilanes
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1096)


This paper presents a comparison of three neural network frameworks used to produce volumetric counts in an automated and continuous way. In addition to cars, the application counts pedestrians. The frameworks used are: SSD MobileNet re-trained, SSD MobileNet pre-trained, and GoogLeNet pre-trained. The evaluation data set has a total duration of 60 min and comes from three different cameras. Images from the real deployment videos are included during training to enrich the detectable cases. Traditional detection models applied to vehicle counting systems usually achieve high scores for cars seen from the front; however, when the observer or camera views the scene from the side, some models yield lower detection and classification scores. A new data set with fewer classes reaches performance values similar to those of methods trained on default data sets. Results show that for the car class, recall and precision reach 0.97 and 0.90 respectively in the best case, using a pre-trained model, while for the people class a re-trained model provides better results, with precision and recall of 1 and 0.82.
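The precision and recall figures quoted above follow the standard detection-evaluation definitions. A minimal sketch of how such values arise from counts of true positives, false positives, and false negatives (the counts below are hypothetical, not taken from the paper's data set):

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of reported detections that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    """Fraction of ground-truth objects that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical example: a counter that detects 97 of 100 ground-truth
# cars (3 missed) while emitting 11 false positives.
tp, fp, fn = 97, 11, 3
print(round(precision(tp, fp), 2))  # 0.9
print(round(recall(tp, fn), 2))     # 0.97
```

These are per-class metrics, so the paper can report, for example, high recall for cars from one model and high precision for people from another.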


Convolutional Neural Networks · Transfer learning · Automatic counter · Classification · Tracking · Single shot detector · MobileNet



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. LIDI, Universidad del Azuay, Cuenca, Ecuador
  2. Universidad Técnica Particular de Loja, Loja, Ecuador
  3. Universidad Técnica de Machala, Machala, Ecuador