Detection and Tracking of Motorcycles in Congested Urban Environments Using Deep Learning and Markov Decision Processes

  • Jorge E. Espinosa
  • Sergio A. Velastin
  • John W. Branch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11524)


This paper describes “EspiNet”, a deep learning convolutional neural network model, used together with a Markov Decision Process (MDP) tracker to detect and track occluded motorcycles in urban environments. The model is trained and evaluated on a new public dataset of up to 10,000 annotated images, created for this research and captured in real urban traffic scenes. The images were captured with a moving camera mounted on a drone, and more than 60% of the motorcycles in them are affected by occlusions. The network design was refined through many tests and achieves a promising 88.84% average precision (AP), despite the considerable number of occluded vehicles, the movement of the camera, and the low capture angle. The model's predictions are used as input to an MDP tracker, which reaches up to 85.2% Multiple Object Tracking Accuracy (MOTA). The proposed network architecture outperforms the state-of-the-art YOLO (You Only Look Once) v3.0 and Faster R-CNN (VGG16-based) detection models, and also yields better tracking results than either of those models when each is used as the detector feeding the MDP tracker.
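For readers unfamiliar with the tracking metric quoted above, the following is a minimal sketch of how MOTA is commonly computed under the standard CLEAR MOT definition: one minus the ratio of accumulated errors (missed targets, false positives, identity switches) to the total number of ground-truth objects. The `FrameCounts` type, the `mota` function, and the example numbers are hypothetical and purely illustrative; they are not taken from the paper's evaluation code or data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FrameCounts:
    """Per-frame tracking counts (hypothetical container, for illustration only)."""
    gt: int    # ground-truth objects present in the frame
    fn: int    # missed targets (false negatives)
    fp: int    # false positives
    idsw: int  # identity switches


def mota(frames: Iterable[FrameCounts]) -> float:
    """Return MOTA = 1 - (FN + FP + IDSW) / GT, summed over all frames."""
    frames = list(frames)
    total_gt = sum(f.gt for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined when there are no ground-truth objects")
    total_errors = sum(f.fn + f.fp + f.idsw for f in frames)
    return 1.0 - total_errors / total_gt


# Made-up counts for two frames; not results from the paper.
example = [
    FrameCounts(gt=10, fn=1, fp=0, idsw=0),
    FrameCounts(gt=10, fn=1, fp=1, idsw=0),
]
score = mota(example)  # 3 errors over 20 ground-truth objects
```

Because the errors are summed before dividing, MOTA can be negative when a tracker makes more errors than there are ground-truth objects, so it is not a percentage in the usual bounded sense.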


Motorcycle detection · Motorcycle tracking · Faster R-CNN · Region based detector · CNN · Deep learning · Occluded images · Markov Decision Process



This work was partially supported by COLCIENCIAS project: Reduccion de Emisiones Vehiculares Mediante el Modelado y Gestion Optima de Trafico en Areas Metropolitanas - Caso Medellin - Area Metropolitana del Valle de Aburra, codigo 111874558167, CT 049-2017. Universidad Nacional de Colombia. Proyecto HERMES 25374. The authors gratefully acknowledge the support of NVIDIA Corporation with the donation of GPUs used for this research.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Politécnico Colombiano Jaime Isaza Cadavid, Medellín, Colombia
  2. Cortexica Vision Systems Ltd., London, UK
  3. Queen Mary University of London, London, UK
  4. University Carlos III Madrid, Madrid, Spain
  5. Universidad Nacional de Colombia – Sede Medellín, Medellín, Colombia
