Early anticipation of driver’s maneuver in semiautonomous vehicles using deep learning

  • Regular Paper
  • Published in: Progress in Artificial Intelligence

Abstract

Making machines anticipate human actions is a complex research problem. Recent studies in computer vision and assistive driving have shown that anticipating a driver's action even a few seconds in advance is challenging; these studies rely on tracking the driver's head movements, tracking eye gaze, and detecting spatiotemporal interest points. This study addresses the question of how to anticipate a driver's action while driving and how to improve the anticipation time, and it reviews existing deep learning frameworks for assistive driving. The paper differs from existing solutions in two ways. First, it proposes a simplified framework that uses only in-vehicle video of the driver and develops a driver's movement tracking (DMT) algorithm, whereas the majority of the existing state of the art relies on features captured both inside and outside the vehicle. Second, it improves image pattern recognition by fusing spatiotemporal interest points (STIPs) for movement tracking with eye cuboids, and then anticipates the action using deep learning. The proposed DMT algorithm tracks the driver's movement using STIPs extracted from the input video, while a fast eye-gaze algorithm tracks eye movements. The features extracted from the STIPs and the eye gaze are fused and analyzed by a deep recurrent neural network to improve the prediction time, giving a few extra seconds in which to anticipate the driver's action correctly. Compared with previous approaches, DMT offers a 30% improvement in anticipating the driver's action over two recently proposed deep learning algorithms.
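The pipeline outlined in the abstract (STIP-based movement tracking, eye-gaze localization, feature fusion, and recurrent anticipation) can be illustrated in code. The sketch below is a minimal, hypothetical rendering of the feature-extraction-and-fusion stage, not the paper's DMT or eye-gaze algorithm: the frame-difference-gated Harris corners are only a crude stand-in for true spatiotemporal interest points, the Haar-cascade eye detector replaces the paper's fast eye-gaze method, and the thresholds, grid size, and file name are all assumed.

```python
# Hypothetical sketch: per-frame STIP-like and eye-gaze features from driver video.
# The frame-difference-gated Harris corners are a crude proxy for spatiotemporal
# interest points; thresholds, grid size, and the video path are assumed values.
import cv2
import numpy as np

eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def frame_features(prev_gray, gray, grid=(4, 4)):
    """Fuse a coarse STIP histogram with an eye-centre estimate."""
    # Spatial corner response gated by temporal change (STIP proxy).
    corners = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)
    motion = cv2.absdiff(gray, prev_gray).astype(np.float32)
    stip_mask = (corners > 0.01 * corners.max()) & (motion > 15)

    # Histogram of STIP locations over a coarse spatial grid.
    h, w = gray.shape
    hist = np.zeros(grid)
    ys, xs = np.nonzero(stip_mask)
    for y, x in zip(ys, xs):
        hist[min(y * grid[0] // h, grid[0] - 1),
             min(x * grid[1] // w, grid[1] - 1)] += 1

    # Eye centre from a Haar cascade (stand-in for the fast eye-gaze step).
    eyes = eye_cascade.detectMultiScale(gray, 1.1, 5)
    if len(eyes):
        ex, ey, ew, eh = eyes[0]
        eye = np.array([(ex + ew / 2) / w, (ey + eh / 2) / h])
    else:
        eye = np.zeros(2)

    # Fused per-frame feature vector: normalized STIP histogram + eye position.
    return np.concatenate([hist.ravel() / (hist.sum() + 1e-6), eye])

# Usage sketch: build a fused feature sequence from a driver-facing video.
cap = cv2.VideoCapture("driver_cabin.avi")   # hypothetical file
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
seq = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    seq.append(frame_features(prev_gray, gray))
    prev_gray = gray
```

A recurrent model over these fused sequences would then anticipate the maneuver. Assuming, as is common in this literature but not confirmed by the abstract, that the deep recurrent network is an LSTM classifier over a small set of maneuver classes, a minimal Keras sketch follows; the sequence length, layer widths, and class set are illustrative assumptions, not the paper's reported configuration.

```python
# Hypothetical anticipation model: an LSTM over fused feature sequences.
# Sequence length, layer widths, and the five maneuver classes (e.g. left/right
# lane change, left/right turn, going straight) are assumed, not from the paper.
from tensorflow.keras import layers, models

SEQ_LEN, FEAT_DIM, N_MANEUVERS = 90, 18, 5   # e.g. ~3 s of video at 30 fps

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, FEAT_DIM)),
    layers.LSTM(64, return_sequences=True),
    layers.LSTM(64),
    layers.Dense(N_MANEUVERS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

At inference time such a model would run over a sliding window of fused features so that a maneuver can be flagged seconds before it is executed, which is where the anticipation-time gain reported above would be measured.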


Author information


Correspondence to Shilpa Gite.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Gite, S., Agrawal, H. & Kotecha, K. Early anticipation of driver’s maneuver in semiautonomous vehicles using deep learning. Prog Artif Intell 8, 293–305 (2019). https://doi.org/10.1007/s13748-019-00177-z

