Detection and classification of vehicles using audio visual cues

Multimedia Tools and Applications

Abstract

This paper presents a software-based vehicle detection and classification system that classifies traffic into four classes: two-wheeler, three-wheeler, car, and heavy motor vehicle. It uses traffic video collected by a camera mounted on a vehicle parked by the side of a two-lane undivided road. Video frames containing vehicles are identified both by automatically detecting peaks in the Short Time Energy (STE) of the corresponding audio signal and by adaptive background subtraction of the video frames, followed by blob subtraction and morphological operations. This may yield multiple images of the same vehicle; such duplicates are eliminated using a Speeded Up Robust Features (SURF) matching algorithm. Classification of the resulting images is attempted in three different ways. In System 1, SVMs trained with explicit features such as Histogram of Oriented Gradients (HOG), Local Binary Pattern (LBP), and KAZE are used as classifiers, and their performance is compared. In System 2, the task is performed using a deep neural network, namely the Single Shot MultiBox Detector (SSD). The accuracy of the SSD system deteriorates when it is tested on video collected by another camera in a different environment. This issue is addressed in System 3 by retraining the SSD on the new set of images, without the use of manually labeled images. The effectiveness of all the proposed systems is validated using the collected heterogeneous traffic data.
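
To make the audio cue concrete, the following is a minimal sketch of peak detection on the Short Time Energy of a signal. It is not the authors' implementation: the abstract does not state the frame length, hop size, threshold, or minimum peak spacing, so all of those values, the function names, and the synthetic test signal are assumptions chosen for illustration only.

```python
import numpy as np


def short_time_energy(signal, frame_len, hop):
    """Sum of squared samples in each frame of frame_len samples, advancing by hop."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    energy = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop: i * hop + frame_len]
        energy[i] = np.sum(frame ** 2)
    return energy


def detect_peaks(energy, threshold, min_gap):
    """Indices of local maxima above threshold, at least min_gap frames apart."""
    peaks = []
    for i in range(1, len(energy) - 1):
        is_local_max = energy[i] > energy[i - 1] and energy[i] >= energy[i + 1]
        if is_local_max and energy[i] > threshold:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return peaks


def synthetic_audio(sr=8000, duration=6.0, events=(1.5, 4.0), seed=0):
    """Hypothetical test signal: low-level noise plus one tone burst per 'vehicle'."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sr * duration)) / sr
    audio = 0.01 * rng.standard_normal(t.size)
    for centre in events:
        envelope = np.exp(-0.5 * ((t - centre) / 0.2) ** 2)
        audio += envelope * np.sin(2 * np.pi * 200 * t)
    return audio, sr


audio, sr = synthetic_audio()
frame_len, hop = 400, 200            # assumed: 50 ms frames with a 25 ms hop at 8 kHz
ste = short_time_energy(audio, frame_len, hop)
threshold = 0.5 * ste.max()          # assumed: half of the maximum energy
peaks = detect_peaks(ste, threshold, min_gap=20)

# Convert each peak frame index to the time (in seconds) of the frame centre.
peak_times = [(p * hop + frame_len / 2) / sr for p in peaks]
print(peak_times)
```

In a pipeline like the one described, each detected peak time would point to the video frames around that instant as candidates containing a vehicle. A fixed fraction of the maximum energy is only one possible threshold; a recording with varying background noise would likely need an adaptive one.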

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

The authors would like to thank the Transportation Research Center, College of Engineering, Thiruvananthapuram (Government of Kerala), India, for the support provided for this work.

Author information

Corresponding author

Correspondence to Anuja Prasad S.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

S., A.P., Mary, L. & Koshy, B.I. Detection and classification of vehicles using audio visual cues. Multimed Tools Appl 82, 44087–44106 (2023). https://doi.org/10.1007/s11042-023-14868-2

  • DOI: https://doi.org/10.1007/s11042-023-14868-2
