
Detection of 3D bounding boxes of vehicles using perspective transformation for accurate speed measurement

  • Original Paper
  • Published in: Machine Vision and Applications

Abstract

Detection and tracking of vehicles captured by traffic surveillance cameras are key components of intelligent transportation systems. We present an improved version of our algorithm for the detection of 3D bounding boxes of vehicles, their tracking, and subsequent speed estimation. Our algorithm utilizes the known geometry of vanishing points in the surveilled scene to construct a perspective transformation. The transformation enables an intuitive simplification of the problem: detecting 3D bounding boxes reduces to detecting 2D bounding boxes with one additional parameter using a standard 2D object detector. The main contributions of this paper are an improved construction of the perspective transformation, which is more robust and fully automatic, and an extended experimental evaluation of speed estimation. We test our algorithm on the speed estimation task of the BrnoCompSpeed dataset. We evaluate our approach in different configurations to gauge the trade-off between accuracy and computational cost, as well as the benefits of 3D bounding box detection over 2D detection. All of the tested configurations run in real time and are fully automatic. Compared to other published state-of-the-art fully automatic results, our algorithm reduces the mean absolute speed measurement error by 32% (1.10 km/h to 0.75 km/h) and the median absolute error by 40% (0.97 km/h to 0.58 km/h).
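To make the core idea concrete, the following is a minimal sketch of one standard way to build a rectifying perspective transformation from two vanishing points. It is not the paper's exact construction; the vanishing-point coordinates, function names, and file paths are illustrative assumptions only. The homography sends both vanishing points to points at infinity, so lines toward the direction of travel and lines perpendicular to it become aligned with the image axes, and the rectified frame can then be passed to an ordinary 2D detector.

```python
# Minimal sketch, assuming two vanishing points are already known: build a
# homography that maps them to points at infinity so that the two scene
# directions become parallel to the image axes. Not the authors' exact
# construction; coordinates and file names below are placeholders.
import numpy as np
import cv2

def rectifying_homography(vp1, vp2, origin):
    # Columns of M are the homogeneous preimages of the rectified basis:
    # vp1 -> ideal point (1, 0, 0), vp2 -> ideal point (0, 1, 0),
    # origin -> the finite point (0, 0, 1).
    M = np.array([
        [vp1[0], vp2[0], origin[0]],
        [vp1[1], vp2[1], origin[1]],
        [1.0,    1.0,    1.0],
    ])
    H = np.linalg.inv(M)
    return H / H[2, 2]

# Placeholder vanishing points (e.g. estimated from vehicle motion and from
# edges perpendicular to the direction of travel).
vp_travel = (960.0, -3500.0)
vp_cross = (-8000.0, 420.0)
H = rectifying_homography(vp_travel, vp_cross, origin=(640.0, 700.0))

# In practice H would be composed with a similarity transform so that the
# warped road region fits the output resolution; here we warp directly.
frame = cv2.imread("frame.jpg")
warped = cv2.warpPerspective(frame, H, (1280, 720))
# A standard 2D detector is run on `warped`; its boxes are mapped back to the
# original image using np.linalg.inv(H).
```

In such a rectified view, a vehicle's 3D bounding box projects to an axis-aligned 2D box plus one additional parameter, which is the simplification the abstract refers to.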


Notes

  1. https://github.com/kocurvik/BCS_results.

  2. The dataset contains five subsets of videos. For the subsets which contain an odd number of videos, we use the odd video for testing.


Acknowledgements

The authors would like to thank Adam Herout for his valuable comments. The authors also gratefully acknowledge the support of NVIDIA Corporation with the donation of GPUs.

Author information

Corresponding author

Correspondence to Viktor Kocur.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 261370 KB)

Supplementary material 2 (mp4 263368 KB)

About this article

Cite this article

Kocur, V., Ftáčnik, M. Detection of 3D bounding boxes of vehicles using perspective transformation for accurate speed measurement. Machine Vision and Applications 31, 62 (2020). https://doi.org/10.1007/s00138-020-01117-x
