Onboard monocular pedestrian detection by combining spatio-temporal hog with structure from motion algorithm

  • Original Paper
Machine Vision and Applications

Abstract

In this paper, we propose a novel pedestrian detection framework for advanced driver assistance systems on mobile platforms operating in ordinary urban street environments. Unlike conventional systems, which focus on near-distance pedestrian detection by fusing multiple sensors (such as radar, laser and infrared cameras), our system detects pedestrians at all distances (near, middle and long) from a normally driven vehicle (1–40 km/h) using a single monocular camera in street scenes. Because pedestrians exhibit not only a human-like shape but also characteristic movements of the legs and arms, we use the spatio-temporal histogram of oriented gradients (STHOG) to describe both appearance and motion. The shape and movement of a pedestrian are described by a single feature produced by concatenating the spatial and temporal histograms. An STHOG detector trained with the AdaBoost algorithm is applied to images stabilized by a structure from motion (SfM) algorithm with a geometric ground constraint. The main contributions of this work are: (1) a ground constraint for a monocular camera that reduces computational cost and false alarms; (2) a preprocessing step that stabilizes the successive images captured from the moving camera with the SfM algorithm; (3) long-distance (up to 100 m) pedestrian detection at various vehicle speeds (1–40 km/h). Extensive experiments in different city scenes demonstrate the effectiveness of our algorithm.
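
The idea of concatenating a spatial and a temporal histogram can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `sthog_descriptor`, the bin counts, the finite-difference gradients, and the use of one global histogram per window (no cells or blocks) are all assumptions made for illustration. AdaBoost training, SfM stabilization and the ground constraint are not shown.

```python
import numpy as np


def sthog_descriptor(frames, n_spatial_bins=9, n_temporal_bins=9):
    """Concatenate a spatial and a temporal gradient-orientation histogram.

    frames: grayscale image window over time, shape (T, H, W),
            with at least 2 samples along every axis.
    Returns an L2-normalized vector of length
    n_spatial_bins + n_temporal_bins.
    (Illustrative assumption, not the paper's exact definition.)
    """
    volume = np.asarray(frames, dtype=np.float64)

    # Finite-difference derivatives along time, rows and columns.
    g_t, g_y, g_x = np.gradient(volume)

    # Spatial (in-plane) gradient strength and unsigned orientation in [0, pi).
    spatial_mag = np.hypot(g_x, g_y)
    theta = np.mod(np.arctan2(g_y, g_x), np.pi)

    # Temporal orientation: elevation of the 3-D gradient out of the
    # image plane, in [-pi/2, pi/2]. Zero means no change over time.
    phi = np.arctan2(g_t, spatial_mag)

    # Each voxel votes with the magnitude of its full 3-D gradient.
    weight = np.sqrt(g_x ** 2 + g_y ** 2 + g_t ** 2)

    hist_spatial, _ = np.histogram(
        theta, bins=n_spatial_bins, range=(0.0, np.pi), weights=weight
    )
    hist_temporal, _ = np.histogram(
        phi, bins=n_temporal_bins, range=(-np.pi / 2, np.pi / 2), weights=weight
    )

    # The descriptor is the concatenation of the two histograms.
    feature = np.concatenate([hist_spatial, hist_temporal])
    norm = np.linalg.norm(feature)
    return feature / norm if norm > 0 else feature
```

In this sketch, a window whose content does not change over time puts all of its temporal votes into the central temporal bin, while moving limbs spread votes into the other temporal bins. That is the kind of motion cue the abstract attributes to the temporal histogram.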




Acknowledgments

This work was partly supported by the National Natural Science Foundation of China project 61433016, JSPS KAKENHI Grant Number 21220003, “R&D Program for Implementation of Anti-Crime and Anti-Terrorism Technologies for a Safe and Secure Society”, Strategic Funds for the Promotion of Science and Technology of the Ministry of Education, Culture, Sports, Science and Technology, the Japanese Government, and the JST CREST “Behavior Understanding based on Intention-Gait Model” project.

Author information

Corresponding author

Correspondence to Chunsheng Hua.

Additional information

B. Li has left Honda R&D Co., Ltd., Japan.


Cite this article

Hua, C., Makihara, Y., Yagi, Y. et al. Onboard monocular pedestrian detection by combining spatio-temporal hog with structure from motion algorithm. Machine Vision and Applications 26, 161–183 (2015). https://doi.org/10.1007/s00138-014-0653-y


  • DOI: https://doi.org/10.1007/s00138-014-0653-y
