Pedestrian object detection with fusion of visual attention mechanism and semantic computation

  • Feng Xiao
  • Baotong Liu
  • Runa Li

Abstract

Primary visual features alone are often insufficient for reliable pedestrian detection in complex scenes. To address this, we present a method that improves pedestrian detection by combining a visual attention mechanism with semantic computation. After computing a saliency map with the visual attention mechanism, we calculate saliency maps for human skin and for the human head-shoulders region. A static visual attention model is then built with a Laplacian pyramid to obtain a total saliency map, from which pedestrians are detected. Experimental results show that the proposed method achieves state-of-the-art performance on the INRIA dataset, with 92.78% pedestrian detection accuracy at a very competitive time cost.
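
The fusion step can be illustrated with a minimal sketch of Laplacian pyramid fusion applied to two saliency maps. This is not the authors' implementation. The function names, the 2x2 block averaging used in place of Gaussian filtering, the max-magnitude rule for detail bands, and the final min-max normalization are all assumptions made for illustration. The computation of the skin and head-shoulders saliency maps themselves is not shown. Plain Python lists are used to keep the sketch dependency-free, and map sides are assumed to be divisible by 2 raised to the number of levels.

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block (stand-in for blur + decimate)."""
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(len(img[0]) // 2)]
            for r in range(len(img) // 2)]


def upsample(img):
    """Double resolution by repeating each pixel into a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def combine(a, b, op):
    """Apply a binary operation element-wise to two equally sized maps."""
    return [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarse_residual]."""
    pyramid = []
    current = [[float(v) for v in row] for row in img]
    for _ in range(levels):
        smaller = downsample(current)
        # Detail band: what is lost when going to the coarser level.
        pyramid.append(combine(current, upsample(smaller), lambda x, y: x - y))
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert laplacian_pyramid by adding detail bands back, coarse to fine."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = combine(upsample(current), detail, lambda x, y: x + y)
    return current


def fuse_saliency(map_a, map_b, levels=2):
    """Fuse two saliency maps into one total map scaled to [0, 1].

    Assumed fusion rule (hypothetical, not taken from the paper):
    keep the larger-magnitude coefficient in each detail band and
    average the coarse residuals.
    """
    pyr_a = laplacian_pyramid(map_a, levels)
    pyr_b = laplacian_pyramid(map_b, levels)
    fused = [combine(a, b, lambda x, y: x if abs(x) >= abs(y) else y)
             for a, b in zip(pyr_a[:-1], pyr_b[:-1])]
    fused.append(combine(pyr_a[-1], pyr_b[-1], lambda x, y: (x + y) / 2.0))
    total = reconstruct(fused)
    lo = min(min(row) for row in total)
    hi = max(max(row) for row in total)
    if hi == lo:
        return [[0.0 for _ in row] for row in total]
    return [[(v - lo) / (hi - lo) for v in row] for row in total]
```

In a sketch like this, `fuse_saliency(skin_map, head_shoulders_map)` would yield a single total saliency map, which could then be thresholded to propose pedestrian regions. Both input names are placeholders for maps produced elsewhere.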

Keywords

  • Visual attention mechanism
  • Semantic computation
  • Pedestrian detection
  • Skin
  • Head-shoulders

Notes

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61572392) and the Shaanxi Provincial Natural Science Foundation (Grant No. 2017JC2-08).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Engineering, Xi’an Technological University, Xi’an, China
