Pedestrian detection using multi-scale squeeze-and-excitation module

Lee, Yongwoo; Hwang, Hyekyoung; Shin, Jitae; Oh, Byung Tae

doi:10.1007/s00138-020-01105-1

Original Paper
Published: 03 August 2020

Machine Vision and Applications, volume 31, article number 55 (2020)

Yongwoo Lee¹, Hyekyoung Hwang¹, Jitae Shin² (ORCID: orcid.org/0000-0002-2599-3331) & Byung Tae Oh³

Abstract

Computer vision is a major research area for autonomous vehicles. However, understanding a road scene is often challenging, especially when objects are small and overlapping. To address these problems, this paper proposes a deep learning-based pedestrian detection method for small and overlapping objects. The method adopts a parallel feature pyramid network with multi-scale feature layers and introduces a multi-scale squeeze-and-excitation (MSSE) module for better selection of multi-scale features. The MSSE module helps detect small objects by increasing the final feature resolution. In addition, its channel-wise feature representation emphasizes important channels and reduces the influence of weakly related features. Finally, the object proposals are processed with soft non-maximum suppression to separate overlapping objects. In an ablation study, the proposed method shows significant performance improvements.
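
As an illustration only, the two generic building blocks named in the abstract can be sketched in plain Python: standard squeeze-and-excitation channel gating (Hu et al.) and Gaussian soft non-maximum suppression (Bodla et al.). This is not the authors' MSSE module or their implementation. The function names, the bias-free weight matrices `w1` and `w2`, and the `sigma` and `score_threshold` defaults are assumptions chosen for the example.

```python
import math


def se_reweight(feature_map, w1, w2):
    """Generic squeeze-and-excitation gating (hypothetical helper).

    feature_map: list of C channels, each an H x W list of lists.
    w1: (C/r) x C bottleneck weights, w2: C x (C/r) weights (no biases).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation: fully connected layer + ReLU, then fully connected + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian soft NMS: decay overlapping scores instead of deleting boxes."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Pick the highest-scoring remaining box.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best)
        kept.append((best_box, best_score))
        # Decay every other score by exp(-IoU^2 / sigma); drop near-zero ones.
        decayed = []
        for box, score in remaining:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:
                decayed.append((box, score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    # Two heavily overlapping boxes and one separate box (made-up values).
    example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    example_scores = [0.9, 0.8, 0.7]
    for box, score in soft_nms(example_boxes, example_scores):
        print(box, round(score, 3))
```

In the usage lines at the bottom, the box that overlaps the top detection keeps a reduced score instead of being discarded, which is the behaviour that distinguishes soft NMS from hard NMS when pedestrians overlap.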

Acknowledgements

This research was partly supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2017R1D1A1B03031752) and was partly supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Centre) support program (IITP-2020-2018-0-01798) supervised by the IITP (Institute for Information & communications Technology Promotion).

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Republic of Korea
Yongwoo Lee & Hyekyoung Hwang
College of Information and Communication Engineering, Sungkyunkwan University, Suwon, Republic of Korea
Jitae Shin
School of Electronic, Korea Aerospace University, Gyeonggi-Do, Republic of Korea
Byung Tae Oh

Corresponding author

Correspondence to Jitae Shin.

About this article

Cite this article

Lee, Y., Hwang, H., Shin, J. et al. Pedestrian detection using multi-scale squeeze-and-excitation module. Machine Vision and Applications 31, 55 (2020). https://doi.org/10.1007/s00138-020-01105-1

Received: 17 May 2019
Revised: 24 May 2020
Accepted: 17 July 2020
Published: 03 August 2020
DOI: https://doi.org/10.1007/s00138-020-01105-1
