Application of Image-To-Image Translation in Improving Pedestrian Detection

  • Conference paper
Artificial Intelligence and Sustainable Computing (ICSISCET 2022)

Abstract

The application of deep learning techniques is limited in low-light scenarios, because the lack of effective target regions makes it difficult to perform many visual tasks under low-intensity light. The objective of this work is to provide a framework for pedestrian recognition in low-light conditions using image-to-image translation. The key idea is to accumulate the high-quality information obtained from the combined use of infrared and visible images, which makes it possible to detect pedestrians even in low-light conditions. In this study, we use two deep learning-based models, Pyramid pix2pixGAN and YOLOv7, to generate translated infrared images and to detect pedestrians, respectively. The dataset used for training is LLVIP, a collection of visible-infrared image pairs for low-light vision tasks. Our trained model robustly detects pedestrians in low-light images and outperforms previous state-of-the-art methods.
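For readers unfamiliar with the translation stage, the block below writes out the standard pix2pix (conditional GAN) objective on which Pyramid pix2pix builds. It is a sketch for orientation, not the training loss reported in this paper. The symbols are assumptions introduced here: x is a visible image, y is its paired infrared image, z is a noise input, G is the generator, D is the discriminator, and λ is a weighting hyperparameter. The additional multi-scale terms of the Pyramid variant are not reproduced.

```latex
% Adversarial term: D scores real pairs (x, y) against generated pairs (x, G(x, z)).
\mathcal{L}_{\mathrm{cGAN}}(G, D) =
    \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

% Reconstruction term: keeps the translated image close to the paired target.
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_{1}\big]

% Full objective: the generator minimizes what the discriminator maximizes.
G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{\mathrm{cGAN}}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

Under this reading, the trained generator maps a low-light visible image to a translated infrared image, which is then passed to the detector.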

Author information

Corresponding author

Correspondence to Devarsh Patel.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Patel, D., Patel, S., Patel, M. (2023). Application of Image-To-Image Translation in Improving Pedestrian Detection. In: Pandit, M., Gaur, M.K., Kumar, S. (eds) Artificial Intelligence and Sustainable Computing. ICSISCET 2022. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-99-1431-9_37
