Deep Learning-Based Outdoor Object Detection Using Visible and Near-Infrared Spectrum

  • 1212: Deep Learning Techniques for Infrared Image/Video Understanding
  • Published in Multimedia Tools and Applications

Abstract

Object detection is one of the essential branches of computer vision. However, detecting objects in natural scenes is challenging for several reasons, including variation in object size, overlap between objects, and similarity in the colour and texture of different objects. In many real-life scenarios, the visible spectrum alone is not well suited to standard computer vision tasks. In low-visibility settings, moving beyond the visible range, for example to thermal or near-infrared (NIR) imaging, is significantly more beneficial. In this study, we use images from both the RGB and NIR spectra for object detection. The purpose of this paper is to determine whether the information offered by the NIR spectrum can be used in conjunction with the visible band for object detection, since the two bands carry different information about the same scene. For example, because near-infrared wavelengths are less prone to haze and distortion, some objects that are visually indistinguishable in the RGB image can be spotted in the NIR image. To train such a multispectral object recognition system, we gathered a well-organized dataset of outdoor scenes in three spectra: visible (RGB), near-infrared (NIR), and thermal. In the experiments, we use the YOLOv3 algorithm to train and evaluate object detection models on NIR and RGB images separately. We then train the model with four-channel input (three channels from the RGB image and one channel from the NIR image) and the corresponding annotations to see whether detection performance improves further. To determine the effectiveness of our approach, we also conducted trials with the YOLOv4 and SSD models and compared our results with existing related state-of-the-art models.
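The four-channel input described in the abstract can be illustrated with a minimal sketch. It is not the authors' preprocessing pipeline: the function name `stack_rgb_nir`, the 416 × 416 image size, the 8-bit pixel range, and the scaling to [0, 1] are all assumptions made for illustration, and the RGB and NIR images are assumed to be already pixel-aligned.

```python
import numpy as np


def stack_rgb_nir(rgb, nir):
    """Stack an (H, W, 3) RGB image and an (H, W) NIR image into (H, W, 4).

    Hypothetical helper: assumes both images are 8-bit and pixel-aligned.
    """
    rgb = np.asarray(rgb)
    nir = np.asarray(nir)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    # Accept a single-channel NIR image stored as (H, W, 1).
    if nir.ndim == 3 and nir.shape[2] == 1:
        nir = nir[:, :, 0]
    if nir.shape != rgb.shape[:2]:
        raise ValueError("nir must have shape (H, W) matching rgb")
    # Append NIR as the fourth channel after R, G, B.
    fused = np.concatenate([rgb, nir[:, :, np.newaxis]], axis=2)
    # Scale 8-bit values to [0, 1] (an assumed normalisation).
    return fused.astype(np.float32) / 255.0


# Synthetic stand-ins for an aligned RGB/NIR pair (assumed 416 x 416).
rgb = np.zeros((416, 416, 3), dtype=np.uint8)
nir = np.full((416, 416), 255, dtype=np.uint8)
fused = stack_rgb_nir(rgb, nir)
print(fused.shape)  # (416, 416, 4)
```

A detector consuming this array would need its first convolutional layer configured for four input channels instead of three; how the authors configured YOLOv3 for this is not stated in the abstract.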

Acknowledgements

We want to thank the entire team of people who made the collection of this dataset possible.

– Image Capturing: Md. Osman Gani, Priyam Sarkar

– Data Preprocessing: Priyam Sarkar, Md. Osman Gani

– Data Annotation: Md. Osman Gani

This work is supported by the project sponsored by SERB (Government of India, order no. SB/S3/EECE/054/2016, dated 25/11/2016) and carried out at the Centre for Microprocessor Application for Training Education and Research, CSE Department, Jadavpur University. The second author would like to thank the Department of Science and Technology for the financial support provided through its INSPIRE Fellowship program (IF170641).

Author information

Corresponding author

Correspondence to Somenath Kuiry.

Ethics declarations

Conflicts of Interest

The authors declare that they do not have any conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bhowmick, S., Kuiry, S., Das, A. et al. Deep Learning-Based Outdoor Object Detection Using Visible and Near-Infrared Spectrum. Multimed Tools Appl 81, 9385–9402 (2022). https://doi.org/10.1007/s11042-021-11848-2

  • DOI: https://doi.org/10.1007/s11042-021-11848-2
