Abstract
Object detection is one of the essential branches of computer vision. However, detecting objects in natural scenes is challenging for several reasons, including variation in object size, overlap between objects, and similarity in colour and texture across different objects. In many real-life scenarios, the visible spectrum alone is not well suited to standard computer vision tasks. In low-visibility settings, moving outside the visible range, for example to thermal or near-infrared (NIR) imaging, can be significantly more beneficial. In this study, we use images from both the RGB and NIR spectra for the object detection task. The purpose of this paper is to examine whether the information offered by the NIR spectrum can be used in conjunction with the visible band for object detection, since together the two bands provide multimodal information. For example, because near-infrared wavelengths are less affected by haze and distortion, some objects that are visually indistinguishable in the RGB image can be spotted in the NIR image. To train such a multispectral object recognition system, we gathered a well-organized dataset of outdoor scenes in three spectra: visible (RGB), near-infrared (NIR), and thermal. In the experiments, we use the YOLOv3 algorithm to train and evaluate object detection models on NIR and RGB images separately, then train the model with four-channel input (three channels from the RGB image and one channel from the NIR image) and the corresponding annotations to see whether performance in detecting the underlying objects improves further. To assess the effectiveness of our approach, we also conducted trials with YOLOv4 and SSD models and compared our results with existing related state-of-the-art models.
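The four-channel input described above can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the function name `stack_rgb_nir` is hypothetical, and the sketch assumes that the RGB and NIR images are 8-bit, have the same height and width, and are already pixel-aligned. Scaling to the range [0, 1] is also an assumption made for illustration.

```python
import numpy as np


def stack_rgb_nir(rgb: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Fuse an RGB image (H, W, 3) and an NIR image (H, W) into (H, W, 4).

    Hypothetical helper: assumes 8-bit, co-registered inputs.
    """
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    # Accept an NIR image stored with a trailing singleton channel.
    if nir.ndim == 3 and nir.shape[2] == 1:
        nir = nir[:, :, 0]
    if nir.ndim != 2:
        raise ValueError("nir must have shape (H, W) or (H, W, 1)")
    if rgb.shape[:2] != nir.shape:
        raise ValueError("rgb and nir must have the same height and width")

    # Append NIR as the fourth channel: channel order is R, G, B, NIR.
    fused = np.concatenate([rgb, nir[:, :, None]], axis=2).astype(np.float32)
    # Assumed normalisation for 8-bit data.
    return fused / 255.0


if __name__ == "__main__":
    rgb = np.random.randint(0, 256, size=(416, 416, 3), dtype=np.uint8)
    nir = np.random.randint(0, 256, size=(416, 416), dtype=np.uint8)
    print(stack_rgb_nir(rgb, nir).shape)
```

Any detector fed with such an array needs a first convolutional layer that accepts four input channels instead of three; how that layer is configured or initialised is not specified in this abstract.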
Acknowledgements
We thank the entire team who made the collection of this dataset possible.
– Image Capturing: Md. Osman Gani, Priyam Sarkar
– Data Preprocessing: Priyam Sarkar, Md. Osman Gani
– Data Annotation: Md. Osman Gani
This work is supported by a project sponsored by SERB (Government of India, order no. SB/S3/EECE/054/2016, dated 25/11/2016) and carried out at the Centre for Microprocessor Application for Training Education and Research, CSE Department, Jadavpur University. The second author would like to thank the Department of Science and Technology for financial support through its INSPIRE Fellowship program (IF170641).
Ethics declarations
Conflicts of Interest
The authors declare that they do not have any conflict of interest.
About this article
Cite this article
Bhowmick, S., Kuiry, S., Das, A. et al. Deep Learning-Based Outdoor Object Detection Using Visible and Near-Infrared Spectrum. Multimed Tools Appl 81, 9385–9402 (2022). https://doi.org/10.1007/s11042-021-11848-2
DOI: https://doi.org/10.1007/s11042-021-11848-2