Learn to Recover Visible Color for Video Surveillance in a Day

  • Conference paper
  • In: Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12346)

Abstract

In silicon sensors, interference between visible and near-infrared (NIR) signals is a crucial problem. For all-day video surveillance, commercial camera systems usually adopt an NIR cut filter and auxiliary NIR LED illumination to selectively block or enhance the NIR signal according to the surrounding light conditions. This switching between daytime and nighttime modes inevitably involves mechanical parts and thus requires frequent maintenance. Furthermore, images captured in nighttime mode lack chrominance, which can hinder human interpretation as well as downstream high-level computer vision algorithms. In this paper, we present a deep-learning-based approach that directly generates human-friendly, visible color for video surveillance throughout the day. To enable training, we capture well-aligned video pairs through a customized optical device and contribute a large-scale dataset, Video Surveillance in a Day (VSIAD). We propose a novel multi-task deep network with state synchronization modules to better utilize texture and chrominance information. Our trained model generates high-quality visible color images and achieves state-of-the-art performance on multiple metrics as well as in subjective judgment.
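The abstract names a multi-task network with state synchronization modules but gives no architectural details here. As a rough illustration of the idea, below is a minimal, hypothetical PyTorch sketch of a two-branch design whose texture and chrominance features are exchanged by a synchronization module; all class names, the branch split, and the feature-exchange rule are illustrative assumptions, not the authors' published architecture.

import torch
import torch.nn as nn

class SyncModule(nn.Module):
    """Exchange feature 'state' between two branches (assumed design)."""
    def __init__(self, channels):
        super().__init__()
        self.to_tex = nn.Conv2d(channels, channels, kernel_size=1)
        self.to_chr = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, tex, chr_):
        # Each branch receives a projected copy of the other's features.
        return tex + self.to_tex(chr_), chr_ + self.to_chr(tex)

class TwoBranchColorizer(nn.Module):
    """Hypothetical multi-task net: single-channel NIR in, Lab-like color out."""
    def __init__(self, ch=32):
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel_size=3, padding=1),
                nn.ReLU(inplace=True))
        self.tex_in = block(1, ch)   # texture / luminance branch
        self.chr_in = block(1, ch)   # chrominance branch
        self.sync = SyncModule(ch)
        self.tex_out = nn.Conv2d(ch, 1, kernel_size=3, padding=1)  # luminance
        self.chr_out = nn.Conv2d(ch, 2, kernel_size=3, padding=1)  # chrominance

    def forward(self, nir):
        tex, chr_ = self.tex_in(nir), self.chr_in(nir)
        tex, chr_ = self.sync(tex, chr_)
        # Concatenate luminance and chrominance predictions.
        return torch.cat([self.tex_out(tex), self.chr_out(chr_)], dim=1)

if __name__ == "__main__":
    net = TwoBranchColorizer()
    fake_nir = torch.randn(1, 1, 128, 128)  # a fake single-channel NIR frame
    print(net(fake_nir).shape)              # torch.Size([1, 3, 128, 128])

In this sketch the synchronization step simply adds a 1x1-convolved projection of each branch's features to the other branch; the modules proposed in the paper may differ substantially.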



Acknowledgements

This work was supported in part by JSPS KAKENHI Grant No. 19K20307. Part of this work was completed during Y. Zheng's visit and X. Ding's internship at Peng Cheng Laboratory.

Author information

Corresponding author: Yinqiang Zheng.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 9980 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, G. et al. (2020). Learn to Recover Visible Color for Video Surveillance in a Day. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12346. Springer, Cham. https://doi.org/10.1007/978-3-030-58452-8_29

  • DOI: https://doi.org/10.1007/978-3-030-58452-8_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58451-1

  • Online ISBN: 978-3-030-58452-8

  • eBook Packages: Computer Science, Computer Science (R0)
