
Learn to Recover Visible Color for Video Surveillance in a Day

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12346)

Abstract

In silicon sensors, interference between visible and near-infrared (NIR) signals is a crucial problem. For all-day video surveillance, commercial camera systems usually adopt an NIR cut filter and auxiliary NIR LED illumination to selectively block or enhance the NIR signal according to the surrounding light conditions. Switching between the daytime and nighttime modes inevitably involves mechanical parts and thus requires frequent maintenance. Furthermore, images captured in nighttime mode lack chrominance, which can hinder human interpretation and downstream high-level computer vision algorithms. In this paper, we present a deep learning based approach that directly generates human-friendly visible color images for all-day video surveillance. To enable training, we capture well-aligned video pairs through a customized optical device and contribute a large-scale dataset, video surveillance in a day (VSIAD). We propose a novel multi-task deep network with state synchronization modules to better utilize texture and chrominance information. Our trained model generates high-quality visible color images and achieves state-of-the-art performance on multiple metrics as well as in subjective evaluation.
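
The abstract describes a multi-task network whose branches exchange information through state synchronization modules. As a rough illustration only (the paper's actual architecture is not reproduced here, and all module and parameter names below are hypothetical), the following minimal PyTorch sketch shows one plausible form of such a design: a texture branch and a chrominance branch whose hidden states are synchronized by a 1x1-convolution fusion step.

```python
# Hypothetical sketch, NOT the authors' released code: a two-branch
# multi-task network where a state-synchronization step exchanges
# features between a texture branch and a chrominance branch.
import torch
import torch.nn as nn

class StateSync(nn.Module):
    """Fuse the two branches' hidden states with 1x1 convolutions so
    texture and chrominance features stay mutually consistent."""
    def __init__(self, ch):
        super().__init__()
        self.to_tex = nn.Conv2d(2 * ch, ch, kernel_size=1)
        self.to_chr = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, tex, chro):
        joint = torch.cat([tex, chro], dim=1)       # (B, 2*ch, H, W)
        return tex + self.to_tex(joint), chro + self.to_chr(joint)

class TwoBranchColorNet(nn.Module):
    """Toy multi-task model: predicts luminance/texture and
    chrominance from a single-channel NIR-dominated input."""
    def __init__(self, ch=32):
        super().__init__()
        self.tex_enc = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.chr_enc = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.sync = StateSync(ch)
        self.tex_head = nn.Conv2d(ch, 1, 3, padding=1)  # luminance/texture
        self.chr_head = nn.Conv2d(ch, 2, 3, padding=1)  # chrominance (ab)

    def forward(self, nir):                             # nir: (B, 1, H, W)
        tex, chro = self.tex_enc(nir), self.chr_enc(nir)
        tex, chro = self.sync(tex, chro)
        return self.tex_head(tex), self.chr_head(chro)
```

In this hypothetical setup, recombining the predicted luminance and ab chrominance channels (e.g., in the Lab color space) would yield the final visible color frame.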

Keywords

Video surveillance in a day · Color recovery · State synchronization network

Notes

Acknowledgements

This work was supported in part by JSPS KAKENHI Grant No. 19K20307. Part of this work was completed during Y. Zheng’s visit and X. Ding’s internship at Peng Cheng Laboratory.

Supplementary material

Supplementary material 1: 500725_1_En_29_MOESM1_ESM.pdf (PDF, 9.7 MB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. The University of Tokyo, Tokyo, Japan
  2. National Institute of Informatics, Tokyo, Japan
  3. Wuhan University, Hubei, China
  4. Peng Cheng Laboratory, Shenzhen, China
