Siamese Dense Network for Reflection Removal with Flash and No-Flash Image Pairs

Abstract

This work addresses reflection removal with flash and no-flash image pairs to separate reflection from transmission. When objects are covered by glass, the no-flash image usually contains reflection, so flash is used to enhance transmission details. However, the flash image suffers from a specular highlight on the glass surface caused by the flash itself. In this paper, we propose a siamese dense network (SDN) for reflection removal with flash and no-flash image pairs. SDN extracts shareable and complementary features via concatenated siamese dense blocks. An image fusion block fuses the intermediate outputs of the two branches. Since severe information loss occurs in the specular highlight, we detect the specular highlight in the flash image based on the gradient of the maximum chromaticity. We observe that flash causes various artifacts such as tone distortion and inhomogeneous brightness. Thus, in addition to synthetic datasets, we collect 758 real flash and no-flash image pairs (including their ground truth) with different cameras to improve generalization. Experiments show that the proposed method successfully removes reflections using flash and no-flash image pairs and outperforms state-of-the-art methods in terms of visual quality and quantitative measurements. In addition, we apply the SDN to color/depth image pairs and achieve both color reflection removal and depth filling.

References

  1. Agrawal, A., Raskar, R., Nayar, S. K., & Li, Y. (2005). Removing photography artifacts using gradient projection and flash-exposure sampling. ACM Transactions on Graphics (TOG), 24(3), 828–835.

  2. Aksoy, Y., Kim, C., Kellnhofer, P., Paris, S., Elgharib, M., Pollefeys, M., & Matusik, W. (2018). A dataset of flash and ambient illumination pairs from the crowd. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 634–649).

  3. Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., & Shah, R. (1994). Signature verification using a siamese time delay neural network. In Advances in neural information processing systems (pp. 737–744).

  4. Camplani, M., & Salgado, L. (2012). Efficient spatio-temporal hole filling strategy for kinect depth maps. In Proceedings of SPIE 8290, three-dimensional image processing (3DIP) and applications II (Vol. 8290, p. 82900E). International Society for Optics and Photonics.

  5. Chang, Y., & Jung, C. (2019). Single image reflection removal using convolutional neural networks. IEEE Transactions on Image Processing, 28(4), 1954–1966.

  6. Chang, Y., Jung, C., Ke, P., Song, H., & Hwang, J. (2018). Automatic contrast-limited adaptive histogram equalization with dual gamma correction. IEEE Access, 6, 11782–11792.

  7. Chopra, S., Hadsell, R., & LeCun, Y. (2005). Learning a similarity metric discriminatively, with application to face verification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  8. Diamant, Y., & Schechner, Y. Y. (2008). Overcoming visual reverberations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8). IEEE.

  9. Eisemann, E., & Durand, F. (2004). Flash photography enhancement via intrinsic relighting. In ACM Transactions on Graphics (TOG) (Vol. 23, pp. 673–678). ACM.

  10. Fan, Q., Yang, J., Hua, G., Chen, B., & Wipf, D. (2017). A generic deep architecture for single image reflection removal and image smoothing. In Proceedings of the IEEE Conference on Computer Vision (ICCV) (pp. 3258–3267). IEEE.

  11. Farid, H., & Adelson, E. H. (1999). Separating reflections and lighting using independent components analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Vol. 1, pp. 262–267). IEEE.

  12. Guo, X., Cao, X., & Ma, Y. (2014). Robust separation of reflection from multiple images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2187–2194).

  13. Han, B. J., & Sim, J. Y. (2017). Reflection removal using low-rank matrix completion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  14. Han, B. J., & Sim, J. Y. (2018). Glass reflection removal using co-saliency-based image alignment and low-rank matrix completion in gradient domain. IEEE Transactions on Image Processing, 27(10), 4873–4888.

  15. Hang, Z., & Dana, K. (2018). Multi-style generative network for real-time transfer (pp. 349–365).

  16. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).

  17. He, S., & Lau, R. W. (2014). Saliency detection with flash and no-flash image pairs. In Proceedings of the European Conference on Computer Vision (pp. 110–124). Springer.

  18. Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 2261–2269).

  19. Kim, H., Jin, H., Hadap, S., & Kweon, I. (2013). Specular reflection separation using dark channel prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1460–1467).

  20. Kong, N., Tai, Y. W., & Shin, S. Y. (2012). A physically-based approach to reflection separation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 9–16). IEEE.

  21. Levin, A., Lischinski, D., & Weiss, Y. (2004). Colorization using optimization. ACM Transactions on Graphics, 23(3), 689–694.

  22. Levin, A., & Weiss, Y. (2007). User assisted separation of reflections from a single image using a sparsity prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(9), 1647–1654.

  23. Li, Y., & Brown, M. S. (2013). Exploiting reflection change for automatic reflection removal. In Proceedings of the IEEE Conference on Computer Vision (pp. 2432–2439).

  24. Li, Y., & Brown, M. S. (2014). Single image layer separation using relative smoothness. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2752–2759).

  25. Li, Y., Tan, R. T., Guo, X., Lu, J., & Brown, M. S. (2016). Rain streak removal using layer priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2736–2744).

  26. Lu, C., Drew, M. S., & Finlayson, G. D. (2006). Shadow removal via flash/noflash illumination. In Proceedings of the IEEE Workshop on Multimedia Signal Processing (pp. 198–201). IEEE.

  27. Matsui, S., Okabe, T., Shimano, M., & Sato, Y. (2011). Image enhancement of low-light scenes with near-infrared flash images. Information and Media Technologies, 6(1), 202–210.

  28. Mertens, T., Kautz, J., & Van Reeth, F. (2009). Exposure fusion: A simple and practical alternative to high dynamic range photography. Computer Graphics Forum, 28(1), 161–171.

  29. Nayar, S. K., Fang, X. S., & Boult, T. (1997). Separation of reflection components using color and polarization. International Journal of Computer Vision, 21(3), 163–186.

  30. Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., & Efros, A. A. (2016). Context encoders: Feature learning by inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2536–2544).

  31. Petschnigg, G., Szeliski, R., Agrawala, M., Cohen, M., Hoppe, H., & Toyama, K. (2004). Digital photography with flash and no-flash image pairs. In ACM Transactions on Graphics (TOG) (Vol. 23, pp. 664–672). ACM.

  32. Punnappurath, A., & Brown, M. S. (2019). Reflection removal using a dual-pixel sensor. In The IEEE conference on computer vision and pattern recognition (CVPR).

  33. Schechner, Y. Y., Kiryati, N., & Basri, R. (2000). Separation of transparent layers using focus. International Journal of Computer Vision, 39(1), 25–39.

  34. Schechner, Y. Y., Shamir, J., & Kiryati, N. (2000). Polarization and statistical analysis of scenes containing a semireflector. JOSA A, 17(2), 276–284.

  35. Seo, H. J., & Milanfar, P. (2012). Robust flash denoising/deblurring by iterative guided filtering. EURASIP Journal on Advances in Signal Processing, 2012(1), 3.

  36. Shen, J., & Cheung, S. C. S. (2013). Layer depth denoising and completion for structured-light rgb-d cameras. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1187–1194).

  37. Shih, Y., Krishnan, D., Durand, F., & Freeman, W. T. (2015). Reflection removal using ghosting cues. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3193–3201).

  38. Shirai, K., Okamoto, M., & Ikehara, M. (2011). Noiseless no-flash photo creation by color transform of flash image. In Proceedings of the IEEE Conference on Image Processing (ICIP) (pp. 3437–3440). IEEE.

  39. Silberman, N., Hoiem, D., Kohli, P., & Fergus, R. (2012). Indoor segmentation and support inference from rgbd images. In Proceedings of the European Conference on Computer Vision. Springer.

  40. Simon, C., & Park, I. K. (2015). Reflection removal for in-vehicle black box videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4231–4239).

  41. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556

  42. Song, S., Lichtenberg, S. P., & Xiao, J. (2015). Sun rgb-d: A rgb-d scene understanding benchmark suite. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 567–576).

  43. Sun, J., Chang, Y., Jung, C., & Feng, J. (2019). Multi-modal reflection removal using convolutional neural networks. IEEE Signal Processing Letters, 26(7), 1011–1015.

  44. Sun, J., Kang, S. B., Xu, Z. B., Tang, X., & Shum, H. Y. (2007). Flash cut: Foreground extraction with flash and no-flash image pairs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8). IEEE.

  45. Sun, J., Li, Y., Kang, S. B., & Shum, H. Y. (2006). Flash matting. ACM Transactions on Graphics (TOG), 25(3), 772–778.

  46. Szeliski, R., Avidan, S., & Anandan, P. (2000). Layer extraction from multiple images containing reflections and transparency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Vol. 1, pp. 246–253). IEEE.

  47. Tan, T., Nishino, K., & Ikeuchi, K. (2003). Illumination chromaticity estimation using inverse-intensity chromaticity space. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  48. Wan, R., Shi, B., Duan, L. Y., Tan, A. H., & Kot, A. C. (2018). Crrn: Multi-scale guided concurrent reflection removal network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4777–4785).

  49. Wei, K., Yang, J., Fu, Y., Wipf, D., & Huang, H. (2019). Single image reflection removal exploiting misaligned training data and network enhancements. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 8178–8187).

  50. Yang, J., Gong, D., Liu, L., & Shi, Q. (2018). Seeing deeply and bidirectionally: A deep learning approach for single image reflection removal. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 654–669).

  51. Yang, J., Li, H., Dai, Y., & Tan, R. T. (2016). Robust optical flow estimation of double-layer images under transparency or reflection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1410–1419).

  52. Yang, Y., Ma, W., Zheng, Y., Cai, J. F., & Xu, W. (2019). Fast single image reflection suppression via convex optimization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 8141–8149).

  53. Yi, S., Wang, X., & Tang, X. (2014). Deep learning face representation from predicting 10,000 classes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  54. Yu, L., Xun, C., Cheng, J., & Hu, P. (2017). A medical image fusion method based on convolutional neural networks. In Proceedings of the International Conference on Information Fusion.

  55. Yu, L., Xun, C., Hu, P., & Wang, Z. (2017). Multi-focus image fusion with a deep convolutional neural network. Information Fusion, 36, 191–207.

  56. Zagoruyko, S., & Komodakis, N. (2015). Learning to compare image patches via convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4353–4361).

  57. Zhang, L., Zhang, L., Mou, X., & Zhang, D. (2011). Fsim: A feature similarity index for image quality assessment. IEEE Transactions on Image Processing, 20(8), 2378–2386.

  58. Zhang, X., Ng, R., & Chen, Q. (2018). Single image reflection separation with perceptual losses. arXiv preprint arXiv:1806.05376

Author information

Corresponding author

Correspondence to Cheolkon Jung.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the National Natural Science Foundation of China (No. 61872280) and the International S&T Cooperation Program of China (No. 2014DFG12780).

Communicated by Stephen Lin.

Appendices

Appendix A

We present a tripartite database including real and synthetic images in Sect. III. B. For the synthetic images generated from the data in Aksoy et al. (2018), we obtain natural glass images by directly applying Eqs. (2), (3) and (4), because both flash and no-flash images are available. However, there are no flash images in SUN RGB-D. Simply brightening the whole image cannot simulate the flash condition, because objects are brightened non-uniformly in flash images. This non-uniformity is mainly caused by the distance between the camera and the objects: objects near the camera are exposed to strong illumination, whereas distant objects receive weak illumination. RGB-D data contain depth information, which helps us brighten the image hierarchically. We first fill the holes in the depth images and then normalize the depth values to the range [0, 1] using their minimum and maximum values. Finally, we hierarchically enhance the color images according to the following function:

$$\begin{aligned} Y_F(p) = Y(p)^{0.5 + D(p)/1.3} \end{aligned} \tag{18}$$

where \(Y(p)\) is the pixel value at location \(p\) in the Y channel of the YCbCr color space and \(D(p)\) is the normalized depth at \(p\). With the enhanced Y channel, we generate the simulated flash image. Figure 18 shows two examples of \(T_F\). In these examples, the wall is not visibly brightened because it is relatively far from the camera, while the chairs, bed and ground are strongly enhanced because they are close to the camera. Hence, the data synthesis simulates the flash condition as closely as possible.

Fig. 19 More results on real images. The six images in each group are arranged as the input no-flash image, output \(O_A\), input flash image, output \(O_F\), our final output \(O_f\) and ground truth \(T_f\)

Fig. 20 Continuation of Fig. 19
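The depth-dependent brightening in Eq. (18) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that the Y channel has been scaled to [0, 1], that the depth map has already been hole-filled, and that larger depth values mean farther objects. The function names `normalize_depth` and `synthesize_flash_luma` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def normalize_depth(depth):
    """Min-max normalize a (hole-filled) depth map to [0, 1]."""
    depth = np.asarray(depth, dtype=np.float64)
    d_min, d_max = depth.min(), depth.max()
    if d_max == d_min:
        # Constant depth: treat every pixel as "near" (assumption).
        return np.zeros_like(depth)
    return (depth - d_min) / (d_max - d_min)


def synthesize_flash_luma(y, depth):
    """Apply Eq. (18): Y_F(p) = Y(p) ** (0.5 + D(p) / 1.3).

    y     : luma channel, assumed to lie in [0, 1]
    depth : hole-filled depth map with the same shape as y
    """
    y = np.clip(np.asarray(y, dtype=np.float64), 0.0, 1.0)
    d = normalize_depth(depth)
    # Near pixels (D close to 0) get an exponent of about 0.5, which
    # brightens them strongly; far pixels (D close to 1) get an exponent
    # of about 1.27, so they are not brightened.
    return y ** (0.5 + d / 1.3)
```

Under these assumptions, a pixel with luma 0.25 at the nearest depth maps to 0.25^0.5 = 0.5, while the same luma at the farthest depth is not brightened. To obtain a color image, the Cb and Cr channels would be kept unchanged and recombined with the enhanced Y channel.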

Appendix B

We provide more results on real images in Figs. 19 and 20. For brevity, the ground truths of the intermediate results, i.e., \(T_A\) and \(T_F\), are omitted from these figures. The proposed method performs well across a variety of scenarios.

About this article

Cite this article

Chang, Y., Jung, C., Sun, J. et al. Siamese Dense Network for Reflection Removal with Flash and No-Flash Image Pairs. Int J Comput Vis 128, 1673–1698 (2020). https://doi.org/10.1007/s11263-019-01276-z

Keywords

  • Deep learning
  • Reflection removal
  • Image restoration
  • Flash/no-flash
  • Image fusion
  • Layer separation
  • Depth filling