
Efficient Burst Raw Denoising with Variance Stabilization and Multi-frequency Denoising Network

Abstract

With the growing popularity of smartphones, capturing high-quality images on these devices is of vital importance. Smartphone cameras have small apertures and small sensor cells, which lead to noisy images in low-light environments. Denoising based on a burst of multiple frames generally outperforms single-frame denoising, but at a larger computational cost. In this paper, we propose an efficient yet effective burst denoising system. We adopt a three-stage design: noise prior integration, multi-frame alignment, and multi-frame denoising. First, we integrate the noise prior by pre-processing raw signals into a variance-stabilization space, which allows a small-scale network to achieve competitive performance. Second, we observe that explicit alignment is essential for burst denoising, but a learning-based alignment method is not necessary. Instead, we resort to a conventional and efficient alignment method and combine it with our multi-frame denoising network. Finally, we propose a denoising strategy that processes multiple frames sequentially. Sequential denoising avoids filtering a large number of frames at once by decomposing multi-frame denoising into several efficient sub-network denoising steps. For each sub-network, we propose an efficient multi-frequency denoising network that removes noise at different frequencies. Our three-stage design is efficient and shows strong performance on burst denoising. Experiments on synthetic and real raw datasets demonstrate that our method outperforms state-of-the-art methods at a lower computational cost. Furthermore, the low complexity and high-quality performance make deployment on smartphones possible.

Availability of Data and Materials

All datasets mentioned in this manuscript are open datasets.

Code Availability

The code for variance stabilization and the network is available from the authors upon reasonable request.

References

  • ARM Neon Intrinsics. (2014). https://developer.arm.com/architectures/instruction-sets/intrinsics/.

  • Anscombe, F. J. (1948). The transformation of Poisson, binomial and negative-binomial data. Biometrika, 35(3/4), 246–254.

  • Baker, S., Scharstein, D., Lewis, J. P., Roth, S., Black, M. J., & Szeliski, R. (2011). A database and evaluation methodology for optical flow. International Journal of Computer Vision, 92(1), 1–31.


  • Bartlett, M. S. (1936). The square root transformation in analysis of variance. Supplement to the Journal of the Royal Statistical Society, 3(1), 68–78.


  • Bartlett, M. S. (1947). The use of transformations. Biometrics, 3(1), 39–52.


  • Brooks, T., Mildenhall, B., Xue, T., Chen, J., Sharlet, D., & Barron, J. T. (2019). Unprocessing images for learned raw denoising. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Buades, A., Coll, B., & Morel, J.-M. (2005). A non-local algorithm for image denoising. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 60–65.

  • Calonder, M., Lepetit, V., Strecha, C., & Fua, P. (2010). BRIEF: Binary robust independent elementary features. In: European Conference on Computer Vision (ECCV). Springer.

  • Curtiss, J. H. (1943). On transformations used in the analysis of variance. Annals of Mathematical Statistics, 14(2), 107–122.

  • Dabov, K., Foi, A., & Egiazarian, K. (2007). Video denoising by sparse 3D transform-domain collaborative filtering. In 2007 15th European Signal Processing Conference (pp. 145-149). IEEE.

  • Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., & Wei, Y. (2017). Deformable convolutional networks. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 764–773.

  • Doob, J. L. (1935). The limiting distributions of certain statistics. Annals of Mathematical Statistics, 6(3), 160–169.

  • Digital Negative (DNG). https://helpx.adobe.com/camera-raw/digital-negative.html#dng.

  • Foi, A., Trimeche, M., Katkovnik, V., & Egiazarian, K. (2008). Practical Poissonian-Gaussian noise modeling and fitting for single-image raw-data. IEEE Transactions on Image Processing, 17(10), 1737–1754.

  • Freeman, M. F., & Tukey, J. W. (1950). Transformations related to the angular and the square root. Annals of Mathematical Statistics, 21(4), 607–611.

  • Gu, S., Li, Y., Van Gool, L., & Timofte, R. (2019). Self-guided network for fast image denoising. In: IEEE International Conference on Computer Vision (ICCV).

  • Hasinoff, S. W., Sharlet, D., Geiss, R., Adams, A., Barron, J., Kainz, F., et al. (2016). Burst photography for high dynamic range and low-light imaging on mobile cameras. ACM Transactions on Graphics, 35(6), 1–12.


  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Healey, G. E., & Kondepudy, R. (1994). Radiometric CCD camera calibration and noise estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(3), 267–276.

  • Horn, B. K., & Schunck, B. G. (1981). Determining optical flow. Artificial Intelligence, 17(1–3), 185–203.


  • Krasin, I., Duerig, T., Alldrin, N., Ferrari, V., Abu-El-Haija, S., Kuznetsova, A., Rom, H., Uijlings, J., Popov, S., Veit, A., Belongie, S., Gomes, V., Gupta, A., Sun, C., Chechik, G., Cai, D., Feng, Z., Narayanan, D., & Murphy, K. (2017). OpenImages: A public dataset for large-scale multi-label and multi-class image classification. Dataset available from https://github.com/openimages.

  • Liang, Z., Guo, S., Gu, H., Zhang, H., & Zhang, L. (2020) A decoupled learning scheme for real-world burst denoising from raw images. In: European Conference on Computer Vision (ECCV).

  • Liu, X., Tanaka, M., & Okutomi, M. (2014). Practical signal-dependent noise parameter estimation from a single noisy image. IEEE Transactions on Image Processing, 23(10), 4361–4371.

  • Liu, Z., Yuan, L., Tang, X., Uyttendaele, M., & Sun, J. (2014). Fast burst images denoising. ACM Transactions on Graphics, 33(6), Article 232.

  • Lucas, B. D., & Kanade, T. (1981). An iterative image registration technique with an application to stereo vision. In: International Joint Conference on Artificial Intelligence (IJCAI).

  • Maggioni, M., Huang, Y., Li, C., Xiao, S., Fu, Z., & Song, F. (2021). Efficient multi-stage video denoising with recurrent spatio-temporal fusion. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

  • Maggioni, M., Boracchi, G., Foi, A., & Egiazarian, K. (2011). Video denoising using separable 4D nonlocal spatiotemporal transforms. In: Image Processing: Algorithms and Systems, vol. 7870 of SPIE Proceedings, p. 787003. SPIE.

  • Maggioni, M., Boracchi, G., Foi, A., & Egiazarian, K. O. (2012). Video denoising, deblocking, and enhancement through separable 4-D nonlocal spatiotemporal transforms. IEEE Transactions on Image Processing, 21(9), 3952–3966.

  • Mairal, J., Bach, F., Ponce, J., Sapiro, G., & Zisserman, A. (2009). Non-local sparse models for image restoration. In: IEEE 12th International Conference on Computer Vision ICCV, pp. 2272–2279. IEEE Computer Society

  • Makitalo, M., & Foi, A. (2013). Optimal inversion of the generalized Anscombe transformation for Poisson-Gaussian noise. IEEE Transactions on Image Processing, 22(1), 91–103.

  • Marinc, T., Srinivasan, V., Gül, S., Hellge, C., & Samek, W. (2019). Multi-kernel prediction networks for denoising of burst images. CoRR, abs/1902.05392.

  • Menze, M., & Geiger, A. (2015). Object scene flow for autonomous vehicles. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3061–3070.

  • Mildenhall, B., Barron, J. T., Chen, J., Sharlet, D., Ng, R., & Carroll, R. (2018). Burst denoising with kernel prediction networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Mäkitalo, M., & Foi, A. (2012). Poisson-Gaussian denoising using the exact unbiased inverse of the generalized Anscombe transformation. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

  • Nah, S., Kim, T. H., & Lee, K. M. (2017). Deep multi-scale convolutional neural network for dynamic scene deblurring. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., & Chintala, S. (2019). PyTorch: An imperative style, high-performance deep learning library. In: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 32, pp. 8024–8035. Curran Associates, Inc.

  • Liba, O., Murthy, K., Tsai, Y.-T., Brooks, T., Xue, T., Karnad, N., He, Q., Barron, J. T., Sharlet, D., Geiss, R., Hasinoff, S. W., Pritch, Y., & Levoy, M. (2019). Handheld mobile photography in very low light. In: SIGGRAPH Asia.

  • PPLNN: A primitive library for neural network. (2021). https://github.com/openppl-public/ppl.nn.

  • Portilla, J., Strela, V., Wainwright, M. J., & Simoncelli, E. P. (2003). Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Transactions on Image Processing, 12(11), 1338–1351.

  • Rosten, E., & Drummond, T. (2005). Fusing points and lines for high performance tracking. In: IEEE International Conference on Computer Vision, 2, 1508–1511.

  • Rosten, E., & Drummond, T. (2006). Machine learning for high-speed corner detection. In: European Conference on Computer Vision (ECCV), 1, 430–443.

  • Rudin, L. I., Osher, S., & Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1–4), 259–268.


  • Snapdragon 888 5G mobile platform. (2020). https://www.qualcomm.com/products/snapdragon-888-5g-mobile-platform.

  • Starck, J. L., Murtagh, F., & Bijaoui, A. (1998). Image processing and data analysis. Cambridge University Press.

  • Tassano, M., Delon, J., & Veit, T. (2020). FastDVDnet: Towards real-time deep video denoising without flow estimation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

  • Vogels, T., Rousselle, F., Mcwilliams, B., Röthlin, G., Harvill, A., Adler, D., et al. (2018). Denoising with kernel prediction and asymmetric loss functions. ACM Transactions on Graphics, 37(4), 1–5.


  • Wang, X., Chan, K. C., Yu, K., Dong, C., & Loy, C. C. (2019). EDVR: Video restoration with enhanced deformable convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.

  • Wang, Y., Huang, H., Xu, Q., Liu, J., Liu, Y., & Wang, J. (2020). Practical deep raw image denoising on mobile devices. In: European Conference on Computer Vision (ECCV), pp. 1–16.

  • Wei, K., Fu, Y., Yang, J., & Huang, H. (2020). A physics-based noise formation model for extreme low-light raw denoising. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013). Deepflow: Large displacement optical flow with deep matching. In: 2013 IEEE International Conference on Computer Vision, pp. 1385–1392.

  • Xia, Z., Perazzi, F., Gharbi, M., Sunkavalli, K., & Chakrabarti, A. (2020). Basis prediction networks for effective burst denoising with large kernels. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

  • Xue, T., Chen, B., Wu, J., Wei, D., & Freeman, W. T. (2019). Video enhancement with task-oriented flow. International Journal of Computer Vision (IJCV), 127(8), 1106–1125.


  • Yue, H., Cao, C., Liao, L., Chu, R., & Yang, J. (2020). Supervised raw video denoising with a benchmark dataset on dynamic scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Zhang, K., Zuo, W., Chen, Y., Meng, D., & Zhang, L. (2017). Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7), 3142–3155.

  • Zhang, K., Zuo, W., & Zhang, L. (2018). FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Transactions on Image Processing, 27(9), 4608–4622.

  • Zhang, Y., Qin, H., Wang, X., & Li, H. (2021). Rethinking noise synthesis and modeling in raw denoising. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 4593–4601.

  • Zhu, X., Hu, H., Lin, S., & Dai, J. (2019). Deformable ConvNets v2: More deformable, better results. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

Acknowledgements

This work is supported in part by the Centre for Perceptual and Interactive Intelligence Limited, in part by the General Research Fund through the Research Grants Council of Hong Kong under Grants Nos. 14204021, 14207319, 14203118, and 14208619, in part by the Research Impact Fund under Grant No. R5001-18, and in part by the CUHK Strategic Fund.

Author information

Corresponding authors

Correspondence to Hongwei Qin or Hongsheng Li.

Ethics declarations

Conflict of interest

This research is sponsored by The Chinese University of Hong Kong and may lead to the development of products which may be licensed to Sensetime, in which I have a business interest. I have disclosed those interests fully to Taylor & Francis, and have in place an approved plan for managing any potential conflicts arising from this arrangement.

Ethics Approval

Not applicable.

Consent to Participate

Not applicable.

Consent for Publication

All authors consent to the publication of this work. No extra dataset is proposed in this study.

Additional information

Communicated by Michael S. Brown.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Noise Modeling of CMOS Signals

We provide the detailed noise modeling of CMOS signals to derive the relation between the sensor gain and \(\sigma _r, \sigma _s\). We denote the observed intensity by x and the underlying true intensity by \(x^{*}\). Following Wang et al. (2020), the raw signal is modeled as

$$\begin{aligned} x \sim q_e \alpha \mathcal {P}\left( \frac{x^{*}}{q_e \alpha }\right) + \mathcal {N}(0, \alpha ^2 \sigma _0^2 + \sigma _{adc}^2), \end{aligned}$$
(A.1)

where \(q_e\) is the quantum efficiency factor, \(\alpha \) is the sensor gain, \(\sigma _0\) is the standard deviation of the read noise caused by sensor readout effects, and \(\sigma _{adc}\) is the standard deviation of the amplifier noise. Then we have:

$$\begin{aligned} \begin{aligned} \sigma _s&= q_e \alpha \\ \sigma _r^2&= \alpha ^2\sigma _{0}^2 + \sigma _{adc}^2. \end{aligned} \end{aligned}$$
(A.2)

For a fixed sensor, \(q_e\), \(\sigma _{0}\), and \(\sigma _{adc}\) are unchanged. Hence the sensor gain \(\alpha \) is the only factor affecting \(\sigma _s\) and \(\sigma _r\).
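
As a sanity check, the following Python sketch (ours) simulates the signal model of Eq. (A.1) with NumPy and verifies that the empirical variance matches \(\sigma _s x^{*} + \sigma _r^2\) from Eq. (A.2) as the gain varies. The sensor constants \(q_e\), \(\sigma _0\), \(\sigma _{adc}\) are illustrative values, not calibrated parameters of any real sensor.

```python
# Monte-Carlo sketch of the raw signal model in Eq. (A.1).
import numpy as np

rng = np.random.default_rng(0)
q_e, sigma_0, sigma_adc = 0.6, 1.5, 0.8  # hypothetical sensor constants

def sample_raw(x_true, alpha, n=200_000):
    """x ~ q_e*alpha*Poisson(x*/(q_e*alpha)) + N(0, alpha^2*sigma_0^2 + sigma_adc^2)."""
    shot = q_e * alpha * rng.poisson(x_true / (q_e * alpha), size=n)
    read = rng.normal(0.0, np.sqrt(alpha**2 * sigma_0**2 + sigma_adc**2), size=n)
    return shot + read

# Eq. (A.2): the noise parameters depend only on the gain alpha.
for alpha in (1.0, 2.0, 4.0):
    x = sample_raw(x_true=100.0, alpha=alpha)
    sigma_s = q_e * alpha                            # slope of the signal-dependent variance
    sigma_r2 = alpha**2 * sigma_0**2 + sigma_adc**2  # signal-independent variance
    print(f"alpha={alpha}: empirical var={x.var():.1f}, "
          f"predicted var={sigma_s * 100.0 + sigma_r2:.1f}")
```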

Fig. 7 Visualization of different variance stabilization transformations. x is the mean of the signal in Poisson distribution. Var(y) is the variance of the signal transformed by the different transform functions

Table 9 Ablation study of different inverses and different loss functions on the CRVD dataset (burst number \(N=5\))

Appendix B Generalized Version of the Freeman-Tukey Transformation

For a variable x following a Poisson distribution with mean value \(x^{*}\), the general root-type form of the variance stabilization transformation is

$$\begin{aligned} y = 2 \sqrt{x + c}. \end{aligned}$$
(B.1)

The core problem of variance stabilization is to stabilize the Poisson distribution to have unit variance, but no exact stabilization is possible (Curtiss, 1943). In practice, approximate transformations are generally used. The mainstream transformations include \(2\sqrt{x}\), \(2\sqrt{x+1}\), \(2\sqrt{x+\frac{1}{2}}\) (Bartlett, 1936), \(2\sqrt{x+\frac{3}{8}}\) (Anscombe, 1948) and \(\sqrt{x}+\sqrt{x+1}\) (Freeman & Tukey, 1950). \(\sqrt{x} + \sqrt{x+1}\) can be taken as the linear combination of two general forms with \(c=0\) and \(c=1\). We visualize the variance of the transformed y in Fig. 7. When the value x is large enough, the variances of \(2\sqrt{x+\frac{1}{2}}\) (Bartlett, 1936), \(2\sqrt{x+\frac{3}{8}}\) (Anscombe, 1948) and \(\sqrt{x}+\sqrt{x+1}\) (Freeman & Tukey, 1950) all approach unity. However, \(\sqrt{x}+\sqrt{x+1}\) (Freeman & Tukey, 1950) shows a better approximation than the other transformations when the mean value \(x^{*}\) is close to zero. The SNR (signal-to-noise ratio) in dark areas is usually lower than that of other areas. Therefore, we seek a generalized version of the Freeman-Tukey transformation (Freeman & Tukey, 1950) to handle the Poisson-Gaussian distribution for raw denoising.
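
The comparison in Fig. 7 can also be reproduced numerically. The following sketch (ours, using NumPy Monte-Carlo sampling) prints Var(y) for each of the classical transformations at several Poisson means; no parameter here comes from the paper.

```python
# Numerical check of Fig. 7: variance of the transformed signal under
# classical root-type variance stabilization transforms.
import numpy as np

rng = np.random.default_rng(0)
transforms = {
    "2*sqrt(x)":          lambda x: 2 * np.sqrt(x),
    "2*sqrt(x+1)":        lambda x: 2 * np.sqrt(x + 1),
    "2*sqrt(x+1/2)":      lambda x: 2 * np.sqrt(x + 0.5),        # Bartlett (1936)
    "2*sqrt(x+3/8)":      lambda x: 2 * np.sqrt(x + 3 / 8),      # Anscombe (1948)
    "sqrt(x)+sqrt(x+1)":  lambda x: np.sqrt(x) + np.sqrt(x + 1), # Freeman-Tukey (1950)
}

for mean in (0.5, 2.0, 10.0):
    x = rng.poisson(mean, size=1_000_000)
    report = ", ".join(f"{name}: {f(x).var():.3f}" for name, f in transforms.items())
    print(f"Poisson mean={mean}: Var(y) -> {report}")
```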

First, we start from the transform of the Poisson distribution. We define the variable x to be a Poisson variable of mean m, whose variance is \(\text {Var}(x) = m\). We define y to be the transformed x. Then we have \(\text {Var}(y) \approx (\frac{dy}{dx})^2\text {Var}(x)\) based on Doob (1935) and Bartlett (1947). The core problem of variance stabilization is to stabilize the Poisson distribution into unit variance. Hence we let \(\text {Var}(y) = 1\) and obtain:

$$\begin{aligned} \frac{dy}{dx} = \sqrt{\frac{\text {Var}(y)}{\text {Var}(x)}} = \frac{1}{\sqrt{m}}. \end{aligned}$$
(B.2)

For the general transform \(y = 2\sqrt{x+c}\), we have

$$\begin{aligned} \frac{dy}{dx} = \frac{1}{\sqrt{x + c}}. \end{aligned}$$
(B.3)

From Eqs. (B.2) and (B.3), we obtain the approximation:

$$\begin{aligned} m = x + c. \end{aligned}$$
(B.4)

Second, we consider the transform of the Poisson-Gaussian distribution. Similar to Eq. (5), we define the variable z as \(z = x + \gamma \), where x is a Poisson variable of mean m and \(\gamma \) is a Gaussian variable of mean g and standard deviation \(\sigma \). The variance of the transformed z is given by \(\text {Var}(y) \approx (\frac{dy}{dz})^2\text {Var}(z)\) based on Doob (1935) and Bartlett (1947). Similarly, we let \(\text {Var}(y) = 1\) and obtain:

$$\begin{aligned} \frac{dy}{dz} = \sqrt{\frac{\text {Var}(y)}{\text {Var}(z)}} = \frac{1}{\sqrt{m + \sigma ^2}}. \end{aligned}$$
(B.5)

We take the first-order approximation in Starck et al. (1998) and approximate the Gaussian variable by its mean, \(\gamma \approx g\). From Eq. (B.4), we then have \(m = z + c - g\). Thus:

$$\begin{aligned} \frac{dy}{dz} = \frac{1}{\sqrt{z + c + \sigma ^2 - g}}. \end{aligned}$$
(B.6)

By integrating Eq. (B.6), we obtain the transformation y(z) for the Poisson-Gaussian distribution:

$$\begin{aligned} y(z) = 2\sqrt{z + c + \sigma ^2 - g}. \end{aligned}$$
(B.7)

Finally, we move to the generalized version of the Freeman-Tukey transformation (Freeman & Tukey, 1950): \(y = \sqrt{x} + \sqrt{x+1}\). From Eq. (B.7), we generalize \(2\sqrt{x}\) and \(2\sqrt{x+1}\), respectively. By taking the linear combination of the two generalized transformations (\(c=0\) and \(c=1\)), we obtain the generalized version of the Freeman-Tukey transformation:

$$\begin{aligned} y(z) = \sqrt{z + 1 + \sigma ^2 - g} + \sqrt{z + \sigma ^2 - g}. \end{aligned}$$
(B.8)
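
The following sketch (ours) applies Eq. (B.8) to synthetic Poisson-Gaussian samples and checks that the variance of the transformed signal approaches unity for sufficiently large means. Here \(\sigma \) and g are assumed known (in practice they come from noise calibration), and the clipping of the radicand is our addition for numerical safety on noisy inputs.

```python
# Minimal sketch of the generalized Freeman-Tukey transformation, Eq. (B.8).
import numpy as np

rng = np.random.default_rng(0)

def generalized_freeman_tukey(z, sigma, g):
    """Eq. (B.8): y(z) = sqrt(z + 1 + sigma^2 - g) + sqrt(z + sigma^2 - g)."""
    t = np.maximum(z + sigma**2 - g, 0.0)  # clip to keep the root real
    return np.sqrt(t + 1.0) + np.sqrt(t)

sigma, g = 2.0, 1.0
for mean in (2.0, 10.0, 50.0):
    z = rng.poisson(mean, size=1_000_000) + rng.normal(g, sigma, size=1_000_000)
    y = generalized_freeman_tukey(z, sigma, g)
    print(f"Poisson mean={mean}: Var(y)={y.var():.3f} (target: 1)")
```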

Appendix C Algebraic Inverse of the Transform

It is known that the algebraic inverse is usually avoided in previous methods due to its bias (Starck et al., 1998). However, the bias is already handled when we calculate the loss in the variance-stabilization space. Moreover, the algebraic inverse can be used for both the Anscombe transformation (Anscombe, 1948; Starck et al., 1998) and the Freeman-Tukey transformation (Freeman & Tukey, 1950) in our framework.

Let x and \(x^{*}\) denote the noisy signal and the clean signal, respectively. The transform (Anscombe or Freeman-Tukey) is denoted as f and its algebraic inverse is denoted as \(f^{-1}\). The bias is produced by the nonlinearity of the transformation f. Since we calculate the loss in the variance-stabilization space, the denoising network learns the mapping from f(x) to \(f(x^{*})\) directly. Therefore, the bias is already handled when the denoising output approximates \(f(x^{*})\).
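
For the Freeman-Tukey case, the algebraic inverse is available in closed form: since \((\sqrt{t+1}+\sqrt{t})(\sqrt{t+1}-\sqrt{t})=1\) with \(t = z + \sigma ^2 - g\), we have \(1/y = \sqrt{t+1}-\sqrt{t}\) and hence \(\sqrt{t} = (y - 1/y)/2\). The following round-trip sketch (ours; the function names are illustrative, not from our released code) checks this numerically.

```python
# Exact algebraic inverse of the generalized Freeman-Tukey transform.
import numpy as np

def ft_forward(z, sigma=0.0, g=0.0):
    """Eq. (B.8): y = sqrt(t+1) + sqrt(t), with t = z + sigma^2 - g."""
    t = z + sigma**2 - g
    return np.sqrt(t + 1.0) + np.sqrt(t)

def ft_algebraic_inverse(y, sigma=0.0, g=0.0):
    """Invert via sqrt(t) = (y - 1/y) / 2, then undo the shift."""
    t = ((y - 1.0 / y) / 2.0) ** 2
    return t - sigma**2 + g

z = np.linspace(0.5, 100.0, 5)
y = ft_forward(z, sigma=2.0, g=1.0)
print(np.allclose(ft_algebraic_inverse(y, sigma=2.0, g=1.0), z))  # True
```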

We further conduct experiments on the CRVD dataset (burst number \(N=5\)) to compare the algebraic inverse and the exact unbiased inverse under different training settings. The results are shown in Table 9. We first train with the generalized Anscombe transformation (GAT) (Starck et al., 1998) and calculate the loss function before the inverse. Then we test the model with the algebraic inverse (denoted as “GAT-4”) and the exact unbiased inverse (denoted as “GAT-3”). The algebraic inverse outperforms the exact unbiased inverse (Makitalo & Foi, 2013) by 0.13 dB PSNR, which demonstrates that the bias is handled by calculating the loss before the inverse. Then we train GAT with the algebraic inverse (denoted as “GAT-2”) and the optimal inverse (denoted as “GAT-1”) and calculate the loss function after the inverse. In Table 9, both inverses show the same performance (44.60 dB PSNR) but are 0.03 dB PSNR lower than calculating the loss before the inverse. This might be because the bias produced in the variance-stabilization space becomes more complicated after the non-linear inverse transformation; handling the bias before the inverse is more direct. The same phenomenon can also be observed for the Freeman-Tukey transformation (“Ours-1” vs. “Ours”).

Table 10 Ablation study of different input orders of alternate frames on CRVD dataset (burst number \(N=5\))

Appendix D More Ablation of Denoising Network

Input order of alternate frames We conduct experiments on the CRVD dataset (Yue et al., 2020) (burst number \(N=5\)) to compare three input orders: (a) preserving the temporal order of an input burst (denoted as “Keep”), (b) shuffling the burst order randomly (denoted as “Shuffle”), and (c) reversing the burst order (denoted as “Reverse”). In training and testing, the 4 alternate frames are re-arranged following the same ordering strategy. As shown in Table 10, training while preserving the temporal order achieves the best performance of 44.70 dB PSNR, which slightly outperforms random shuffling by 0.03 dB PSNR. Furthermore, reversing the temporal order achieves the worst performance of 44.63 dB PSNR, a drop of 0.07 dB PSNR. It can be observed that preserving the temporal order is helpful in sequential denoising.

Specializing the network weights In our denoising network S, we have a series of sub-networks for sequential denoising, as sketched below. For burst denoising on the CRVD dataset (Yue et al., 2020) (burst number \(N=5\)), \(S_0\) performs spatial denoising of the reference frame and \(S_1,S_2,S_3,S_4\) perform sequential denoising of the 4 alternate frames. We conduct experiments on the CRVD dataset (burst number \(N=5\)) to compare \(S_i\) with different weights (denoted as “specializing”) and \(S_i\) with shared weights (denoted as “sharing”). As shown in Table 11, using shared weights for \(S_1,S_2,S_3,S_4\) achieves only 44.44 dB PSNR, a drop of 0.26 dB PSNR compared with specializing each \(S_i\) (44.70 dB PSNR).
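
The following PyTorch sketch (ours) illustrates the specialization: \(S_0\) denoises the reference frame and each \(S_i\) fuses the running estimate with one aligned alternate frame. The toy SubNet here is a stand-in for the multi-frequency sub-network described in the paper; swapping the ModuleList for a single shared SubNet gives the “sharing” baseline.

```python
# Toy sketch of sequential denoising with specialized sub-networks.
import torch
import torch.nn as nn

class SubNet(nn.Module):
    """Stand-in fusion sub-network: (estimate, aligned frame) -> new estimate."""
    def __init__(self, ch=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1))

    def forward(self, est, frame):
        return self.body(torch.cat([est, frame], dim=1))

class SequentialDenoiser(nn.Module):
    def __init__(self, burst=5):
        super().__init__()
        # S_0: spatial denoising of the reference frame.
        self.s0 = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, padding=1))
        # "Specializing": one set of weights per alternate frame.
        self.subnets = nn.ModuleList(SubNet() for _ in range(burst - 1))

    def forward(self, frames):             # frames: (B, N, 1, H, W), frames[:, 0] = reference
        est = self.s0(frames[:, 0])
        for i, s in enumerate(self.subnets, start=1):
            est = s(est, frames[:, i])     # fuse one aligned alternate frame at a time
        return est

out = SequentialDenoiser()(torch.randn(2, 5, 1, 64, 64))
print(out.shape)  # torch.Size([2, 1, 64, 64])
```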

Table 11 Ablation study of using specialized or shared-weights networks on CRVD dataset (burst number \(N=5\))
Table 12 Ablation study of using different numbers of frequencies in the denoising network on CRVD dataset (burst number \(N=5\))

Different scales in the denoising backbone We conduct experiments on the CRVD dataset (burst number \(N=5\)) to explore using different numbers of scales (frequencies), denoted s. When \(s=4\), we use four frequencies (\(m_0,m_1,m_2,m_3\)) to achieve multi-frequency denoising; a toy decomposition is sketched below. As observed in Table 12, using two frequencies achieves 44.49 dB PSNR, a drop of 0.21 dB compared with using three frequencies (44.70 dB PSNR). When we use four scales (frequencies), the denoising performance is 44.71 dB PSNR, outperforming three frequencies by only 0.01 dB PSNR while the model size increases from 1.57M to 2.10M parameters.
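
To make “s frequencies” concrete, here is a toy Laplacian-pyramid-style decomposition (ours; the paper's actual multi-frequency design may differ): each band \(m_i\) carries the detail between adjacent scales and the last band is the low-frequency residual, so a denoiser can act per frequency band.

```python
# Toy s-level multi-frequency decomposition via repeated average pooling.
import torch
import torch.nn.functional as F

def multi_frequency_bands(x, s=3):
    """Split x into s bands (m_0 .. m_{s-1}): s-1 detail bands + 1 low-frequency residual."""
    bands, cur = [], x
    for _ in range(s - 1):
        low = F.avg_pool2d(cur, 2)
        up = F.interpolate(low, scale_factor=2, mode="bilinear", align_corners=False)
        bands.append(cur - up)  # high-frequency detail at this scale
        cur = low
    bands.append(cur)           # coarsest (lowest-frequency) band
    return bands

x = torch.randn(1, 1, 64, 64)
for i, m in enumerate(multi_frequency_bands(x, s=4)):
    print(f"m_{i}: {tuple(m.shape)}")
```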

About this article

Cite this article

Li, D., Zhang, Y., Law, K.L. et al. Efficient Burst Raw Denoising with Variance Stabilization and Multi-frequency Denoising Network. Int J Comput Vis 130, 2060–2080 (2022). https://doi.org/10.1007/s11263-022-01627-3
