
An improved exact sampling algorithm for the standard normal distribution


Abstract

In 2016, Karney proposed an exact sampling algorithm for the standard normal distribution. In this paper, we study the computational complexity of this algorithm under the random deviate model. Specifically, Karney’s algorithm requires access to an infinite sequence of independent and uniformly distributed random deviates over the range (0, 1). We give a theoretical estimate of the expected number of uniform deviates used by this algorithm until it completes, and present an improved algorithm with lower uniform deviate consumption. The experimental results also show that our improved algorithm performs better than Karney’s algorithm.


Notes

  1. ExRandom is available at http://exrandom.sf.net.

  2. Following the release of the preprint version of this paper on arXiv, Karney upgraded ExRandom to ExRandom3, which incorporates Algorithm 4 and Algorithm 5. One may test them directly using ExRandom3.

  3. In particular, the variable b is set to 0 in the source code if the base is \(2^{32}\).

  4. In the ExRandom3 library, Karney has already improved the source code of his sampling algorithm for discrete Gaussian distributions over the integers. One can test it with discrete_normal_dist.hpp in ExRandom3.

  5. Here, ‘Algorithm 3 restarts’ means that Algorithm 3 performs step 5, and goes back to step 2.

  6. In particular, saying that Algorithm 3 restarts 0 times means that Algorithm 3 is ready to perform step 2 for the first time and has not yet used a uniform deviate.

References

  • Devroye L (1986) Non-uniform random variate generation. Springer, Berlin

  • Devroye L, Gravel C (2017) The expected bit complexity of the von Neumann rejection algorithm. Stat Comput 27(3):699–710. https://doi.org/10.1007/s11222-016-9648-z

  • Dwarakanath NC, Galbraith SD (2014) Sampling from discrete Gaussians for lattice-based cryptography on a constrained device. Appl Algebra Eng Commun Comput 25(3):159

  • Flajolet P, Saheb N (1986) The complexity of generating an exponentially distributed variate. J Algorithms 7(4):463–488. https://doi.org/10.1016/0196-6774(86)90014-3

  • Gentry C, Peikert C, Vaikuntanathan V (2008) Trapdoors for hard lattices and new cryptographic constructions. In: Dwork C (ed) STOC 2008, Victoria, British Columbia, Canada, May 17–20, 2008. ACM, pp 197–206

  • Karney CFF (2016) Sampling exactly from the normal distribution. ACM Trans Math Softw 42(1):3:1–3:14

  • Knuth DE, Yao A (1976) The complexity of nonuniform random number generation. In: Algorithms and complexity: new directions and recent results. Academic Press, Cambridge

  • Micciancio D, Regev O (2007) Worst-case to average-case reductions based on Gaussian measures. SIAM J Comput 37(1):267–302

  • Micciancio D, Walter M (2017) Gaussian sampling over the integers: efficient, generic, constant-time. In: Katz J, Shacham H (eds) CRYPTO 2017, Santa Barbara, CA, USA, August 20–24, 2017, Proceedings, Part II. LNCS, vol 10402. Springer, pp 455–485

  • von Neumann J (1951) Various techniques used in connection with random digits. In: Householder AS, Forsythe GE, Germond HH (eds) Monte Carlo method. National Bureau of Standards Applied Mathematics Series, vol 12, chapter 13. US Government Printing Office, Washington, DC, pp 36–38

  • Zhao RK, Steinfeld R, Sakzad A (2020) FACCT: fast, compact, and constant-time discrete Gaussian sampler over integers. IEEE Trans Comput 69(1):126–137. https://doi.org/10.1109/TC.2019.2940949


Author information


Corresponding author

Correspondence to Yusong Du.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the Guangdong Major Project of Basic and Applied Research (2019B030302008), the National Key R&D Program of China (2017YFB0802500), the National Natural Science Foundation of China (Grant Nos. 61672550, 61972431), the Guangdong Basic and Applied Basic Research Foundation (No. 2020A1515010687) and the Fundamental Research Funds for the Central Universities (Grant No. 19lgpy217).

Appendix

1.1 The Proof of Proposition 1

Proof

Let \(p_1=1/\sqrt{e}\) and \(p_0=1-p_1\), and let \(k\ge 0\) be the integer generated in step 1. The number of Bernoulli random values from \({\mathcal {B}}_{p_1}\) used by step 1 follows a geometric distribution, so its expected value is

$$\begin{aligned} 1+p_1+p_1^2+\ldots =\sum _{k=0}^\infty p_1^k=1/p_0. \end{aligned}$$

For a given \(k\ge 0\), the expected number of Bernoulli random values used by step 2 is

$$\begin{aligned} \sum _{j=1}^{k(k-1)-1}(p_1^{j-1}p_0)j+p_1^{k(k-1)-1}k(k-1)=\sum _{j=0}^{k(k-1)-1}p_1^j=\frac{1-p_1^{k(k-1)}}{p_0} \end{aligned}$$

In particular, if \(k=0\) or \(k=1\), then k is directly accepted in step 2. Averaging over the values of k produced by step 1, the expected number of Bernoulli random values used by step 2 is

$$\begin{aligned} \sum _{k=2}^\infty (p_1^kp_0)\frac{1-p_1^{k(k-1)}}{p_0}=\sum _{k=2}^\infty (p_1^k-p_1^{k^2})=\sum _{k=0}^\infty (p_1^k-p_1^{k^2})=1/p_0-G, \end{aligned}$$

where \(G=\sum _{k=0}^\infty p_1^{k^2}\). Therefore, the expected number of Bernoulli random values from \({\mathcal {B}}_{p_1}\) used by Algorithm 1 for executing both steps 1 and 2 can be given by

$$\begin{aligned} 1/p_0+1/p_0-G=2/p_0-G\approx 3.32967. \end{aligned}$$

Furthermore, the probability that Algorithm 1 does not go back to step 1 in step 2 is \(\sum _{k=0}^\infty p_1^{k^2}p_0\approx 0.689875\). Hence, the expected number of Bernoulli random values from \({\mathcal {B}}_{p_1}\) used by Algorithm 1 to successfully generate an integer from \({\mathcal {D}}_{{\mathbb {Z}}^+,1}\) is about \(3.32967/0.689875\approx 4.82649\). \(\square \)
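These constants are easy to reproduce numerically. The following C++ sketch (an illustration added here, not part of the paper or of the ExRandom code) truncates \(G=\sum _{k=0}^\infty p_1^{k^2}\) and evaluates \(2/p_0-G\), the acceptance probability \(p_0G\), and their ratio from the proof above.

#include <cmath>
#include <cstdio>

int main() {
    // p1 = 1/sqrt(e) and p0 = 1 - p1, as in the proof of Proposition 1.
    const double p1 = 1.0 / std::sqrt(std::exp(1.0));
    const double p0 = 1.0 - p1;

    // G = sum_{k >= 0} p1^(k^2); the terms decay very fast, so a short truncation suffices.
    double G = 0.0;
    for (int k = 0; k < 50; ++k) {
        G += std::pow(p1, static_cast<double>(k) * k);
    }

    const double cost_per_attempt = 2.0 / p0 - G;  // expected Bernoulli values for steps 1 and 2, about 3.32967
    const double accept_prob      = p0 * G;        // probability of not going back to step 1, about 0.689875
    std::printf("Bernoulli values per attempt: %.5f\n", cost_per_attempt);
    std::printf("acceptance probability:       %.6f\n", accept_prob);
    std::printf("Bernoulli values per sample:  %.5f\n", cost_per_attempt / accept_prob);  // about 4.82649
    return 0;
}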

1.2 The Proof of Proposition 3

Lemma 1

For a given \(k\ge 1\), the probability that Algorithm 3 restarts n times (see Footnote 5) can be given by

$$\begin{aligned} \left( \frac{x}{m}+\frac{2k}{m}\right) ^n\left( \frac{x^n}{n!}\right) , \end{aligned}$$

where \(m=2k+2\) and n is a positive integer.

Proof

(of Lemma 1) For a given \(k\ge 1\), there are two cases in which the algorithm goes to step 2: (1) \(z<y\) and \(f=1\) with probability \(x\cdot (2k/m)\); (2) \(z<y\), \(f=0\) and \(r<x\) with probability \(x\cdot (x/m)\). Thus, the probability that the algorithm goes to step 2 is equal to

$$\begin{aligned} x\frac{x}{m}+x\cdot \frac{2k}{m}=x\left( \frac{x}{m}+\frac{2k}{m}\right) . \end{aligned}$$

After going back to step 2 one time, the probability that the algorithm goes to step 2 once again is equal to

$$\begin{aligned} \left( x\left( \frac{x}{m}+\frac{2k}{m}\right) \right) \frac{x}{2}\cdot \frac{x}{m}+\left( x\left( \frac{x}{m}+\frac{2k}{m}\right) \right) \frac{x}{2}\cdot \frac{2k}{m}=\frac{x^2}{2}\left( \frac{x}{m}+\frac{2k}{m}\right) ^2. \end{aligned}$$

In general, the probability that the algorithm restarts n times (\(n\ge 1\)) can be given by

$$\begin{aligned} \left( \frac{x}{m}+\frac{2k}{m}\right) ^n\left( \frac{x^n}{n!}\right) , \end{aligned}$$

where \({x^n}/{n!}\) is the probability that the uniform deviates \(z_1, z_2, \ldots , z_n\) over the range (0, 1) satisfy \(x>z_1>z_2>\cdots >z_n\). The proof can be completed by induction on n. \(\square \)
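As a sanity check (an illustration added here, not part of the paper), the restart process described in this proof can be simulated directly: reaching the n-th restart requires the decreasing chain \(x>z_1>\cdots >z_n\) together with n selector events, each occurring with probability \((x+2k)/m\) (either \(f=1\), or \(f=0\) and \(r<x\)). The C++ sketch below, written under that reading of the proof and with arbitrary test values of k and x, estimates the probability of reaching at least n restarts and compares it with \(\left( \frac{x}{m}+\frac{2k}{m}\right) ^n\frac{x^n}{n!}\).

#include <cmath>
#include <cstdio>
#include <random>

int main() {
    // Arbitrary test values (not taken from the paper).
    const int    k = 2;
    const double x = 0.7;
    const double m = 2.0 * k + 2.0;

    std::mt19937_64 gen(12345);
    std::uniform_real_distribution<double> U(0.0, 1.0);

    const long long trials = 5000000;
    const int nmax = 5;
    long long reached[6] = {0};  // reached[n] = number of trials with at least n restarts

    for (long long t = 0; t < trials; ++t) {
        double y = x;            // the chain must stay below x and keep decreasing
        int restarts = 0;
        while (restarts < nmax) {
            const double z = U(gen);
            if (z >= y) break;                           // chain broken: no restart
            const double f = U(gen) * m;                 // selector C(m): 2k outcomes give f = 1, one gives f = 0, one gives f = -1
            bool restart;
            if (f < 2.0 * k)            restart = true;                 // f = 1
            else if (f < 2.0 * k + 1.0) restart = (U(gen) < x);         // f = 0: restart iff r < x
            else                        restart = false;                // f = -1
            if (!restart) break;
            y = z;
            ++restarts;
        }
        for (int n = 1; n <= restarts; ++n) ++reached[n];
    }

    for (int n = 1; n <= nmax; ++n) {
        const double lemma = std::pow((x + 2.0 * k) / m, n) * std::pow(x, n) / std::tgamma(n + 1.0);
        std::printf("n = %d   simulated %.5f   Lemma 1 %.5f\n",
                    n, static_cast<double>(reached[n]) / trials, lemma);
    }
    return 0;
}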

Proof of Proposition 3

We can see that Algorithm 3 needs

$$\begin{aligned} 1+1+\left( \frac{x}{m}\right) \cdot 1=2+\left( \frac{x}{m}\right) \end{aligned}$$

uniform deviates on average each time it goes back to step 2, where the first deviate is used in step 2, the second is used by the random selector \(C(m)\), and the third is required only when \(f=0\) in Algorithm 3. By Lemma 1, the probability of restarting \(n-1\) times (\(n\ge 1\)) is

$$\begin{aligned} \left( \frac{x}{m}+\frac{2k}{m}\right) ^{n-1}\left( \frac{x^{n-1}}{(n-1)!}\right) . \end{aligned}$$

By the binomial theorem, for a given \(n\ge 1\), Algorithm 3 uses

$$\begin{aligned} \sum _{i=0}^{n-1}\left( \begin{array}{c} n-1\\ i\end{array}\right) \left( \frac{x}{m}\right) ^i\left( \frac{2k}{m}\right) ^{n-1-i}(2n-2+i) \end{aligned}$$

uniform deviates on average if it restarts \(n-1\) times (see Footnote 6). For a given \(k\ge 1\), there are three cases in which the algorithm goes to step 6 and returns the result before it goes to step 2: (1) \(z>y\) with probability \((1-x)\) at a cost of one new uniform deviate; (2) \(f=-1\) with probability \(x(1/m)\) at a cost of two new uniform deviates; (3) \(f=0\) and \(r>x\) with probability \(x(1/m)(1-x)\) at a cost of three new uniform deviates. Then, after restarting \(n-1\) times (\(n\ge 1\)), there are three cases in which the algorithm goes to step 6 and returns the result before it goes back to step 2 once again.

  1. (1)

    \(z>y\) with probability

    $$\begin{aligned} \left( \frac{x}{m}+\frac{2k}{m}\right) ^{n-1}\left( \frac{x^{n-1}}{(n-1)!}-\frac{x^n}{n!}\right) , \end{aligned}$$

    and at a cost of

    $$\begin{aligned} \sum _{n=1}^{\infty }\sum _{i=0}^{n-1}\left( \begin{array}{c}n-1\\ i\end{array}\right) \left( \frac{x}{m}\right) ^i\left( \frac{2k}{m}\right) ^{n-1-i}(2n-1+i) \left( \frac{x^{n-1}}{(n-1)!}-\frac{x^n}{n!}\right) \end{aligned}$$
    (1)

    uniform deviates, where \({x^{n-1}}/{(n-1)!}-{x^n}/{n!}\) is the probability that \(x>z_1>z_2>\cdots >z_{n-1}\) holds but \(z_n\) does not extend the decreasing sequence, i.e., the decreasing run below x has length exactly \(n-1\).

  2. (2)

    \(f=-1\) with probability

    $$\begin{aligned} \left( \frac{x}{m}+\frac{2k}{m}\right) ^{n-1}\left( \frac{x^n}{n!}\right) \left( \frac{1}{m}\right) , \end{aligned}$$

    and at a cost of

    $$\begin{aligned} \sum _{n=1}^{\infty }\sum _{i=0}^{n-1}\left( \begin{array}{c}n-1\\ i\end{array}\right) \left( \frac{x}{m}\right) ^i\left( \frac{2k}{m}\right) ^{n-1-i}(2n+i) \left( \frac{x^n}{n!}\right) \left( \frac{1}{m}\right) \end{aligned}$$
    (2)

    uniform deviates.

  3. (3)

    \(f=0\) and \(r>x\) with probability

    $$\begin{aligned} \left( \frac{x}{m}+\frac{2k}{m}\right) ^{n-1}\left( \frac{x^n}{n!}\right) \left( \frac{1}{m}\right) (1-x), \end{aligned}$$

    and at a cost of

    $$\begin{aligned} \sum _{n=1}^{\infty }\sum _{i=0}^{n-1}\left( \begin{array}{c}n-1\\ i\end{array}\right) \left( \frac{x}{m}\right) ^i\left( \frac{2k}{m}\right) ^{n-1-i}(2n+1+i) \left( \frac{x^n}{n!}\right) \left( \frac{1}{m}\right) (1-x) \end{aligned}$$
    (3)

    uniform deviates. Therefore, for a given \(k\ge 1\), by combining the above three cases, i.e., summing Eqs. (1)–(3), we can obtain the expected number of uniform deviates used by Algorithm 3, which can be reduced to a function of x:

    $$\begin{aligned} \frac{(4k+x+3)\cdot \tau _k(x)-2k-3}{2k+x}, \end{aligned}$$

    where \(\tau _k(x)=\exp (x\frac{2k+x}{2k+2})\). \(\square \)
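The closed form above can be checked numerically. The C++ sketch below (an illustration added here, not part of the paper) truncates the sums in Eqs. (1)–(3) at a moderate n and compares the result with \(\frac{(4k+x+3)\cdot \tau _k(x)-2k-3}{2k+x}\), for arbitrary test values of k and x.

#include <cmath>
#include <cstdio>

// Binomial coefficient C(n, k) as a double; n stays small here.
static double binom(int n, int k) {
    double r = 1.0;
    for (int i = 1; i <= k; ++i) r *= static_cast<double>(n - k + i) / i;
    return r;
}

int main() {
    const int    k = 3;    // arbitrary test value, k >= 1
    const double x = 0.4;  // arbitrary test value in (0, 1)
    const double m = 2.0 * k + 2.0;

    double total = 0.0;
    double fact = 1.0;     // running value of (n-1)!
    for (int n = 1; n <= 40; ++n) {
        if (n > 1) fact *= (n - 1);
        const double pn1 = std::pow(x, n - 1) / fact;  // x^{n-1}/(n-1)!
        const double pn  = pn1 * x / n;                // x^n/n!
        for (int i = 0; i <= n - 1; ++i) {
            const double w = binom(n - 1, i) * std::pow(x / m, i) * std::pow(2.0 * k / m, n - 1 - i);
            total += w * (2 * n - 1 + i) * (pn1 - pn);                  // Eq. (1): z > y
            total += w * (2 * n + i)     * pn * (1.0 / m);              // Eq. (2): f = -1
            total += w * (2 * n + 1 + i) * pn * (1.0 / m) * (1.0 - x);  // Eq. (3): f = 0 and r > x
        }
    }

    const double tau = std::exp(x * (2.0 * k + x) / (2.0 * k + 2.0));
    const double closed = ((4.0 * k + x + 3.0) * tau - 2.0 * k - 3.0) / (2.0 * k + x);
    std::printf("truncated sum of Eqs. (1)-(3): %.10f\n", total);
    std::printf("closed form:                   %.10f\n", closed);
    return 0;
}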

1.3 The Proof of Proposition 4

Proof

If \(k=0\), after restarting \(n-1\) times (\(n\ge 1\)), there are three cases in which Algorithm 3 goes to step 6 and returns the result before it goes back to step 2 again.

  1. (1)

    \(f=-1\) with probability

    $$\begin{aligned} \left( \frac{1}{2}\right) ^n\left( \frac{x^{n-1}}{(n-1)!}\right) x^{n-1}, \end{aligned}$$

    and at a cost of

    $$\begin{aligned} \sum _{n=1}^{\infty }\left( \frac{1}{2}\right) ^n\left( \frac{x^{n-1}}{(n-1)!}\right) x^{n-1}(2n-2) \end{aligned}$$
    (4)

    uniform deviates.

  2. (2)

    \(z>y\) with probability

    $$\begin{aligned} \left( \frac{1}{2}\right) ^nx^{n-1}\left( \frac{x^{n-1}}{(n-1)!}-\frac{x^n}{n!}\right) , \end{aligned}$$

    and at a cost of

    $$\begin{aligned} \sum _{n=1}^{\infty }\left( \frac{1}{2}\right) ^nx^{n-1}\left( \frac{x^{n-1}}{(n-1)!}-\frac{x^n}{n!}\right) (2n-1) \end{aligned}$$
    (5)

    uniform deviates.

  3. (3)

    \(f=0\) and \(r>x\) with probability

    $$\begin{aligned} \left( \frac{1}{2}\right) ^nx^{n-1}\left( \frac{x^n}{n!}\right) (1-x), \end{aligned}$$

    and at a cost of

    $$\begin{aligned} \sum _{n=1}^{\infty }\left( \frac{1}{2}\right) ^nx^{n-1}\left( \frac{x^n}{n!}\right) (1-x)(2n) \end{aligned}$$
    (6)

    uniform deviates. Therefore, when \(k=0\), by summing Eqs. (4)–(6), we have the expected number of uniform deviates used by Algorithm 3, which can be reduced to a function of x:

    $$\begin{aligned} \frac{(x+2)\cdot \tau _0(x)-2}{2x}, \end{aligned}$$

    where \(\tau _0(x)=\exp (\frac{x^2}{2})\). \(\square \)
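Similarly (again as an added illustration, not part of the paper), the k = 0 case can be checked by truncating Eqs. (4)–(6) and comparing with \(\frac{(x+2)\cdot \tau _0(x)-2}{2x}\), for an arbitrary test value of x.

#include <cmath>
#include <cstdio>

int main() {
    const double x = 0.6;  // arbitrary test value in (0, 1)

    double total = 0.0;
    double fact = 1.0;     // running value of (n-1)!
    for (int n = 1; n <= 40; ++n) {
        if (n > 1) fact *= (n - 1);
        const double half = std::pow(0.5, n);
        const double xn1  = std::pow(x, n - 1);
        const double pn1  = xn1 / fact;     // x^{n-1}/(n-1)!
        const double pn   = pn1 * x / n;    // x^n/n!
        total += half * pn1 * xn1 * (2 * n - 2);         // Eq. (4): f = -1
        total += half * xn1 * (pn1 - pn) * (2 * n - 1);  // Eq. (5): z > y
        total += half * xn1 * pn * (1.0 - x) * (2 * n);  // Eq. (6): f = 0 and r > x
    }

    const double tau0 = std::exp(x * x / 2.0);
    const double closed = ((x + 2.0) * tau0 - 2.0) / (2.0 * x);
    std::printf("truncated sum of Eqs. (4)-(6): %.10f\n", total);
    std::printf("closed form:                   %.10f\n", closed);
    return 0;
}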

About this article


Cite this article

Du, Y., Fan, B. & Wei, B. An improved exact sampling algorithm for the standard normal distribution. Comput Stat 37, 721–737 (2022). https://doi.org/10.1007/s00180-021-01136-w

