Discrete uniform and binomial distributions with infinite support

We study properties of two probability distributions defined on the infinite set {0, 1, 2, …} and generalizing the ordinary discrete uniform and binomial distributions. Both extensions use the grossone-model of infinity. The first of the two distributions we study is uniform and assigns masses 1/① to all points in the set {0, 1, …, ①−1}, where ① denotes the grossone.
For this distribution, we study the problem of decomposing a random variable ξ with this distribution as a sum $\xi \stackrel{d}{=} \xi_1 + \cdots + \xi_m$, where $\xi_1, \ldots, \xi_m$ are independent non-degenerate random variables. Then, we develop an approximation for the probability mass function of the binomial distribution Bin(①, p) with $p = c/\textcircled{1}^{\alpha}$ and 1/2 < α ≤ 1.


Introduction
In this paper, we are interested in properties of two probability distributions defined on the infinite set {0, 1, 2, …} and generalizing the ordinary discrete uniform and binomial distributions. Both of these extensions have been recently discussed in Calude and Dumitrescu (2020) and mentioned in Zhigljavsky (2012); both extensions use the notion of grossone. The grossone, introduced in Sergeyev (2013) and denoted by ①, is a model of infinity which, as shown in Sergeyev (2009), Sergeyev (2017) and many other publications, can be very useful in solving diverse problems of computational mathematics and optimization; in such applications, ① is used as numerical infinity. Grossone can also be useful as a theoretical model of infinity, see, e.g., (Zhigljavsky 2012; Sergeyev 2017). Some historical, philosophical and logical aspects of grossone have been considered in Lolli (2012), Lolli (2015), Hansson (2020). In Sect. 1, we consider and briefly discuss postulates of ①.
In Sect. 2.2, we consider the problem of decomposing a random variable ξ ∼ DU(①) into sums $\xi \stackrel{d}{=} \xi_1 + \cdots + \xi_m$, where $\xi_1, \ldots, \xi_m$ are independent non-degenerate random variables and the equality $\stackrel{d}{=}$ means that the distributions of the random variables on the left-hand and right-hand sides of the equation are equal. In particular, we shall establish that DU(①) is not an infinitely divisible distribution, although infinite divisibility might have been expected in view of the results of Warde and Katti (1971).
The probability mass function (pmf) for Bin(N, p), the binomial distribution with parameters N and p, is

$$b_x = \binom{N}{x} p^x (1-p)^{N-x}, \quad x = 0, 1, \ldots, N, \tag{3}$$

where N is usually interpreted as the number of Bernoulli trials and p as the probability of success in these trials. We are interested in approximating the binomial probabilities (3) in the case when N is (very) large but p is rather small, like $p = c/N^{\alpha}$ with finite c > 0 and 1/2 < α ≤ 1. This case is important for understanding the distribution Bin(①, p), the grossone extension of Bin(N, p).

According to the central limit theorem, for any p and large N, the distribution of $[\mathrm{Bin}(N, p) - Np]/\sqrt{Np(1-p)}$ is approximately the standard normal distribution N(0, 1) and, therefore, the binomial distribution Bin(N, p) can be approximated by the normal distribution N(Np, Np(1 − p)). However, if p is small then, even for very large N, this normal approximation is very poor, especially in the tails, see for example (Berry 1941). Also, the support of the random variable with distribution N(Np, Np(1 − p)) barely resembles the support of Bin(N, p), and this could be a serious problem in practice. There are many improvements to the normal approximation, see, e.g., (Brown et al. 2001). However, even corrected normal approximations are rather poor in approximating tails; in particular, the approximations based on the Edgeworth expansion do not guarantee that the approximations to individual binomial probabilities are non-negative, see for example (Petrov 1995) for an excellent account of different approximations in the CLT. Even the shape of the normal approximation N(Np, Np(1 − p)) may be misleading. Consider, for instance, the skewness, which is a widely accepted characteristic of the asymmetry of distributions.
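The skewness of N(0, 1) is zero, whereas the skewness of $[\mathrm{Bin}(\textcircled{1}, p) - \textcircled{1}p]/\sqrt{\textcircled{1}p(1-p)}$ is $\gamma_1 = (1 - 2p)/\sqrt{\textcircled{1}p(1-p)}$. As an example, for p = λ/① we have $\gamma_1 = 1/\sqrt{\lambda} + O(\textcircled{1}^{-1})$, which shows that even if N is very large, the binomial distribution Bin(N, p) can still be very asymmetric for small p, even after the renormalization.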
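The skewness claim can be checked numerically for finite N. The sketch below is only an illustration: it uses the standard closed form $(1-2p)/\sqrt{Np(1-p)}$ for the skewness of Bin(N, p), and the function names and the choice λ = 4 are assumptions made here. It compares the closed form with the skewness computed directly from the pmf, and with the limiting value $1/\sqrt{\lambda}$.

```python
import math

def binom_pmf(N, p, x):
    """Exact Bin(N, p) probability at x, evaluated on the log scale."""
    log_b = (math.lgamma(N + 1) - math.lgamma(x + 1) - math.lgamma(N - x + 1)
             + x * math.log(p) + (N - x) * math.log1p(-p))
    return math.exp(log_b)

def skewness_from_pmf(N, p):
    """Skewness of Bin(N, p) computed directly from its pmf."""
    pmf = [binom_pmf(N, p, x) for x in range(N + 1)]
    mean = sum(x * b for x, b in enumerate(pmf))
    var = sum((x - mean) ** 2 * b for x, b in enumerate(pmf))
    third = sum((x - mean) ** 3 * b for x, b in enumerate(pmf))
    return third / var ** 1.5

def skewness_closed_form(N, p):
    """Standard closed form for the skewness of Bin(N, p)."""
    return (1 - 2 * p) / math.sqrt(N * p * (1 - p))

lam = 4.0  # illustrative value of lambda = N p (an assumption of this sketch)
for N in (100, 1000, 10000):
    p = lam / N
    print(N, skewness_from_pmf(N, p), skewness_closed_form(N, p), 1 / math.sqrt(lam))
```

As N grows with λ fixed, both computed values approach $1/\sqrt{\lambda}$ rather than zero, which is the asymmetry the text refers to.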
Bearing in mind that the normal approximation to Bin(N, p) cannot be suitably corrected if p is small, in Sect. 3 we will concentrate on correcting the Poisson approximation to Bin(N, p), assuming that N is very large but p is of order $c/N^{\alpha}$ with finite c > 0 and 1/2 < α ≤ 1.
One of the central concepts used below is the concept of grossone which has been introduced in Sergeyev (2013), developed in a series of papers by Ya. Sergeyev and coauthors and recently comprehensively reviewed in Sergeyev (2017). Grossone can be defined axiomatically, see (Sergeyev 2017). The two main axioms are given below.
Axiom 2 (Divisibility) For any finite positive integer n, ① is divisible by n.
The grossone models infinity. Similarly, quantities like 1/① and 1/①² model infinitesimals. These models, as comprehensively discussed in Sergeyev (2017), are very useful as theoretical models and as models of numerical infinity and infinitesimals. A very attractive feature of these numerical infinities and infinitesimals is the possibility to operate with them in numerical fashion, exactly as with numbers (rather than with symbols, as in MAPLE); this feature is the key concept of the 'infinity computer' discussed in many publications of Ya. Sergeyev and coauthors, see (Sergeyev 2009, 2017). In mathematics, a more common approach to modeling infinitesimal quantities is to use the framework of non-standard analysis. The non-standard analysis approach for modeling infinitesimal probabilities has been recently discussed in Benci et al. (2018). Modeling infinitesimal probabilities with 1/① and similar quantities involving ① has also attracted serious attention, see (Calude and Dumitrescu 2020; Sergeyev 2017; Rizza 2018). One of the attractive features of the grossone-based approach is that one may simultaneously work with infinitesimal probabilities of different orders, like 1/① and 1/①². It should be stressed that the grossone-based methodology is different from the approach based on non-standard analysis, see (Sergeyev 2019).

Deconvolution and infinite divisibility of a discrete r.v.
The concepts of deconvolution of an r.v. and its infinite divisibility are closely related. An r.v. ξ can be deconvoluted if it can be represented as $\xi \stackrel{d}{=} \xi_1 + \xi_2$, where $\xi_1$ and $\xi_2$ are independent non-degenerate r.v. which are not necessarily identically distributed. Let $\varphi(t) = E \exp(it\xi)$ be the characteristic function (c.f.) of ξ. Clearly, ξ can be deconvoluted if and only if φ(t) can be written as a product of two or more characteristic functions of non-degenerate r.v. In Sect. 2.2, we consider the case where ξ is a discrete uniform r.v.
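A small finite example may help to fix the idea of deconvolution. The sketch below assumes that DU(6) is the uniform distribution on {0, 1, …, 5} and checks, with exact rational arithmetic, that it is the distribution of a sum of two independent non-degenerate r.v. with different distributions; the helper `convolve` and the particular choice of components are illustrative assumptions.

```python
from fractions import Fraction

def convolve(pmf_a, pmf_b):
    """pmf of the sum of two independent r.v. supported on {0, 1, ...}."""
    out = [Fraction(0)] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out

# xi_1 is uniform on {0, 1, 2}; xi_2 is uniform on {0, 3}
pmf_xi1 = [Fraction(1, 3)] * 3
pmf_xi2 = [Fraction(1, 2), Fraction(0), Fraction(0), Fraction(1, 2)]

pmf_sum = convolve(pmf_xi1, pmf_xi2)
print(pmf_sum)  # every value 0..5 receives mass 1/6
```

The two components correspond to writing 6 = 3 · 2; swapping the roles of the factors (uniform on {0, 1} plus uniform on {0, 2, 4}) gives a different decomposition of the same distribution.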
Let now ξ be a discrete r.v. taking values 0, 1, … with probabilities $p_j = \Pr(\xi = j)$, j = 0, 1, … A non-degenerate discrete r.v. ξ with finite support cannot be infinitely divisible. In particular, discrete uniform DU(n) and binomial Bin(N, p) r.v. are not infinitely divisible. Note that the c.f. $\varphi_{DU(n)}(t)$ of ξ ∼ DU(n), the discrete uniform distribution on {0, 1, …, n−1}, and the c.f. $\varphi_{Bin(N,p)}(t)$ of ξ ∼ Bin(N, p) are

$$\varphi_{DU(n)}(t) = \frac{1 - e^{itn}}{n\,(1 - e^{it})} \tag{1}$$

and

$$\varphi_{Bin(N,p)}(t) = \left(1 - p + p e^{it}\right)^{N}, \tag{2}$$

see formulas (2.22) and (5.5) in Balakrishnan and Nevzorov (2004). A very general sufficient condition for the infinite divisibility of a discrete r.v. has been established in Warde and Katti (1971):

Theorem 1 (see Theorem 2.1 in Warde and Katti (1971)). If $p_0 \neq 0$, $p_1 \neq 0$ and the ratios $p_{j+1}/p_j$ form a monotonically non-decreasing sequence (j = 0, 1, …), then the r.v. ξ is infinitely divisible.
One of our aims in this paper is to generalize the concept of infinite divisibility to the case when the vague ∞ is replaced with the rigid ① and to check whether Theorem 1 can still be applied. In Sect. 2.2 below, we consider DU(①) and show that the formal extension of Theorem 1 to random variables defined on {0, 1, …, ①−1} or {0, 1, …, ①} fails. On the other hand, as shown below in this section, the formal extension of Theorem 1 to Bin(①, p) cannot be applied, but the distribution Bin(①, p) is infinitely divisible.
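Consider ξ ∼ Bin(①, p) defined on {0, 1, …, ①} with

$$p_j = \Pr(\xi = j) = \binom{\textcircled{1}}{j} p^j (1-p)^{\textcircled{1}-j}, \quad j = 0, 1, \ldots, \textcircled{1}.$$

For the ratios $p_{j+1}/p_j$, we have

$$\frac{p_{j+1}}{p_j} = \frac{(\textcircled{1} - j)\,p}{(j+1)(1-p)} = \frac{\textcircled{1}\,p}{(j+1)(1-p)} - \frac{j\,p}{(j+1)(1-p)}. \tag{4}$$

As ① is much larger than j for finite j, for such j we can neglect the second term in (4) and we can clearly see that, at least for finite j, the ratios $p_{j+1}/p_j$ are decreasing with j. The formal extension of Theorem 1 is thus not applicable, but ξ ∼ Bin(①, p) is clearly infinitely divisible if we assume Axiom 2. This is a direct consequence of (2) with N = ①: for any finite positive integer n, ①/n is an integer by Axiom 2, so ξ has the same distribution as a sum of n independent Bin(①/n, p) random variables.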
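Both observations have a finite analogue that can be verified exactly. The sketch below replaces ① by a finite N divisible by n (here N = 12 and n = 3, values chosen purely for illustration) and checks that the ratios $p_{j+1}/p_j$ decrease in j and that Bin(N, p) coincides with the n-fold convolution of Bin(N/n, p). Function names are assumptions of this sketch.

```python
from fractions import Fraction
from math import comb

def binom_pmf(N, p):
    """Exact pmf of Bin(N, p) as a list of Fractions."""
    return [comb(N, j) * p**j * (1 - p)**(N - j) for j in range(N + 1)]

def convolve(pmf_a, pmf_b):
    """pmf of the sum of two independent r.v. supported on {0, 1, ...}."""
    out = [Fraction(0)] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out

N, n, p = 12, 3, Fraction(1, 4)  # finite stand-ins; N plays the role of grossone
pmf = binom_pmf(N, p)

# Ratios p_{j+1}/p_j are strictly decreasing, so Theorem 1 does not apply
ratios = [pmf[j + 1] / pmf[j] for j in range(N)]
ratios_decreasing = all(r1 > r2 for r1, r2 in zip(ratios, ratios[1:]))

# n-fold convolution of Bin(N/n, p) reproduces Bin(N, p)
part = binom_pmf(N // n, p)
total = [Fraction(1)]
for _ in range(n):
    total = convolve(total, part)

print(ratios_decreasing, total == pmf)
```

For finite N this divisibility holds only for those n that divide N; the divisibility axiom is what makes it available for every finite n when N = ①.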
All results of Zhigljavsky et al. (2016) can be extended to the case n = ①. Likewise, we arrive at the conclusion that ξ ∼ DU(①) cannot be represented as a sum $\xi \stackrel{d}{=} \xi_1 + \cdots + \xi_m$, where $\xi_1, \ldots, \xi_m$ are independent, identically distributed non-degenerate random variables (1 < m ≤ ①). This conclusion seems to contradict Theorem 1 of Sect. 2.1. However, the proof of Theorem 1 of Sect. 2.1 (which is Theorem 2.1 of Warde and Katti (1971)) cannot be extended to the case when Pr(ξ = j + 1)/Pr(ξ = j) = 1 for all j since, for the validity of formula (3) of Warde and Katti (1971), at least one of the inequalities $p_{j+1}/p_j \ge p_j/p_{j-1}$ should be strict.
For ξ ∼ DU(n), the number of different decompositions $\xi \stackrel{d}{=} \xi_1 + \cdots + \xi_m$ with independent non-degenerate random variables $\xi_1, \ldots, \xi_m$ is equal to the number of all ordered factorizations of n. In view of Hille (1936) and Chor et al. (2000), the number of all ordered factorizations of n (and hence the number of different decompositions of ξ ∼ DU(n) as sums of independent non-degenerate random variables) may reach the order $n^{\rho}$, where ρ ≈ 1.72865 is defined as the solution of the equation ζ(ρ) = 2 and ζ(·) is the classical Riemann zeta-function ζ(s); this function is unambiguously defined for all s with Re(s) > 1. Assuming the grossone divisibility axiom, we thus expect the number of all ordered factorizations of ① to be much larger than ① and to reach $\textcircled{1}^{\rho}$ with ρ ≈ 1.72865. Note that this is not a precise statement, as the divisibility axiom does not give us enough information about all divisors of ①. Note also that there are difficulties (discussed in Sergeyev (2011, 2017)) regarding the use of the zeta-function in the grossone-based universe, since many different zeta-functions can be distinguished in the latter.
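For finite n both quantities are easy to compute. The following sketch counts ordered factorizations by a simple recursion and solves ζ(ρ) = 2 by bisection. The names `ordered_factorizations`, `zeta` and `solve_rho` are illustrative assumptions, and the zeta function is approximated here by a partial sum with an Euler-Maclaurin tail correction rather than by a library routine. The count includes the trivial one-factor factorization n = n.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def ordered_factorizations(n):
    """Number of ways to write n as an ordered product of integers > 1 (value 1 for n = 1)."""
    if n == 1:
        return 1
    # Choose the first factor d > 1, then factorize the remaining quotient n // d
    return sum(ordered_factorizations(n // d) for d in range(2, n + 1) if n % d == 0)

def zeta(s, terms=1000):
    """Riemann zeta for real s > 1: partial sum plus Euler-Maclaurin tail correction."""
    partial = sum(k ** (-s) for k in range(1, terms + 1))
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** (-s) + s * terms ** (-s - 1) / 12
    return partial + tail

def solve_rho(lo=1.5, hi=2.0, tol=1e-12):
    """Bisection for zeta(rho) = 2; zeta is decreasing on (1, infinity)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if zeta(mid) > 2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for n in (6, 12, 48, 720):
    print(n, ordered_factorizations(n))
print("rho =", solve_rho())
```

For example, 12 has the eight ordered factorizations 12, 2·6, 6·2, 3·4, 4·3, 2·2·3, 2·3·2 and 3·2·2.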

Improving Poisson approximation to the Binomial probabilities
The pmf for Poi(λ), the Poisson distribution with parameter λ, is defined by

$$p_x = e^{-\lambda} \frac{\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots \tag{5}$$

There is a lot of literature on the accuracy of the Poisson approximation to Bin(N, p), see for example (Hodges and Le Cam 1960; Duembgen et al. 2019). If we are not interested in approximating the tails of the binomial distribution, then the Poisson approximation Poi(λ) to Bin(N, p) is rather accurate when λ = pN is not too large. However, unless λ is small enough, Poi(λ) does not approximate the tails of Bin(N, p) well even if N is very large. There are several approaches for correcting the Poisson approximation, including the Stein-Chen method, see (Barbour et al. 1992). Among the asymptotic corrections, the best known is based on the expansion with respect to the Charlier polynomials. This expansion was developed in Uspensky (1931) and is referred to below as expansion (6). In (6), λ = pN, $b_x$ and $p_x$ are defined by (3) and (5), respectively, $\tilde{c}_j(x; \lambda) = \lambda^j c_j(x; \lambda)$, and $c_j(x; \lambda)$ are the Charlier polynomials.

We will derive an alternative improvement to the Poisson approximation which, as we will demonstrate, is very accurate at the lower tail of the binomial distribution Bin(N, p) with very large N and very small p; the value of λ = Np could be rather large but smaller than $\sqrt{N}$. To start with, we rearrange the binomial probability $b_x$ into the form (7), which involves the terms Q, R and T. Assume N is large enough and p = λ/N. For all λx << N, we have an expansion of (7) obtained by using the expansion $e^{-y} = 1 + \sum_{m=1}^{\infty} (-1)^m y^m/m!$ and writing $y^m$ in the form of a multiple sum. Collecting the terms by powers of 1/N, we obtain the series (9), which involves polynomials $Q_j(\lambda)$. The series (9) converges if √λ/N → 0 as N → ∞. For the term R of (7), we have the expansion (10), which involves polynomials $R_j(x)$.

Combining (7)-(10), we obtain the expansion (11) for the probability $b_x$. All the series converge if λx/N → 0 as N → ∞. If λx/N does not tend to 0 as N → ∞, but x/N and √λ/N do, then we recommend using the approximation (12), where the term T of (7) is not expanded.
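The following sketch illustrates the kind of correction discussed above on a finite example. It rests on several assumptions that should be read as such: the factor names Q, R and T are taken from the text, but their explicit forms here ($Q = e^{\lambda}(1-\lambda/N)^N$, $R = \prod_{i<x}(1 - i/N)$, $T = (1-\lambda/N)^{-x}$, so that $b_x = p_x Q R T$ when p = λ/N) are one factorization consistent with the stated dependence of $Q_j$ on λ and of $R_j$ on x; the two-term truncations of Q and R are derived here by expanding logarithms; and the first-order Charlier-type correction uses $\tilde{c}_2(x; \lambda) = (x-\lambda)^2 - x$, which corresponds to the standard normalization of the Charlier polynomials.

```python
import math

def binom_pmf(N, p, x):
    """Exact Bin(N, p) probability at x, evaluated on the log scale."""
    log_b = (math.lgamma(N + 1) - math.lgamma(x + 1) - math.lgamma(N - x + 1)
             + x * math.log(p) + (N - x) * math.log1p(-p))
    return math.exp(log_b)

def poisson_pmf(lam, x):
    """Poi(lam) probability at x."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def charlier_first_order(N, lam, x):
    """Poisson pmf with the first-order correction -c2_tilde / (2N) (assumed normalization)."""
    c2_tilde = (x - lam) ** 2 - x
    return poisson_pmf(lam, x) * (1 - c2_tilde / (2 * N))

def product_approx(N, lam, x):
    """p_x * Q * R * T with Q and R truncated after the 1/N^2 term and T kept exact."""
    s1 = x * (x - 1) / 2                 # sum of i for i < x
    s2 = (x - 1) * x * (2 * x - 1) / 6   # sum of i^2 for i < x
    q = 1 - lam**2 / (2 * N) + (lam**4 / 8 - lam**3 / 3) / N**2
    r = 1 - s1 / N + (s1**2 - s2) / (2 * N**2)
    t = (1 - lam / N) ** (-x)
    return poisson_pmf(lam, x) * q * r * t

N, lam = 10**4, 30.0  # illustrative values, p = lam / N = 0.003
for x in (10, 14, 20, 30):
    b = binom_pmf(N, lam / N, x)
    print(x, poisson_pmf(lam, x) / b,
          charlier_first_order(N, lam, x) / b,
          product_approx(N, lam, x) / b)
```

Each printed ratio is an approximation divided by the exact binomial probability, so values closer to 1 indicate a better approximation.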
Numerical results show that for x in the left tail of the binomial distribution, the approximations (11) and (12) practically coincide if we take enough terms in the expansion for T. Let us rewrite the result (11) in terms of the binomial probabilities of Bin(①, λ/①), keeping only two main terms in the expansion; the result is formula (13). This formula makes sense in the grossone universe. Indeed, both λ and x could be infinitely large, but (λ + x)²/① should be kept infinitesimal, as otherwise the second and third terms on the right-hand side of (13) become large. In particular, the expansion (13) is valid if $\lambda = c_1 \textcircled{1}^{\alpha}$ and $x = c_2 \textcircled{1}^{\beta}$, where $c_1$ and $c_2$ are finite constants and both α and β are smaller than 1/2.
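The role of the condition α, β < 1/2 can be illustrated with finite N standing in for ①. The sketch below is an assumption-laden illustration rather than a reproduction of (13): it uses the leading-order relation $b_x/p_x \approx 1 + (x - (x-\lambda)^2)/(2N)$, derived here by expanding the logarithm of the exact ratio, and takes $\lambda = N^{\alpha}$ and x close to $2N^{\beta}$ with α = β = 0.3.

```python
import math

def exact_ratio(N, lam, x):
    """b_x / p_x for Bin(N, lam/N) against Poi(lam), evaluated on the log scale."""
    log_ratio = lam + (N - x) * math.log1p(-lam / N)
    log_ratio += sum(math.log1p(-i / N) for i in range(x))
    return math.exp(log_ratio)

def two_term_ratio(N, lam, x):
    """Leading-order approximation to b_x / p_x (derived for this sketch)."""
    return 1 + (x - (x - lam) ** 2) / (2 * N)

alpha = beta = 0.3  # both exponents below 1/2
errors = []
for N in (10**4, 10**6, 10**8):
    lam = N ** alpha
    x = round(2 * N ** beta)
    err = abs(two_term_ratio(N, lam, x) / exact_ratio(N, lam, x) - 1)
    errors.append(err)
    print(N, lam, x, err)
```

Although λ and x grow with N, the relative error of the two-term approximation shrinks, because (λ + x)²/N tends to zero for these exponents.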

Numerical study
In Figs. 1, 2 and 3, we demonstrate the accuracy of several approximations for $b_x$ in (3) by plotting the ratio $\tilde{b}_x/b_x$, where $\tilde{b}_x$ is an approximation for $b_x$. We have chosen the following coding for the x axis: "0" is the mean λ = Np of the distribution Bin(N, p) and j = ±1, ±2, … denote the points λ + js, where $s = \sqrt{Np(1-p)}$ is the standard deviation of Bin(N, p).
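The axis coding can be reproduced in a few lines. The sketch below evaluates the ratio $\tilde{b}_x/b_x$ for the uncorrected normal and Poisson approximations at the points λ + js with p = 0.03 and N = 1000. Rounding λ + js to the nearest integer and using the normal density at x as the normal approximation to $b_x$ are assumptions of this sketch, as are all function names.

```python
import math

def binom_pmf(N, p, x):
    """Exact Bin(N, p) probability at x, evaluated on the log scale."""
    log_b = (math.lgamma(N + 1) - math.lgamma(x + 1) - math.lgamma(N - x + 1)
             + x * math.log(p) + (N - x) * math.log1p(-p))
    return math.exp(log_b)

def poisson_pmf(lam, x):
    """Poi(lam) probability at x."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def normal_pdf(mean, sd, x):
    """Density of the normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

N, p = 1000, 0.03
lam = N * p
s = math.sqrt(N * p * (1 - p))

ratios = {}
for j in range(-3, 4):
    x = round(lam + j * s)  # nearest integer to lam + j*s (assumed convention)
    b = binom_pmf(N, p, x)
    ratios[j] = (normal_pdf(lam, s, x) / b, poisson_pmf(lam, x) / b)
    print(j, x, ratios[j])
```

Each line shows the axis code j, the corresponding x, and the pair of ratios (normal, Poisson) relative to the exact binomial probability.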
In Figs. 1, 2 and 3, we have chosen p = 0.03 and N = 10³, 10⁴, 10⁵ so that λ = 30, 300 and 3000. The normal approximation (depicted by the black solid line) is always very poor. Due to incorrect skewness, the normal approximation considerably underestimates the probabilities $b_x$ for x < λ and overestimates them for x > λ. We have tried several improved normal approximations, in particular the ones based on the Edgeworth expansion, but these were not much better and in many cases some of the estimators $\tilde{b}_x$ became negative. The uncorrected Poisson approximation (blue dashed line) is slightly more accurate than the normal one, but it is still quite poor. The corrected Poisson approximations (6), based on the expansion with respect to the Charlier polynomials, significantly improve the Poisson approximation. The first-order Charlier approximation (red dotted line), where we keep only the term $\tilde{c}_2(x; \lambda)/(2N)$ in (6), works well for x within the 3σ-interval; the second-order Charlier approximation is very accurate for x within the 4σ-interval.

In Figs. 4, 5 and 6, we do not use the uncorrected normal and Poisson approximations, as these approximations are poor. We use three approximations based on the expansion (6): the first-order Charlier approximation (red dotted line), the second-order Charlier approximation (green dash-dotted line) and the third-order Charlier approximation (magenta long-dashed line). We also use the expansion (12) (brown dashed line), where we keep the first four terms in the expansions for Q and R.
The corrected Poisson approximations (6), based on the expansion with respect to the Charlier polynomials, significantly improve the Poisson approximation. The first-order Charlier approximation works well for x within the 3σ-interval, but the second-order and especially third-order Charlier approximations are much more accurate. The new approximation (11) is basically exact at the lower tail of the binomial distribution and outperforms the Charlier approximations.

Conclusion
We study properties of two probability distributions defined on the infinite set {0, 1, 2, …} and generalizing the ordinary discrete uniform and binomial distributions. Both extensions use the notion of grossone denoted by ①. The uniform distribution assigns masses 1/① to all points in the set {0, 1, …, ①−1}. For this distribution, we study the problem of decomposing a r.v. ξ with this distribution as a sum $\xi \stackrel{d}{=} \xi_1 + \cdots + \xi_m$, where $\xi_1, \ldots, \xi_m$ are independent non-degenerate r.v. We establish that, under the validity of the grossone divisibility axiom, such decompositions exist, but all r.v. $\xi_j$ in the decomposition $\xi \stackrel{d}{=} \xi_1 + \cdots + \xi_m$ must have different distributions and, as a corollary, that the discrete uniform distribution on the set {0, 1, …, ①−1} is not infinitely divisible, where the natural extension of the notion of infinite divisibility (introduced in Sect. 2.1) is used.
Then, we study the accuracy of different approximations for the probability mass function of the binomial distribution Bin(①, p) with $p = c/\textcircled{1}^{\alpha}$ and 1/2 < α ≤ 1. We demonstrate that the normal and uncorrected Poisson approximations are rather poor and develop a new approximation which is demonstrated to be extremely accurate on the lower tail of Bin(①, p). We compare the accuracy of the developed approximation with the corrected Poisson approximations constructed from the expansion with respect to the Charlier polynomials. The accuracy of the approximations is assessed on the basis of a numerical study. To derive the approximations, we use asymptotic expansions formulated in the standard language, but we translate the final results into the language of grossone.

Compliance with ethical standards
Conflict of interest The authors declare that there is no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.