On a new distribution based on the arccosine function

This note focuses on a new one-parameter unit probability distribution built on the inverse cosine and power functions. A special case of this distribution has the exact inverse cosine function as its probability density function. To our knowledge, despite its obvious mathematical interest, such a probability density function has never been considered in Probability and Statistics. Here, we fill this gap by pointing out the main properties of the proposed distribution, from both theoretical and practical perspectives. Specifically, we provide analytical expressions for its cumulative distribution function, survival function, hazard rate function, raw moments and incomplete moments. The asymptotes and shape properties of the probability density and hazard rate functions are described, as well as the skewness and kurtosis properties, revealing the flexible nature of the new distribution. In particular, it appears to be “round mesokurtic” and “left skewed”. With these features in mind, special attention is given to finding empirical applications of the new distribution to real data sets. Accordingly, the proposed distribution is compared with the well-known power distribution by means of two real data sets.


Introduction and motivation
Continuous probability distributions with bounded support are a treasure for applied statisticians. They make it possible to model bounded characteristics while exploiting as much as possible of the information underlying the data. In the literature on reliability and survival analysis, lifetime distributions with unbounded support are numerous, but much less work is available for the case of support equal to (0, 1). Available examples of distributions with bounded support include the uniform distribution, truncated normal distribution, beta distribution, Kumaraswamy distribution (see [12]), power distribution (see [2]), beta-power distribution (see [7]), log-Lindley distribution (see [10]), unit inverse Gaussian distribution (see [9]), unit Rayleigh distribution (see [3]) and unit power-log distribution (see [6]), among others. However, none of these distributions can claim to model all potential characteristics with unit values efficiently; novel distributions with unique properties still have a place.
In this context, we consider a new distribution with support equal to (0, 1), just like the beta and Kumaraswamy distributions. It shares many properties with the existing distributions but offers some benefits in terms of flexibility. Before presenting it in detail, a brief review of the inverse cosine function is necessary. The inverse cosine function, or "arccosine" function, denoted by arccos(x), is a classic mathematical function satisfying cos(arccos(x)) = x for x ∈ [−1, 1] and arccos(cos(x)) = x for x ∈ [0, π]. It appears in all branches of mathematics, engineering and mathematical physics. This note is based on a remark that does not appear to have been addressed in the literature: the function defined as g(x) = arccos(x) for x ∈ (0, 1) and g(x) = 0 elsewhere has the properties of a true probability density function (pdf). Indeed, it is continuous over ℝ \ {0}, satisfies g(x) ≥ 0 and has integral equal to one, i.e.,
\[ \int_{-\infty}^{+\infty} g(x)\,dx = \int_0^1 \arccos(x)\,dx = \left[ x \arccos(x) - \sqrt{1 - x^2} \right]_0^1 = 1. \]
It is also decreasing and concave over (0, 1), which is not a frequent property among pdfs with support equal to (0, 1). The motivation of this study stems from the lack of information regarding the arccosine function as a pdf and its possible uses in Probability and Statistics. We thus introduce the Arccos distribution, defined by the following simple and flexible one-parameter extension of g(x) as pdf: f(x; α) = α x^{α−1} g(x^α), where α > 0 denotes a shape parameter. That is,
\[ f(x; \alpha) = \alpha x^{\alpha - 1} \arccos(x^{\alpha}), \quad x \in (0, 1), \tag{1} \]
and f(x; α) = 0 for x ∉ (0, 1). Hence, for any event A, the probability that a random variable X with the Arccos distribution belongs to A is given by P(X ∈ A) = ∫_A f(x; α) dx, where P denotes the probability operator. In this paper, we study the main properties of the Arccos distribution. We begin with the analytical expressions and characteristics of its cumulative distribution function (cdf), survival function (sf) and hazard rate function (hrf).
Asymptotes and shapes are examined, revealing the role of α as well as the "round mesokurtic" and "left skewed" nature of the distribution. A full subsection is also devoted to various types of moments, skewness and kurtosis, which are expressed in terms of the standard gamma function. In our statistical investigations, the power distribution is considered as the main competitor. The power distribution is important in lifetime data analysis, particularly in environmental policy, public health, and financial engineering. In this regard, we may refer the reader to [4,15]. Keeping these applications in mind, two real data applications are considered; they show that the Arccos distribution performs favorably against the power distribution.
The rest of the paper is organized as follows. Section 2 is devoted to the main functions and properties of the Arccos distribution. Section 3 focuses on statistical applications. The article ends with concluding discussions and remarks.

Properties
Here, we examine some fundamental properties of the Arccos distribution.

Functions of the proposed distribution
The Arccos distribution is formally defined by the pdf presented in (1). By integrating (1), the cdf of the Arccos distribution is obtained as
\[ F(x; \alpha) = 1 + x^{\alpha} \arccos(x^{\alpha}) - \sqrt{1 - x^{2\alpha}}, \quad x \in (0, 1), \tag{2} \]
with F(x; α) = 0 for x ≤ 0 and F(x; α) = 1 for x ≥ 1. Immediately, we can derive the corresponding sf as
\[ S(x; \alpha) = 1 - F(x; \alpha) = \sqrt{1 - x^{2\alpha}} - x^{\alpha} \arccos(x^{\alpha}), \quad x \in (0, 1), \]
with S(x; α) = 1 for x ≤ 0 and S(x; α) = 0 for x ≥ 1. The closed-form expressions of these two functions are an advantage of the Arccos distribution for various purposes. They are involved in survival, reliability and diverse statistical analyses, among others. The same remark holds for the corresponding hrf, specified by
\[ h(x; \alpha) = \frac{f(x; \alpha)}{S(x; \alpha)} = \frac{\alpha x^{\alpha - 1} \arccos(x^{\alpha})}{\sqrt{1 - x^{2\alpha}} - x^{\alpha} \arccos(x^{\alpha})}, \quad x \in (0, 1), \]
with h(x; α) = 0 for x ∉ (0, 1). In particular, the analytical behavior of h(x; α) plays a determinant role in understanding the statistical features of the Arccos distribution, mainly for data fitting purposes. See, for instance, [1]. The determination of the asymptotes and the analysis of the shapes of these functions are performed in the next subsection.
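As an illustration of the four functions of this subsection, here is a minimal Python sketch. It is not the authors' code (the paper's computations were done in R), and the function names are hypothetical. It assumes the closed-form cdf F(x; α) = 1 + x^α arccos(x^α) − sqrt(1 − x^{2α}), obtained by integrating the pdf (1), and includes a small midpoint-rule integrator to check the cdf against the pdf numerically.

```python
import math

def arccos_pdf(x, alpha):
    """pdf (1): alpha * x**(alpha-1) * arccos(x**alpha) on (0, 1), 0 elsewhere."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return alpha * x ** (alpha - 1.0) * math.acos(x ** alpha)

def arccos_cdf(x, alpha):
    """Closed-form cdf, assumed here to be the integral of the pdf (1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    y = x ** alpha
    return 1.0 + y * math.acos(y) - math.sqrt(1.0 - y * y)

def arccos_sf(x, alpha):
    """Survival function S = 1 - F."""
    return 1.0 - arccos_cdf(x, alpha)

def arccos_hrf(x, alpha):
    """Hazard rate h = f / S on (0, 1), 0 elsewhere."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    s = arccos_sf(x, alpha)
    # Very close to 1, S can underflow to 0 in floating point; h diverges there.
    return math.inf if s <= 0.0 else arccos_pdf(x, alpha) / s

def midpoint_integral(func, a, b, n=20000):
    """Plain midpoint rule, used only as a numerical sanity check."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 3.0):
        numeric = midpoint_integral(lambda x: arccos_pdf(x, alpha), 0.2, 0.6)
        closed = arccos_cdf(0.6, alpha) - arccos_cdf(0.2, alpha)
        print(alpha, numeric, closed)
```

The check integrates over [0.2, 0.6] rather than from 0 because, for α < 1, the pdf is unbounded near 0 and a plain midpoint rule converges slowly there.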

Shape analysis
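The asymptotically equivalent functions and limits of the functions of the Arccos distribution are now determined. First, when x is in the neighborhood of 0, since arccos(x^α) tends to π/2, we have
\[ f(x; \alpha) \sim \frac{\pi}{2} \alpha x^{\alpha - 1}, \quad F(x; \alpha) \sim \frac{\pi}{2} x^{\alpha}, \quad h(x; \alpha) \sim \frac{\pi}{2} \alpha x^{\alpha - 1}. \]
Hence, as x → 0, f(x; α) and h(x; α) tend to +∞ if α < 1, to π/2 if α = 1, and to 0 if α > 1. Second, when x is in the neighborhood of 1, since arccos(y) ∼ \sqrt{2(1 − y)} as y → 1, we have
\[ f(x; \alpha) \sim \alpha \sqrt{2\alpha(1 - x)}, \quad S(x; \alpha) \sim \frac{2\sqrt{2}}{3} \left[\alpha(1 - x)\right]^{3/2}, \quad h(x; \alpha) \sim \frac{3}{2(1 - x)}. \]
Thus, in all circumstances, we have f(1; α) = 0, lim_{x→1} h(x; α) = +∞ and, obviously, lim_{x→1} F(x; α) = 1.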
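For the analytical study of f(x; α), one can notice that, for x ∈ (0, 1),
\[ \frac{\partial}{\partial x} f(x; \alpha) = \alpha x^{\alpha - 2} \left[ (\alpha - 1) \arccos(x^{\alpha}) - \frac{\alpha x^{\alpha}}{\sqrt{1 - x^{2\alpha}}} \right]. \]
For α ≤ 1, this derivative is negative, so f(x; α) is decreasing; according to the asymptotes of f(x; α), the maximum is then reached as x → 0, and it is finite if and only if α = 1. In the case α > 1, the function has a unique maximum point over (0, 1), given as the solution x_m of the following nonlinear equation:
\[ (\alpha - 1) \arccos(x_m^{\alpha}) \sqrt{1 - x_m^{2\alpha}} = \alpha x_m^{\alpha}. \]
Numerical techniques are needed to determine x_m. Thus, the Arccos distribution is unimodal. All these mathematical facts are illustrated in Fig. 1.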
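The numerical determination of the mode can be sketched as follows in Python. This is a hedged illustration, not the authors' procedure: it assumes that, for α > 1, the mode x_m solves (α − 1) arccos(x^α) sqrt(1 − x^{2α}) = α x^α (the condition obtained by setting the derivative of the pdf (1) to zero), and it uses plain bisection. The names `mode_equation` and `arccos_mode` are hypothetical.

```python
import math

def arccos_pdf(x, alpha):
    """pdf (1) on (0, 1); returns 0 outside the support."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return alpha * x ** (alpha - 1.0) * math.acos(x ** alpha)

def mode_equation(x, alpha):
    """Left side minus right side of the assumed first-order condition.

    The expression is positive near x = 0 and equals -alpha at x = 1
    when alpha > 1, and it is decreasing in x, so it has a single root.
    """
    y = x ** alpha
    return (alpha - 1.0) * math.acos(y) * math.sqrt(1.0 - y * y) - alpha * y

def arccos_mode(alpha, tol=1e-12):
    """Bisection search for the interior mode x_m (requires alpha > 1)."""
    if alpha <= 1.0:
        raise ValueError("an interior mode exists only for alpha > 1")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mode_equation(mid, alpha) > 0.0:
            lo = mid  # pdf still increasing at mid: the mode is to the right
        else:
            hi = mid  # pdf decreasing at mid: the mode is to the left
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for alpha in (1.5, 2.0, 5.0):
        xm = arccos_mode(alpha)
        print(alpha, xm, arccos_pdf(xm, alpha))
```

Bisection is chosen for robustness; any bracketing root finder would serve the same purpose.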
In Fig. 1, as anticipated, various shapes for the pdf are observed, with the maximum at x = 0 or x = x_m, depending on the value of α. We also note various decreasing and left-skewed shapes, which are advantageous for the analysis of data sets presenting such tendencies.
The hrf is more complicated to study from the analytical point of view. We thus perform a graphical analysis to understand its shape behavior. The graphs of the hrf for different values for α are given in Fig. 2.
From Fig. 2, it is evident that the hrf of the Arccos distribution can be increasing or bathtub shaped.

Raw moments
Let X be a random variable following the Arccos distribution, i.e., with pdf and cdf given as (1) and (2), respectively. Then, since the support of X is equal to (0, 1), the raw moments of X always exist. The following proposition provides their mathematical expression in terms of the standard gamma function.

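Proposition 2.1 The r-th raw moment of X is given by
\[ \upsilon_r = E(X^r) = \frac{\sqrt{\pi}\, \alpha\, \Gamma\!\left(\frac{r}{2\alpha} + 1\right)}{2 (r + \alpha)\, \Gamma\!\left(\frac{r}{2\alpha} + \frac{3}{2}\right)}, \]
where E denotes the expectation operator and Γ(x) denotes the gamma function (i.e., Γ(x) = ∫_0^{+∞} t^{x−1} e^{−t} dt for x > 0).
Proof Owing to the definition of $\upsilon_r$ and an integration by parts, we get
\[ \upsilon_r = \int_0^1 x^r f(x; \alpha)\,dx = \alpha \int_0^1 x^{r + \alpha - 1} \arccos(x^{\alpha})\,dx = \frac{\alpha^2}{r + \alpha} \int_0^1 \frac{x^{r + 2\alpha - 1}}{\sqrt{1 - x^{2\alpha}}}\,dx, \]
the boundary term being equal to zero. Now, applying the change of variable y = x^{2α} and using the following well-known properties: ∫_0^1 y^{a−1} (1 − y)^{b−1} dy = Γ(a)Γ(b)/Γ(a + b) for a, b > 0, and Γ(1/2) = √π, we obtain
\[ \int_0^1 \frac{x^{r + 2\alpha - 1}}{\sqrt{1 - x^{2\alpha}}}\,dx = \frac{1}{2\alpha} \int_0^1 y^{\frac{r}{2\alpha}} (1 - y)^{-\frac{1}{2}}\,dy = \frac{\sqrt{\pi}\, \Gamma\!\left(\frac{r}{2\alpha} + 1\right)}{2\alpha\, \Gamma\!\left(\frac{r}{2\alpha} + \frac{3}{2}\right)}. \]
By putting all the above equalities together, we get the stated expression. This proves Proposition 2.1.
From Proposition 2.1, by taking r = 1, the mean of X becomes explicit, and it is given as
\[ \mu = \upsilon_1 = \frac{\sqrt{\pi}\, \alpha\, \Gamma\!\left(\frac{1}{2\alpha} + 1\right)}{2 (1 + \alpha)\, \Gamma\!\left(\frac{1}{2\alpha} + \frac{3}{2}\right)}, \]
and, by using r = 2 and the equation above, the variance of X can be expressed as
\[ \sigma^2 = \upsilon_2 - \upsilon_1^2 = \frac{\sqrt{\pi}\, \alpha\, \Gamma\!\left(\frac{1}{\alpha} + 1\right)}{2 (2 + \alpha)\, \Gamma\!\left(\frac{1}{\alpha} + \frac{3}{2}\right)} - \mu^2. \]
The standard deviation σ follows by taking the square root of the above equation. Also, one can express the classical skewness and kurtosis coefficients of X (CS and CK) by applying Proposition 2.1 with well-chosen values of r; they are given by
\[ \mathrm{CS} = \frac{\upsilon_3 - 3\upsilon_1\upsilon_2 + 2\upsilon_1^3}{\sigma^3}, \qquad \mathrm{CK} = \frac{\upsilon_4 - 4\upsilon_1\upsilon_3 + 6\upsilon_1^2\upsilon_2 - 3\upsilon_1^4}{\sigma^4}, \]
respectively. Table 1 indicates numerical values for the first four raw moments, variance, CS and CK, for some selected values of α.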
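The quantities of this subsection can be evaluated with a few lines of Python. The sketch below is illustrative only. It assumes the closed form of the r-th raw moment, √π α Γ(r/(2α) + 1) / [2(r + α) Γ(r/(2α) + 3/2)], which follows from the integration by parts and beta-function argument of the proof, together with the usual moment-based definitions of variance, skewness and kurtosis. The function names are hypothetical, and no attempt is made to reproduce the entries of Table 1.

```python
import math

def raw_moment(r, alpha):
    """r-th raw moment under the assumed gamma-function closed form.

    lgamma is used instead of gamma so that small alpha (large arguments)
    does not overflow.
    """
    b = r / (2.0 * alpha)
    log_ratio = math.lgamma(b + 1.0) - math.lgamma(b + 1.5)
    return math.sqrt(math.pi) * alpha * math.exp(log_ratio) / (2.0 * (r + alpha))

def moment_summary(alpha):
    """Mean, variance, skewness (CS) and kurtosis (CK) from raw moments."""
    m1, m2, m3, m4 = (raw_moment(r, alpha) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    sd = math.sqrt(var)
    cs = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / sd ** 3
    ck = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / sd ** 4
    return {"mean": m1, "variance": var, "CS": cs, "CK": ck}

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0, 5.0):
        print(alpha, moment_summary(alpha))
```

Two quick checks of the formula are available for α = 1, where the pdf is arccos(x) itself: the mean equals π/8 and the second raw moment equals 2/9.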
From Table 1, we see that, when α > 1, the value of σ² decreases as α increases, and when α < 1, the value of σ² increases as α increases. On the other hand, the values of CS decrease as α gets larger (when either α > 1 or α < 1). Furthermore, the values of CK decrease as α increases when α < 1, and increase as α increases when α > 1. Also, positive and negative values for CS are observed, showing that the Arccos distribution can be right-skewed or left-skewed. The value of CK can be close to 3, a bit smaller, or very far away, demonstrating the flexibility of the tailedness of the Arccos distribution.

Incomplete moments
Again, let X be a random variable following the Arccos distribution, i.e., with pdf and cdf given by (1) and (2), respectively. The incomplete moments of X always exist and are involved in a multitude of important probability functions, such as the residual life function with its raw moments, the reversed residual life function with its raw moments, the Lorenz curve, etc. The complete list can be found in [8].
Here, we determine the expression of the incomplete moments of X in an analytical way.

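Proposition 2.2 The r-th incomplete moment of X at t ∈ [0, 1] is given by
\[ \upsilon_r(t) = E\left(X^r 1_{\{X \le t\}}\right) = \frac{\alpha\, t^{r + \alpha}}{r + \alpha} \left[ \arccos(t^{\alpha}) + \frac{\alpha\, t^{\alpha}}{r + 2\alpha}\; {}_2F_1\!\left(\frac{1}{2}, \frac{r}{2\alpha} + 1; \frac{r}{2\alpha} + 2; t^{2\alpha}\right) \right], \]
where 1_A denotes the indicator function of the event A and ${}_2F_1(a, b; c; z)$ denotes the Gauss hypergeometric function.
Proof We follow the lines of the proof of Proposition 2.1. Owing to the definition of $\upsilon_r(t)$ and an integration by parts, we get
\[ \upsilon_r(t) = \alpha \int_0^t x^{r + \alpha - 1} \arccos(x^{\alpha})\,dx = \frac{\alpha\, t^{r + \alpha}}{r + \alpha} \arccos(t^{\alpha}) + \frac{\alpha^2}{r + \alpha} \int_0^t \frac{x^{r + 2\alpha - 1}}{\sqrt{1 - x^{2\alpha}}}\,dx. \]
Applying the change of variable y = (x/t)^{2α}, we obtain
\[ \int_0^t \frac{x^{r + 2\alpha - 1}}{\sqrt{1 - x^{2\alpha}}}\,dx = \frac{t^{r + 2\alpha}}{2\alpha} \int_0^1 y^{\frac{r}{2\alpha}} \left(1 - t^{2\alpha} y\right)^{-\frac{1}{2}}\,dy = \frac{t^{r + 2\alpha}}{r + 2\alpha}\; {}_2F_1\!\left(\frac{1}{2}, \frac{r}{2\alpha} + 1; \frac{r}{2\alpha} + 2; t^{2\alpha}\right), \]
where the last equality follows from the Euler integral representation of the Gauss hypergeometric function. By combining the above equalities, we get the stated expression. This ends the proof of Proposition 2.2.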
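It should be noted that the Gauss hypergeometric function is implemented in most mathematical software, so the numerical evaluation of $\upsilon_r(t)$ is quite manageable. Also, as an example of application, one can use $\upsilon_r(t)$ to define the mean reversed residual life function. In our context, for t ∈ (0, 1), this function is defined by
\[ E(t - X \mid X \le t) = t - \frac{\upsilon_1(t)}{F(t; \alpha)}. \]
More considerations about the incomplete moments can be found in [8]. As a secondary remark, after some developments, one can show that $\upsilon_r = \lim_{t \to 1} \upsilon_r(t)$. The next result offers an alternative approach to Proposition 2.2; it presents a power series expansion for $\upsilon_r(t)$ that can be used for approximation purposes, among others.
Proposition 2.3 The r-th incomplete moment of X at t ∈ [0, 1] can be expanded as
\[ \upsilon_r(t) = \frac{\pi}{2} \frac{\alpha\, t^{\alpha + r}}{\alpha + r} - \alpha \sum_{k=0}^{+\infty} \frac{(2k)!}{4^k (k!)^2 (2k + 1)} \frac{t^{r + 2\alpha(k + 1)}}{r + 2\alpha(k + 1)}. \]
This expansion follows from the power series arccos(y) = π/2 − \sum_{k=0}^{+\infty} \frac{(2k)!}{4^k (k!)^2 (2k + 1)} y^{2k + 1}, valid for |y| ≤ 1, applied with y = x^α and integrated term by term over (0, t).
Based on Proposition 2.3, the following simple approximation can be useful for practical aims:
\[ \upsilon_r(t) \approx \frac{\pi}{2} \frac{\alpha\, t^{\alpha + r}}{\alpha + r} - \alpha \sum_{k=0}^{K} \frac{(2k)!}{4^k (k!)^2 (2k + 1)} \frac{t^{r + 2\alpha(k + 1)}}{r + 2\alpha(k + 1)}, \]
with K large enough. Since all the terms of the sum are non-negative, the following simple inequality holds: $\upsilon_r(t) \le (\pi/2)\, \alpha\, t^{\alpha + r}/(\alpha + r)$.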
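The truncated series can be checked numerically. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes the expansion obtained by integrating the standard power series of arccos term by term, namely (π/2) α t^{α+r}/(α+r) minus α times the sum over k of c_k t^{r+2α(k+1)}/(r+2α(k+1)), with c_k = (2k)!/(4^k (k!)^2 (2k+1)), and compares it with a midpoint-rule integral of x^r f(x; α) over (0, t). The function names and the default truncation level K are hypothetical choices.

```python
import math

def incomplete_moment_series(r, t, alpha, K=200):
    """Truncated power series for the r-th incomplete moment at t in [0, 1]."""
    lead = 0.5 * math.pi * alpha * t ** (alpha + r) / (alpha + r)
    total = 0.0
    a_k = 1.0  # a_k = (2k)! / (4**k * (k!)**2), starting at k = 0
    for k in range(K + 1):
        power = r + 2.0 * alpha * (k + 1)
        total += a_k / (2 * k + 1) * t ** power / power
        a_k *= (2 * k + 1) / (2.0 * (k + 1))  # recurrence a_k -> a_{k+1}
    return lead - alpha * total

def incomplete_moment_numeric(r, t, alpha, n=20000):
    """Midpoint-rule integral of x**r * f(x; alpha) over (0, t), as a check."""
    h = t / n
    s = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        s += x ** (r + alpha - 1.0) * math.acos(x ** alpha)
    return alpha * h * s

def upper_bound(r, t, alpha):
    """Leading term of the series, which bounds the incomplete moment."""
    return 0.5 * math.pi * alpha * t ** (alpha + r) / (alpha + r)

if __name__ == "__main__":
    r, t, alpha = 1, 0.7, 2.0
    print(incomplete_moment_series(r, t, alpha),
          incomplete_moment_numeric(r, t, alpha),
          upper_bound(r, t, alpha))
```

For t well below 1 the series converges geometrically, so a modest K suffices; at t = 1 the terms decay only polynomially and a much larger K is needed for the same accuracy.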

Applications
In this section, we show examples where the Arccos distribution is applicable. All computations were done using the R software, which is a free software environment for statistical computing and graphics (see [14]).

Data sets
We consider the following two real data sets to evaluate the fits of the Arccos distribution.
• Data set 1: the source of the first data set is [11]. It concerns the times to infection of kidney dialysis patients, in months. The data set is: 2.5, 2.5, 3.5, 3.5, 3.5, 4.5, 5.5, 6.5, 6.5, 7.5, 7.5, 7.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 12.5, 13.5, 14.5, 14.5, 21.5, 21.5, 22.5, 22.5, 25.5, 27.5. Following the spirit of [3], we rescale these data by dividing them by 30, yielding values in (0, 1). Here, 30 is an arbitrary number chosen slightly higher than the maximum value of the data, which is 27.5; other numbers can be considered. After this transformation, each observation x is replaced by x/30; for instance, 2.5 becomes 0.0833 and 27.5 becomes 0.9167 (rounded to four decimals).
• Data set 2: as a second application, we consider a real data set on the failure times (in hours) of the air conditioning system of an airplane, obtained from [13].
For previous research on these data sets, see [3].

Criteria of comparison
Based on data sets 1 and 2, we aim to evaluate and compare the fits of the Arccos distribution with the fits of a reference distribution: the power distribution. We recall that the power distribution has the following cdf:
\[ G(x; \alpha) = x^{\alpha}, \quad x \in (0, 1), \]
with G(x; α) = 0 for x ≤ 0 and G(x; α) = 1 for x ≥ 1, where α > 0 is a shape parameter. For α = 1, we obtain as a special case the uniform distribution defined on the interval (0, 1).
We estimate the unknown parameters by the standard maximum likelihood method, and thus obtain the maximum likelihood estimates (MLEs). We use these estimates to derive estimates of the unknown functions, such as the pdf and cdf. The fits are then compared through the standard error (SE) of the MLE, the estimated log-likelihood (logL), the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the Kolmogorov-Smirnov (K-S) statistic with its p value, and the statistics A* and W*. Some of the fundamentals of these mathematical approaches and tools are summarized below. In the setting of the Arccos distribution, the MLE of α is obtained as $\hat{\alpha} = \operatorname{argmax}_{\alpha} L(\alpha)$, where
\[ L(\alpha) = \prod_{j=1}^{n} f(x_j; \alpha) \]
and x_1, . . . , x_n denote the data. The SE corresponding to $\hat{\alpha}$ is equal to
\[ \mathrm{SE} = \left[ \left\{ -\frac{\partial^2 \log L(\alpha)}{\partial \alpha^2} \right\}^{-1} \Big|_{\alpha = \hat{\alpha}} \right]^{1/2}. \]
We also have logL = log L($\hat{\alpha}$), AIC = −2 logL + 2k, and BIC = −2 logL + k log(n), with k = 1 corresponding to the number of estimated parameters. The K-S statistic is specified by
\[ \text{K-S} = \max_{j = 1, \ldots, n} \left\{ \frac{j}{n} - F(x_{(j)}; \hat{\alpha}),\; F(x_{(j)}; \hat{\alpha}) - \frac{j - 1}{n} \right\}, \]
where x_{(1)}, . . . , x_{(n)} are the ordered values of x_1, . . . , x_n. Based on the null hypothesis H_0: "the data set values are from the Arccos distribution", the corresponding p value is defined as p value = P(K ≥ K-S), where K is a random variable following the appropriate K-S distribution. Concerning the Anderson-Darling (A*) and Cramér-von Mises (W*) statistics, we refer to [5] for their definitions and for more details on all these statistical measures. Tables 2 and 3 provide a descriptive summary for the fitted Arccos and power distributions for data sets 1 and 2, respectively. From the obtained results, the smallest −logL, AIC, BIC, K-S, A*, W* and the highest p values are obtained for the Arccos distribution. Therefore, the Arccos distribution gives a better fit to both data sets according to all these measures, and hence can be an adequate distribution for these data.
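The fitting procedure can be sketched in Python for data set 1. This is a hedged illustration only: the paper's computations were done in R, and the optimizer, search bracket and function names below are assumptions of this sketch. It maximizes the Arccos log-likelihood by golden-section search on the bracket (0.01, 20), relying on the log-likelihood having a single interior maximum there; it uses the standard closed-form MLE of the power distribution, −n / Σ log x_j; and it computes logL, AIC, BIC (with k = 1) and the K-S statistic as defined above. It does not compute SE, p values, A* or W*, and it is not claimed to reproduce the entries of Table 2.

```python
import math

# Data set 1 from the text (times to infection, in months), rescaled by 30.
RAW = [2.5, 2.5, 3.5, 3.5, 3.5, 4.5, 5.5, 6.5, 6.5, 7.5, 7.5, 7.5, 7.5, 8.5,
       9.5, 10.5, 11.5, 12.5, 12.5, 13.5, 14.5, 14.5, 21.5, 21.5, 22.5, 22.5,
       25.5, 27.5]
DATA = [v / 30.0 for v in RAW]

def arccos_cdf(x, alpha):
    """Arccos cdf on (0, 1), assumed closed form obtained by integrating (1)."""
    y = x ** alpha
    return 1.0 + y * math.acos(y) - math.sqrt(1.0 - y * y)

def power_cdf(x, alpha):
    """Power distribution cdf on (0, 1)."""
    return x ** alpha

def arccos_loglik(alpha, data):
    """Log-likelihood of the Arccos distribution with pdf (1)."""
    return sum(math.log(alpha) + (alpha - 1.0) * math.log(x)
               + math.log(math.acos(x ** alpha)) for x in data)

def power_loglik(alpha, data):
    """Log-likelihood of the power distribution, pdf alpha * x**(alpha-1)."""
    return sum(math.log(alpha) + (alpha - 1.0) * math.log(x) for x in data)

def golden_max(func, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal function."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - invphi * (hi - lo)
    d = lo + invphi * (hi - lo)
    fc, fd = func(c), func(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - invphi * (hi - lo)
            fc = func(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + invphi * (hi - lo)
            fd = func(d)
    return 0.5 * (lo + hi)

def ks_statistic(cdf, data):
    """max_j { j/n - F(x_(j)), F(x_(j)) - (j-1)/n } over the ordered sample."""
    xs = sorted(data)
    n = len(xs)
    return max(max((j + 1) / n - cdf(x), cdf(x) - j / n)
               for j, x in enumerate(xs))

def fit_summary(data):
    """MLE, logL, AIC, BIC and K-S for both one-parameter models (k = 1)."""
    n = len(data)
    a_arc = golden_max(lambda a: arccos_loglik(a, data), 0.01, 20.0)
    a_pow = -n / sum(math.log(x) for x in data)  # closed-form power MLE
    out = {}
    for name, a, ll, cdf in (("arccos", a_arc, arccos_loglik, arccos_cdf),
                             ("power", a_pow, power_loglik, power_cdf)):
        logl = ll(a, data)
        out[name] = {"alpha": a, "logL": logl, "AIC": -2.0 * logl + 2.0,
                     "BIC": -2.0 * logl + math.log(n),
                     "KS": ks_statistic(lambda x: cdf(x, a), data)}
    return out

if __name__ == "__main__":
    for model, stats in fit_summary(DATA).items():
        print(model, stats)
```

A derivative-free search is used here to keep the sketch dependency-free; in R, a one-dimensional optimizer such as `optimize` would be a natural counterpart.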

Analysis
Moreover, Figs. 3a and 4a present the estimated pdfs for data sets 1 and 2, respectively. In addition, Figs. 3b and 4b show the comparison of the estimated cdfs for the two distributions with the empirical cdfs.
For the two considered data sets, the Arccos distribution captures the general pattern of the histograms. The same holds for the estimated cdfs and the empirical cdfs. In summary, of the two distributions compared, the Arccos distribution gives the better fit.

Conclusions
In this paper, we propose and study an original distribution on (0, 1), called the Arccos distribution. A detailed study of the asymptotes and shape properties of its main functions is provided, together with a moment analysis giving analytical expressions for the raw and incomplete moments. As previously stated, one notable feature of the Arccos distribution is its ability to have an increasing or bathtub-shaped hrf with a flat region, which may be very useful in applied areas. In this sense, it is more flexible than the power distribution. This is supported by two applications to real data, where the Arccos distribution consistently provides a better fit than the power distribution.