Topp-Leone Cauchy Family of Distributions with Applications in Industrial Engineering

The goal of this research is to create a new general family of Topp-Leone distributions called the Topp-Leone Cauchy Family (TLC), which is exceedingly versatile and results from a careful merging of the Topp-Leone and Cauchy distribution families. Some of the new family’s theoretical properties are investigated using specific results on stochastic functions, quantile functions and associated measures, generic moments, probability weighted moments, and Shannon entropy. A parametric statistical model is built from a specific member of the family. The maximum likelihood technique is used to estimate the model’s unknown parameters. Furthermore, to emphasize the new family’s practical potential, we applied our model to two real-world data sets and compared it to existing rival models.

Most statistical distributions are limited in how well they adapt to different types of data sets. Indeed, some data sets exhibit specific characteristics such as high skewness, high kurtosis, heavy tails, inverted-J shapes, and multimodality. Distribution generators make it possible to accommodate these characteristics. In this research, our objective is to create a novel family of distributions by merging the Topp-Leone and Cauchy distribution families, and to show how distributions from this family adapt to real-life data. The Topp-Leone distribution is widely used because of its mathematical simplicity and the versatility of its probabilistic functions. Its cumulative distribution function (CDF) is given by:

$$F(x) = x^{\alpha}(2 - x)^{\alpha}, \qquad (1)$$

where $\alpha > 0$ and $0 \le x \le 1$, and the probability density function (PDF) is:

$$f(x) = 2\alpha\, x^{\alpha-1}(1 - x)(2 - x)^{\alpha-1}. \qquad (2)$$

The Topp-Leone-G distribution family emerges from the combination of (1) with P, where P is a CDF:

$$F(x) = [P(x;\xi)]^{\alpha}\,[2 - P(x;\xi)]^{\alpha}. \qquad (3)$$

It is this baseline CDF that determines the distinguishing features of the family. Here $P(x;\xi) \in [0, 1]$, $\alpha > 0$, and $P(x;\xi)$ is the CDF of a baseline continuous distribution depending on the parameter vector $\xi = (\xi_1, \ldots, \xi_n)$.
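As a minimal numerical sketch of the generator described above, the standard Topp-Leone CDF and PDF and the Topp-Leone-G construction can be coded directly. The function names and the exponential baseline are hypothetical choices made only for illustration; they are not taken from the paper.

```python
import math


def topp_leone_cdf(x, alpha):
    """Topp-Leone CDF on [0, 1]: F(x) = [x (2 - x)]**alpha."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return (x * (2.0 - x)) ** alpha


def topp_leone_pdf(x, alpha):
    """Topp-Leone PDF: f(x) = 2 alpha x**(alpha-1) (1-x) (2-x)**(alpha-1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return 2.0 * alpha * x ** (alpha - 1.0) * (1.0 - x) * (2.0 - x) ** (alpha - 1.0)


def topp_leone_g_cdf(x, alpha, baseline_cdf):
    """Topp-Leone-G CDF: F(x) = [1 - (1 - P(x))**2]**alpha for a baseline CDF P."""
    p = baseline_cdf(x)
    return (1.0 - (1.0 - p) ** 2) ** alpha


def exponential_cdf(x, rate=1.0):
    """Hypothetical baseline CDF, used here only as an example of P."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0
```

Passing the identity map on [0, 1] as the baseline recovers the ordinary Topp-Leone CDF, which gives a quick consistency check of the two forms P^alpha (2 - P)^alpha and [1 - (1 - P)^2]^alpha.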
Journal of Statistical Theory and Applications (2023) 22:339-365

One way to define the Cauchy distribution is by its CDF, which is given by: where a > 0 represents a scale parameter, b a location parameter, and x ∈ ]0, +∞[. By combining $P(x;\xi)$ and the Cauchy distribution's CDF (4), we get: By merging the Topp-Leone distribution family (3) with the Cauchy distribution (5), we create the new family. The CDF of the Topp-Leone distribution family (3) can be restated as:

$$F(x) = \left[1 - \left(1 - P(x;\xi)\right)^{2}\right]^{\alpha}. \qquad (6)$$

By combining expressions (5) and (6), we get the new family's CDF, which is defined by: with parameter vector $(\alpha, a, b, \xi)$ and $\alpha, a, \xi > 0$.

The development of this novel distribution family is driven by several motivations. First, it aims to offer more flexibility than traditional statistical distributions such as the normal or exponential, which have few parameters and are therefore less suitable for modeling highly variable real-world data. The new family introduces more parameters, enhancing its ability to adapt to complex phenomena. Second, it extends beyond the restricted value ranges of many classical distributions, making it applicable to a wider range of fields where data may lack predefined bounds. Additionally, it seeks to model diverse real-world phenomena more accurately, accounting for behaviors such as heavy tails, asymmetry, and extreme values. Finally, its versatility allows for application across various disciplines, from environmental sciences to medicine and engineering. The overarching goal is to improve the quality of statistical model fitting to real data, leading to more precise and reliable results across diverse research and application domains.
This new family of distributions contains several special members with interesting properties. Within this family, we introduce a distinct member whose baseline distribution is represented by the CDF G. This yields the Topp-Leone Cauchy Rayleigh (TLCAR) distribution, which has four parameters. To evaluate its effectiveness, we apply this distribution to two real data sets and compare its performance with that of existing competing models.

A Special Member: The TLCAR Distribution
The TLCA-G family encompasses various distributions, and the process of discovering a new distribution within this family parallels the process of choosing a new baseline distribution. In this work, the TLCAR distribution is established by using the Rayleigh distribution with a positive shape parameter as the baseline distribution. The CDF of the Rayleigh distribution can be expressed as: where θ > 0 and x ∈ [0, +∞[. The associated probability density function (PDF) and hazard rate function (HRF) are given, respectively, by: and The CDF that defines the TLCAR distribution is expressed as follows: where α, a, θ > 0. The associated PDF and HRF of (8) are, respectively: and
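For readers who want to experiment with the baseline, the sketch below codes the Rayleigh CDF, PDF and HRF. It assumes the common scale parametrization G(x) = 1 - exp(-x^2 / (2 theta^2)); the parametrization used in the paper may differ, and the function names are illustrative.

```python
import math


def rayleigh_cdf(x, theta):
    """Assumed form: G(x) = 1 - exp(-x**2 / (2 theta**2)) for x > 0."""
    if x <= 0.0:
        return 0.0
    return -math.expm1(-x * x / (2.0 * theta * theta))


def rayleigh_pdf(x, theta):
    """g(x) = (x / theta**2) exp(-x**2 / (2 theta**2)) for x > 0."""
    if x <= 0.0:
        return 0.0
    return x / theta ** 2 * math.exp(-x * x / (2.0 * theta * theta))


def rayleigh_hrf(x, theta):
    """h(x) = g(x) / (1 - G(x)) = x / theta**2, a linearly increasing hazard."""
    return x / theta ** 2 if x > 0.0 else 0.0
```

Under this parametrization the baseline hazard is linear in x, so any non-monotone hazard shape of the TLCAR model comes from the Topp-Leone and Cauchy layers rather than from the baseline.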
Figure 1 displays the PDF of the TLCAR distribution, while Fig. 2 illustrates its HRF. Notably, Fig. 1 shows that the PDF can be positively skewed or inverted-J shaped. The PDF can exhibit diverse shapes, including increasing, decreasing, inverted, or bathtub-shaped patterns. This observation aligns with previous findings documented in the literature. These curvature characteristics are well known to be useful for developing versatile statistical models.

Several Mathematical Properties of the TLCAR Distribution
This section highlights several important mathematical characteristics of the TLCAR distribution.

Series Expansion of f
Proposition 1 The series expansion of f is given by: where and By differentiating expression (16) of F(x) with respect to x, we obtain the series expansion (11) of f(x). ◻

Rényi Entropy
The Rényi entropy associated with the distribution is given by: where and Proof The Rényi entropy of a continuous random variable X can be defined by: So we have: Therefore, the Rényi entropy is given by: ◻
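The closed-form result can be cross-checked numerically from the usual definition of the Rényi entropy of order delta, I(delta) = (1 / (1 - delta)) ln ∫ f(x)^delta dx, with delta > 0 and delta ≠ 1. The sketch below is a generic quadrature under that definition; the symbol delta, the helper names and the truncation of the integral at a finite upper limit are assumptions made for illustration.

```python
import math


def simpson(func, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def renyi_entropy(pdf, delta, lo, hi, n=20000):
    """Renyi entropy of order delta, with the integral truncated to [lo, hi]."""
    if delta <= 0.0 or delta == 1.0:
        raise ValueError("delta must be positive and different from 1")
    integral = simpson(lambda x: pdf(x) ** delta, lo, hi, n)
    return math.log(integral) / (1.0 - delta)
```

For a unit-rate exponential density and delta = 2 the integral equals 1/2, so the function should return ln 2, which is a convenient sanity check before plugging in a TLCAR density.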

Moments
At this point, we will take a closer look at the moments of our distribution.
Moments are key statistical measures that characterize the shape of a distribution and help us understand its properties and behavior.
Proposition The moment of order s associated with our distribution is given by: where $T_{i,j}$ is defined in (12). Proof The moment of order s of a random variable is determined by: so By using the series expansion (11), we have: where $S_s$ is defined in (23).

Moment Generating Function
Proposition The moment generating function is given by: Proof The moment generating function is determined by: By using the series expansion of the exponential function, we have: So we can write: where $E(X^r)$ represents the moment of order r of the distribution. So, By replacing (26) in (25), we obtain: where

Incomplete Moments
The incomplete moment of the TLCAR distribution can be obtained as: By considering the expression obtained for the moment in (22), we have: where $T_{i,j}$ is defined in (12).
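The moment, moment generating function and incomplete moment results above can be checked against direct numerical integration of their standard definitions: E(X^s) = ∫ x^s f(x) dx, M(t) = Σ_r t^r E(X^r) / r!, and the incomplete moment ∫_0^t x^s f(x) dx. The sketch below assumes a density supported on [0, ∞) that is negligible beyond a user-chosen upper limit; all names are illustrative.

```python
import math


def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def raw_moment(pdf, s, upper, n=4000):
    """E(X**s) for a density on [0, inf), truncated at `upper`."""
    return simpson(lambda x: x ** s * pdf(x), 0.0, upper, n)


def incomplete_moment(pdf, s, t, n=4000):
    """Incomplete moment of order s: integral of x**s f(x) over [0, t]."""
    return simpson(lambda x: x ** s * pdf(x), 0.0, t, n)


def mgf_series(pdf, t, upper, terms=30, n=4000):
    """Truncated series M(t) = sum_r t**r / r! * E(X**r)."""
    return sum(t ** r / math.factorial(r) * raw_moment(pdf, r, upper, n)
               for r in range(terms))
```

The series form mirrors the structure of the proof (expand the exponential, then insert the moments); comparing it with a direct integral of exp(t x) f(x) indicates how many terms are needed for a given t.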

Probability Weighted Moments (PWM)
The PWM is defined by: where and so By using the binomial formula, we have: The moment generating function is determined by: Therefore: where $J_{i,j,s}$ and $I_j$ are defined respectively in (27) and (28). ◻

Quantile Function
In this section, we derive the quantile function.
Proposition The quantile function associated with the distribution is given by:

Proof Let us put
Then, by the definition of the quantile function, $x_y$ satisfies the nonlinear equation:

So
By raising each side of Eq. (31) to the power 1/α, we have:

Let us put
We get: Considering expression (32), we can write: In our case, P is the Rayleigh CDF. So we have: Therefore, the quantile function is given by: ◻
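The inversion steps of the proof (raise to the power 1/alpha, then undo the baseline CDF) can be illustrated on a simplified stand-in: a Topp-Leone-Rayleigh CDF without the Cauchy layer, F(x) = [1 - exp(-x^2 / theta^2)]^alpha, under the assumed Rayleigh form G(x) = 1 - exp(-x^2 / (2 theta^2)). This is not the TLCAR quantile itself. A generic bisection routine is included for CDFs that have no closed-form inverse; all names are illustrative.

```python
import math


def tl_rayleigh_cdf(x, alpha, theta):
    """Stand-in CDF: [1 - (1 - G(x))**2]**alpha = [1 - exp(-(x/theta)**2)]**alpha."""
    if x <= 0.0:
        return 0.0
    return (-math.expm1(-(x / theta) ** 2)) ** alpha


def tl_rayleigh_quantile(u, alpha, theta):
    """Closed-form inverse of the stand-in CDF for 0 < u < 1."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    w = u ** (1.0 / alpha)                       # raise both sides to the power 1/alpha
    return theta * math.sqrt(-math.log1p(-w))    # solve 1 - exp(-(x/theta)**2) = w


def quantile_by_bisection(cdf, u, lo, hi, tol=1e-12):
    """Numerical inverse of a continuous, increasing CDF on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Feeding uniform random numbers through the quantile function gives inverse-transform samples, which is the usual way such a result is used in a simulation study.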

Survival Function
The survival function of the TLCAR distribution is given by:

Hazard Function
The hazard function of the TLCAR distribution is defined as:

Cumulative Hazard Function (Cf)
The cumulative hazard function is defined by the following expression: So, the TLCAR distribution's cumulative hazard function is given by:

Reversed Hazard Function
We use the following expression to determine the reversed hazard function: Therefore, the reversed hazard function of the TLCAR distribution is given by:
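The four reliability functions of the preceding subsections follow from any PDF f and CDF F through the standard relations S = 1 - F, h = f / S, H = -ln S and r = f / F. The sketch below codes these relations generically; the Rayleigh functions are included only as an assumed example input (scale parametrization), and all names are illustrative.

```python
import math


def survival(cdf, x):
    """S(x) = 1 - F(x)."""
    return 1.0 - cdf(x)


def hazard(pdf, cdf, x):
    """h(x) = f(x) / S(x)."""
    return pdf(x) / (1.0 - cdf(x))


def cumulative_hazard(cdf, x):
    """H(x) = -ln S(x)."""
    return -math.log(1.0 - cdf(x))


def reversed_hazard(pdf, cdf, x):
    """r(x) = f(x) / F(x)."""
    return pdf(x) / cdf(x)


def rayleigh_pdf(x, theta=2.0):
    """Example input only: Rayleigh PDF with scale theta."""
    return x / theta ** 2 * math.exp(-x * x / (2.0 * theta ** 2))


def rayleigh_cdf(x, theta=2.0):
    """Example input only: Rayleigh CDF with scale theta."""
    return -math.expm1(-x * x / (2.0 * theta ** 2))
```

Because the functions take the PDF and CDF as arguments, the same code applies unchanged once the TLCAR PDF and CDF are available as Python callables.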

Mean Waiting Time
The mean waiting time is the average time one has to wait for an event or outcome to occur. It is a measure of central tendency that quantifies the typical duration of waiting. It is defined as: By using expression (11) of f(x), we have: So, the mean waiting time of the TLCAR distribution is given by: where

Mean Residual Life
The mean residual life is a concept from survival analysis and the study of lifetime or duration data. It measures the average remaining lifetime of an individual or system given that it has already survived up to a certain point. The mean residual life function (mr) is defined as: By using the series expansion (11) of f(x), we have: So, the mean residual life of the TLCAR distribution is given by: where
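Both quantities can be evaluated numerically from their standard definitions for a lifetime on [0, ∞): the mean waiting time is t - (1 / F(t)) ∫_0^t x f(x) dx, and the mean residual life is (1 / S(t)) ∫_t^∞ x f(x) dx - t. The sketch below assumes these definitions and truncates the upper integral at a finite limit; the helper names are illustrative.

```python
import math


def simpson(func, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def mean_waiting_time(pdf, cdf, t, n=20000):
    """t - (1 / F(t)) * integral of x f(x) over [0, t]."""
    head = simpson(lambda x: x * pdf(x), 0.0, t, n)
    return t - head / cdf(t)


def mean_residual_life(pdf, cdf, t, upper, n=20000):
    """(1 / S(t)) * integral of x f(x) over [t, upper] minus t."""
    tail = simpson(lambda x: x * pdf(x), t, upper, n)
    return tail / (1.0 - cdf(t)) - t
```

A unit-rate exponential lifetime is a useful check because it is memoryless: its mean residual life equals 1 for every t.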

Mean Deviation About Mean
The mean deviation about the mean (also known as the mean absolute deviation or simply the average deviation) is a measure of the average distance between each data point in a dataset and the mean of that dataset. It provides a measure of the dispersion or spread of the data. Suppose X has the TLCAR distribution with mean μ. The mean absolute deviation is expressed as follows: By using (34), we have: So, the mean absolute deviation is given by: where

Mean Deviation About Median
The mean deviation about the median is a measure of the average distance between each data point in a dataset and the median of that dataset. It is similar to the mean deviation about the mean, but the median is used as the central measure instead of the mean. Suppose the TLCAR random variable X has median Me. The mean deviation about the median can be expressed as: By using (35), we have: So, the mean deviation about the median is given by: where
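For a nonnegative random variable with mean mu and median Me, the two deviations reduce to the well-known identities 2 mu F(mu) - 2 ∫_0^mu x f(x) dx (about the mean) and mu - 2 ∫_0^Me x f(x) dx (about the median). The sketch below evaluates these identities numerically; it is a generic check under those identities, with illustrative helper names, not the paper's series expressions.

```python
import math


def simpson(func, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def median_by_bisection(cdf, lo, hi, tol=1e-12):
    """Solve F(m) = 1/2 for a continuous, increasing CDF on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean_deviation_about_mean(pdf, cdf, mu, n=20000):
    """2 mu F(mu) - 2 * integral of x f(x) over [0, mu]."""
    return 2.0 * mu * cdf(mu) - 2.0 * simpson(lambda x: x * pdf(x), 0.0, mu, n)


def mean_deviation_about_median(pdf, mu, median, n=20000):
    """mu - 2 * integral of x f(x) over [0, median]."""
    return mu - 2.0 * simpson(lambda x: x * pdf(x), 0.0, median, n)
```

For a unit-rate exponential variable these return 2/e and ln 2 respectively, which matches the known closed forms and can serve as a regression test.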

Estimation
Let $x_1, x_2, \ldots, x_n$ be a random sample of size n from a variable X. Then, using the PDF in (9), the likelihood function is given by: So we have:
The log-likelihood function is defined as: We obtain: The maximum likelihood estimators $\hat{\alpha}$, $\hat{a}$, $\hat{b}$ and $\hat{\theta}$ satisfy: We have: The log-likelihood function can therefore be rewritten as follows: $\ell(\alpha, a, b, \theta) = n\ln 2 + n\ln\alpha - n\ln\pi - n\ln a - 2n\ln\theta \cdots$ The first partial derivatives of $\ell$ with respect to α, a, b and θ, which are set to zero, are as follows:
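Score equations of this kind generally have no closed-form solution, so the estimates are obtained numerically. The sketch below illustrates the idea on a simplified two-parameter stand-in, a Topp-Leone-Rayleigh model without the Cauchy layer, with PDF f(x) = 2 alpha (x / theta^2) exp(-x^2 / theta^2) [1 - exp(-x^2 / theta^2)]^(alpha - 1). For fixed theta the score equation in alpha can be solved explicitly, and the remaining one-dimensional profile likelihood is maximised by a grid search refined with golden-section search. The model, the search ranges and all names are assumptions for illustration; this is not the paper's four-parameter fit.

```python
import math
import random


def log_t(z):
    """ln(1 - exp(-z)) for z > 0, computed stably for small and large z."""
    return math.log(-math.expm1(-z)) if z < 1.0 else math.log1p(-math.exp(-z))


def log_likelihood(alpha, theta, xs):
    """Log-likelihood of the stand-in Topp-Leone-Rayleigh model."""
    n = len(xs)
    s_log_x = sum(math.log(x) for x in xs)
    s_sq = sum((x / theta) ** 2 for x in xs)
    s_log_t = sum(log_t((x / theta) ** 2) for x in xs)
    return (n * math.log(2.0) + n * math.log(alpha) - 2.0 * n * math.log(theta)
            + s_log_x - s_sq + (alpha - 1.0) * s_log_t)


def golden_max(func, lo, hi, tol=1e-8):
    """Golden-section search for the maximiser of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = func(c), func(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = func(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = func(d)
    return 0.5 * (lo + hi)


def fit_tl_rayleigh(xs):
    """Return (alpha_hat, theta_hat) by profile maximum likelihood."""
    n = len(xs)

    def alpha_hat(theta):
        # Solves d l / d alpha = n / alpha + sum ln T_i = 0 for fixed theta.
        return -n / sum(log_t((x / theta) ** 2) for x in xs)

    def profile(theta):
        return log_likelihood(alpha_hat(theta), theta, xs)

    x_max = max(xs)
    # Log-spaced grid from 0.05 * x_max to 5 * x_max (an arbitrary search range).
    grid = [0.05 * x_max * 100.0 ** (i / 59.0) for i in range(60)]
    best = max(range(60), key=lambda i: profile(grid[i]))
    theta = golden_max(profile, grid[max(best - 1, 0)], grid[min(best + 1, 59)])
    return alpha_hat(theta), theta


def sample_tl_rayleigh(n, alpha, theta, rng):
    """Inverse-transform sampling from F(x) = [1 - exp(-(x/theta)**2)]**alpha."""
    draws = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)
        draws.append(theta * math.sqrt(-math.log1p(-u ** (1.0 / alpha))))
    return draws
```

For the full four-parameter TLCAR model the same pattern applies with a multivariate optimiser in place of the one-dimensional search; the profile trick used here only works because the stand-in model is linear in alpha on the log scale.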

Simulation Study
To assess the consistency of the maximum likelihood estimators (MLEs) of the TLCAR distribution, a simulation study is conducted in this section using the R package 'stats4'. For each sample size n = 80, 100, 200, 300, and 400, we generate 1000 independent samples from the TLCAR distribution. For each of the 1000 replications, we compute the MLEs of the parameters of interest. We then compute two standard metrics, the average bias (Bias) and the root mean square error (RMSE), to evaluate the precision and consistency of the estimated parameters. The results are reported in Table 1. This simulation-based approach shows how the MLEs behave as the sample size increases.
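The structure of such a study (simulate, estimate, then summarise by bias and RMSE for each sample size) can be sketched as follows. The paper's study uses R and the TLCAR model; the sketch below is in Python and uses a one-parameter Rayleigh stand-in, whose MLE has the closed form theta_hat = sqrt(Σ x_i^2 / (2n)), so that the example stays self-contained. The true value theta = 1.5, the seed and all names are arbitrary illustrative choices.

```python
import math
import random


def sample_rayleigh(n, theta, rng):
    """Inverse-transform sampling from the Rayleigh distribution with scale theta."""
    return [theta * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]


def rayleigh_mle(xs):
    """Closed-form MLE of the Rayleigh scale parameter."""
    return math.sqrt(sum(x * x for x in xs) / (2.0 * len(xs)))


def bias_and_rmse(theta, n, replications, rng):
    """Average bias and root mean square error of the MLE over the replications."""
    estimates = [rayleigh_mle(sample_rayleigh(n, theta, rng))
                 for _ in range(replications)]
    bias = sum(estimates) / replications - theta
    rmse = math.sqrt(sum((e - theta) ** 2 for e in estimates) / replications)
    return bias, rmse


def run_study(theta=1.5, sizes=(80, 100, 200, 300, 400), replications=1000, seed=2023):
    """Return {n: (bias, rmse)} for each sample size."""
    rng = random.Random(seed)
    return {n: bias_and_rmse(theta, n, replications, rng) for n in sizes}


if __name__ == "__main__":
    for n, (bias, rmse) in run_study().items():
        print(f"n={n:4d}  bias={bias:+.5f}  rmse={rmse:.5f}")
```

Consistency shows up as a bias close to zero and an RMSE that shrinks roughly like 1 / sqrt(n) as the sample size grows.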

Data Analysis
To test the performance of our distribution in real-life situations, we apply it to two data sets and compare its performance with the following competing models: 1. Topp-Leone Compound Rayleigh (TLCR) [29].
We used Mathematica to estimate the parameters of the density functions and Matlab to plot the fitted densities. The analysis of the results will allow us to determine whether the distribution lives up to its promise in real-life situations.

Dataset I:
The dataset contains failure times (measured in hours) obtained from an accelerated life test involving 59 conductors. The data are given in [32,33]. Table 2 displays the estimated values for dataset I. The information criteria obtained from the different models on dataset I are presented in Table 3. Moreover, empirical PDFs and CDFs for dataset I are shown in Fig. 3.

Dataset II:
The dataset consists of observations on the fracture toughness of silicon nitride, measured in MPa m^{1/2} [16]. The set consists of 119 observations, which are: 5.5, 5, 4.9, 6.4, 5. Table 4 displays the estimated values for dataset II. The information criteria obtained from the different models on dataset II are presented in Table 5. Fig. 4 shows the empirical PDFs and CDFs for dataset II.
From Tables 2, 3, 4 and 5 and Figs. 3 and 4, it can be deduced that the TLCAR model fits datasets I and II better than the rival models. The TLCAR model has the advantage of flexibility, enabling its application to both industrial and artificial intelligence data. Hence, the TLCAR model is the preferable option for modeling these datasets, owing to its superior performance and versatility in accommodating diverse data types.
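Model comparisons of this kind are usually based on criteria computed from the maximised log-likelihood, the number of parameters k and the sample size n. The sketch below codes four common ones (AIC, BIC, the small-sample corrected AIC, and HQIC); whether these are exactly the criteria reported in Tables 3 and 5 is an assumption, and the numbers in the usage lines are hypothetical, not values from the paper.

```python
import math


def information_criteria(log_lik, k, n):
    """Common information criteria; smaller values indicate a better trade-off."""
    aic = 2.0 * k - 2.0 * log_lik
    bic = k * math.log(n) - 2.0 * log_lik
    caic = aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)   # corrected AIC
    hqic = 2.0 * k * math.log(math.log(n)) - 2.0 * log_lik
    return {"AIC": aic, "BIC": bic, "CAIC": caic, "HQIC": hqic}


# Hypothetical example: a 4-parameter and a 3-parameter model fitted to 59 observations.
four_param = information_criteria(log_lik=-100.0, k=4, n=59)
three_param = information_criteria(log_lik=-103.5, k=3, n=59)
preferred = "four_param" if four_param["AIC"] < three_param["AIC"] else "three_param"
```

All four criteria penalise the extra parameter, so a four-parameter model such as TLCAR is preferred only when its gain in log-likelihood outweighs the penalty.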

Conclusion
Our study focused on the creation and use of a distribution to simulate data. We implemented a rigorous methodology and were able to create a reliable distribution based on empirical data and precise statistical calculations. The simulations performed with this distribution showed results very close to the real data, thus demonstrating the relevance and efficiency of our approach. Furthermore, we emphasized the relevance of assessing the suitability of statistical models to the data by comparing simulation outcomes to real-world data and employing visualization tools like histograms and density functions.
Finally, our study has shown that the creation of custom distributions can be an efficient approach to simulate data in different fields such as finance, biology, industry or physics.
We hope that our work will serve as a foundation for future research and contribute to the advancement of knowledge in these areas.

Future Work
Future research will focus on several areas. One of them is the investigation of transformations such as the T-X family, among others, in order to create a new distribution for predicting unknown lifetime occurrences. This new distribution will provide a more sophisticated statistical model for studying and predicting lifetimes, and has the potential to be employed in a wide range of applications. Another area of research will be the creation of a bivariate distribution, which will allow us to evaluate and forecast the interaction of two variables. We will also investigate copulas and other aspects of this novel distribution to better understand its behavior and applicability.
Last but not least, we also intend to use the new distribution with engineering and accelerated data to investigate the reliability function's behavior utilizing calibrated data. This will help to establish the new distribution's potential utility in industry and other sectors by providing insights into how it might be utilized in real-world circumstances. Overall, our future research will concentrate on the development of advanced statistical models and the analysis of their behavior in a variety of applications. We are excited about the possible insights and breakthroughs that these investigations may yield.

Fig. 3 Visualization of Empirical PDFs and CDFs for dataset I

Fig. 4 Visualization of empirical PDFs and CDFs for dataset II

Table 1
Results from Monte Carlo simulations for the TLCAR distribution: Calculations of Mean, RMSE, and Mean Bias

Table 3
The information criteria results for the failure time data (dataset I)

Table 5
Results of information criteria for the fracture toughness data (dataset II)