Reliability model of open source software considering fault introduction with generalized inflection S-shaped distribution

Recently, open source software (OSS) reliability has become a topic of considerable interest. Owing to the uncertainty and complexity of OSS development, testing and debugging environments, OSS evolves dynamically. When detected faults are removed from OSS, new faults are likely to be introduced. Moreover, fault introduction behaves differently under different OSS debugging environments: for example, the fault introduction rate may decrease over time, or increase first and then decrease. Considering these complex and dynamic changes in fault introduction, this paper proposes an OSS reliability model in which fault introduction obeys a generalized inflection S-shaped (GISS) distribution. Experimental results indicate that the proposed model has good fitting and predictive performance. The established model can adapt to the dynamic and complicated changes of fault introduction during OSS debugging. Moreover, it can accurately forecast the number of remaining faults in OSS and assist developers in evaluating actual OSS reliability under the complex testing and development environments of OSS.


Introduction
Over the last several decades, OSS has gradually been accepted and widely used. The development mode of OSS is quite different from that of closed source software (CSS). CSS development and testing are hierarchically managed and completed according to a pre-arranged plan and target, so developers, testers and debuggers are relatively stable during development and testing. In contrast, OSS is developed, tested and debugged dynamically by developers, users and volunteers in an open, networked environment. To improve OSS reliability, the industry generally adopts frequent releases. But this raises two problems. First, if the OSS is released too early, it will contain many faults, which seriously affects its use. Second, if the OSS is released too late, users and volunteers will grow impatient and turn to other OSS instead. In either case, the reliability of OSS will be widely questioned.
To address the problem of OSS reliability evaluation, researchers have developed several OSS reliability models. For instance, Tamura and Yamada [1] established a reliability model of OSS using stochastic differential equations. Li et al. [2] observed that the error detection rate of OSS increases first and then decreases, and established a corresponding OSS reliability model. Yang et al. [3] built delayed OSS reliability models considering the relation between fault introduction and fault detection. Aiming at the debugging activities of OSS, Lin and Li [4] established an OSS reliability model using rate-based queueing theory. Huang et al. [5] proposed a reliability model of OSS considering fault detection with a bounded generalized Pareto distribution. Singh et al. [6] built a multi-version reliability model for OSS based on entropy, with an optimal release strategy considering user and volunteer satisfaction levels. Wang and Mi [7] established a reliability model for OSS in which the fault detection rate has a decreasing trend.
Although these OSS reliability models can effectively assess OSS reliability under certain open source conditions, owing to the variability and complexity of OSS development, testing and debugging environments, the established models cannot fully satisfy the needs of actual OSS reliability assessment.
In the development and testing processes of CSS, static testing methods such as peer review, walkthrough and inspection ensure the quality of CSS. When faults are detected, developers (or debuggers) and testers can communicate face to face and discuss the causes of faults and how to remove them. Thus, in the CSS testing environment, faults can be well described and completely removed. In contrast, faults detected during OSS testing are passed to the developers by users and volunteers through the network, and the developers then organize personnel to remove them. In the open environment, users or volunteers often cannot clearly describe fault information to developers, so developers may remove a fault incompletely or introduce new faults. Moreover, in the fault tracking system of OSS, a fault's state may be modified from closed to reopened, indicating that removed faults have not been completely eliminated or that new faults have been introduced. Therefore, reliability modeling of OSS requires a reasonable and effective treatment of the phenomenon of fault introduction during OSS debugging.
Considering the complexity and non-linearity of fault introduction during OSS debugging, the fault introduction rate may decrease over time, or increase first and then decrease, among other patterns. Therefore, assuming that fault introduction obeys a single law of change does not accord with the actual situation of fault introduction during OSS debugging. An OSS reliability model built on such an assumption cannot satisfy actual OSS reliability assessment; at best, its adaptability is poor, and it cannot cope with complex OSS reliability assessment.
Additionally, because OSS differs greatly from CSS in its testing and development processes, the software evolves through the active participation of volunteers and users located in different geographical locations. Moreover, each release of OSS adds features, functions or components compared with the previous version. These changes make fault introduction complex: one source is the introduction of new faults when removing faults in the current version, and the other is newly generated faults caused by newly added features, functions and components. When a newly generated fault is removed, a new fault may also be introduced. These two kinds of introduced faults make the debugging of OSS complex. Considering these characteristics, we can use the GISS distribution to model the complex changes of introduced faults in OSS. Since the GISS distribution can simulate these complex changes well, the proposed model differs from the CSS reliability model in the existing literature [8], in which fault detection, rather than fault introduction, obeys the GISS distribution. Thus, the established model can fully cope with the OSS testing and development environment, and can effectively carry out residual fault prediction and software reliability evaluation for OSS.
In this paper, we develop an OSS reliability model considering that fault introduction obeys a generalized inflection S-shaped (GISS) distribution. Under this assumption, the fault introduction rate can show a variety of complex nonlinear changes, for example, increasing first and then decreasing over time, or decreasing or increasing over time, as well as linear behavior, for instance, remaining constant. An OSS reliability model established this way has strong adaptability and robustness, and can adapt to the fault introduction changes during the development, testing and debugging of various OSS. Therefore, the established model can be used to assess OSS reliability.
The contributions of this paper are as follows: (1) to the best of our knowledge, this is the first work to propose that fault introduction obeys the GISS distribution. The structure of this paper is organized as follows. Related work is introduced in Sect. 2. Section 3 presents the fault introduction rate with the GISS distribution and introduces the proposed model. Section 4 introduces the OSS fault data sets, model comparison criteria and the parameter estimation method of the proposed model, and reports the model performance comparison experiments. Sensitivity analysis of the proposed model parameters is given in Sect. 5. The implication of the study is discussed in Sect. 6. Section 7 discusses threats to the validity of the developed model. The last section concludes the paper.

Related work
In the past decades, the reliability of OSS has become one of the hot issues in the industry. "Release early, release often" [9] has become a way for developers to improve OSS reliability. But it faces two problems: (1) if the software is released too early, it will contain too many faults, and users will lose interest in using it; as a result, the software cannot be widely tested and used. (2) If the software is released too late, users will turn to other open source software; the software is then abandoned without being widely used or tested.
By studying the change patterns of OSS fault data, researchers have proposed that reliability models of CSS can also be applied to evaluate OSS reliability. For example, Zou and Davis [10] proposed that CSS reliability models can evaluate OSS reliability, and that the CSS reliability model based on the Weibull distribution has better fitting and predictive performance than other CSS reliability models. Anbalagan and Vouk [11] proposed that traditional reliability models can evaluate OSS reliability by studying problem reports. Similarly, Rossi et al. [12] studied bug repositories and concluded that the reliability of OSS can be evaluated by traditional CSS reliability models. Chiu et al. [13] proposed a software reliability model that considers learning effects in fault detection. Okamura et al. [14] studied the distribution of failure time and established a software reliability model based on the normal distribution. Additionally, considering the weighted value changes of support vector regression (SVR), Utkin and Coolen [15] proposed a corresponding software reliability model. Wang and Zhang [16] used a deep learning method with an RNN encoder-decoder to predict software reliability. Ke and Huang [17] developed software reliability models based on change points of testing effort.
Because the development and testing environment of OSS is completely different from that of CSS, evaluating OSS reliability with CSS reliability models has been widely questioned. Therefore, researchers have proposed several OSS reliability models. For instance, Tamura and Yamada [18] established an OSS reliability model of the logarithmic Poisson execution time type and used the Analytic Hierarchy Process to estimate the model parameters. Gyimothy et al. [19] used various object-oriented metrics to study fault prediction for OSS. Tamura and Yamada [20] used neural networks to build an OSS reliability model and gave an optimal software release method. Syed-Mohamad and McBride [21] indicated that the testing and development process of each OSS is different, and that a reliability model can be established according to the specific situation. Tamura and Yamada [22] developed an OSS reliability model using deterministic chaos theory.
Singh et al. [23] built a reliability model of OSS considering different kinds of faults in the fault data sets. Singh et al. [24] proposed an OSS reliability model based on change points. Considering the dynamic changes of OSS development and testing, some researchers used stochastic differential equations to build corresponding OSS reliability models [1, 25-27]. Considering the frequent release of OSS, some researchers proposed reliability models for multi-version or multi-release OSS [6, 28-30]. Tamura et al. [31] proposed a reliability model of open source cloud computing based on jump diffusion. Tamura [32] proposed a reliability model of open source cloud computing considering the irregular fluctuation of the fault detection rate. Tandon et al. [33] developed multi-release OSS reliability models based on entropy. By studying change-point behavior, Kapur et al. [34] developed two-dimensional OSS reliability models. Ivanov et al. [35] used the goal question metric (GQM) method to evaluate and forecast software reliability in open source settings considering mobile operating environments. Additionally, Barack and Huang [36] used several software reliability models to evaluate and forecast the reliability of OSS. Wang [37] proposed an OSS reliability model taking into account fault introduction based on the Pareto distribution. Yang et al. [38] built a change-point-based reliability framework for OSS using masked data, and used the expectation maximization algorithm to solve the likelihood function when estimating model parameter values. Considering imperfect debugging and change points, Saraf et al. [39] developed a multi-release framework for fault correction and fault detection in OSS. Sun and Li [40] proposed several software imperfect debugging models taking into account fault introduction and fault levels.
Owing to the complexity and dynamic change of OSS testing and development, no single software reliability model can be used for all OSS reliability assessments. Thus, Ullah and Morisio [41] used an optimal model selection method to determine which software reliability model should be used to estimate the current OSS reliability. Considering the complexity of fault detection during software development and testing, Raghuvanshi et al. [42] established a time-variant software reliability model. In addition, considering the dynamic changes of OSS testing and development processes, Saraf et al. [43] developed multi-release reliability models with imperfect debugging and change points for OSS. To evaluate OSS reliability in various complex situations, we develop an OSS model which can adapt to various open source environments and considers fault introduction.

Fault introduction with a generalized inflection S-shaped distribution
Suppose that fault introduction follows the GISS distribution. The fault introduction rate can then be expressed as

λ(t) = F′(t) / (1 − F(t)),

where λ(t) denotes the fault introduction rate function, F(t) denotes the GISS distribution, d represents a shape parameter, and β and ψ denote a scale parameter and an inflection factor, respectively.
In general, the two main fault types for open source software are mutually independent faults and stochastically dependent faults. When independent faults are removed, the possibility of introducing new faults is relatively small. When interdependent faults are removed, new faults are more likely to be introduced, or the faults cannot be completely removed. In other words, during OSS debugging, removing simple faults is less likely to introduce new faults than removing complex faults. Thus, we can establish the following differential equation:

dF(t)/dt = β [r + (1 − r) F(t)] [1 − F(t)],

where r and (1 − r) denote the fractions of introduced faults of the first and second types, respectively. Solving Eq. (4), we can derive the following distribution of fault introduction:

F(t) = (1 − exp(−βt)) / (1 + ψ exp(−βt)),

where ψ = (1 − r)/r is called the inflection factor.
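The omitted solution step follows the standard inflection S-shaped derivation; a sketch in our notation (scale parameter β, independent-fault fraction r, with an assumed initial condition F(0) = 0):

```latex
% Eq. (4): the two fault types enter with fractions r and (1 - r):
\frac{dF(t)}{dt} = \beta \,\bigl[r + (1 - r)F(t)\bigr]\,\bigl[1 - F(t)\bigr],
\qquad F(0) = 0.
% Separating variables and integrating by partial fractions yields
F(t) = \frac{1 - e^{-\beta t}}{1 + \psi e^{-\beta t}},
\qquad \psi = \frac{1 - r}{r}.
```

A larger inflection factor ψ thus corresponds to a smaller fraction r of independent faults and a more pronounced S-shape.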
In the debugging process of open source software, software complexity is also an important factor affecting fault introduction. Herein, software complexity refers both to the complexity of the software itself and to the complexity of the debugging environment. The former includes the complexity of software scale, algorithms, logic, architecture, parallel storage and parallel computing, etc. The latter refers to the composition of the debugging personnel, the debugging resources available to them, and their debugging environments, etc. Because open source software is mainly completed by developers, users and volunteers, there are no fixed debugging personnel; detected faults are randomly assigned to debuggers by developers. The skills, resources and experience of debuggers affect the introduction of faults when detected faults in open source software are removed. Therefore, we can use a function φ(t), assumed to be nonnegative and integrable on (0, t), to express the influence of software complexity on fault introduction during the debugging of open source software.
To minimize the influence of too many parameters, we use only two parameters in the software complexity function φ(t), because the more parameters a function has, the more complex and difficult parameter estimation becomes. We adopt a simple power-law function, which provides a good trade-off between flexibility and simplicity. Thus, we can extend Eq. (4) as follows:

dF(t)/dt = β d t^(d−1) [r + (1 − r) F(t)] [1 − F(t)].

Solving Eq. (6), we can derive the following distribution of fault introduction considering software complexity:

F(t) = (1 − exp(−β t^d)) / (1 + ψ exp(−β t^d)),
where d is a shape parameter. Although Eq. (7) has a simple structure, it is very flexible when fault introduction obeys the GISS distribution.
Substituting (2) and (3) into (1), we obtain

λ(t) = β d t^(d−1) / (1 + ψ exp(−β t^d)),

where, in Eq. (8), β represents the scale parameter, d is the shape parameter, and ψ denotes the inflection factor. In Eq. (8), as t goes to infinity, λ(t) approaches β d t^(d−1). Thus, when t → ∞: if d < 1, λ(t) → 0; if d = 1, λ(t) → β; and if d > 1, λ(t) tends to infinity. Figure 1a-c shows the complex changes of the fault introduction rate λ(t) as parameters d and ψ vary over time. From Fig. 1a, we can see that λ(t) tends to zero when d < 1 and t → ∞; moreover, when ψ = 100, the fault introduction rate first increases and then decreases over time. In Fig. 1b, when d = 1, λ(t) tends to a constant. In Fig. 1c, when d > 1, λ(t) tends to infinity; moreover, when ψ = 100, the fault introduction rate shows an S-shaped change over time.

Fig. 1 The changes of the fault introduction rate λ(t) over time

Fault introduction that obeys the GISS distribution can thus represent a variety of complex changes and can cope with the complicated behavior of actual fault introduction in the OSS debugging process. Hence, fault introduction with the GISS distribution is consistent with the situation of fault introduction in the actual OSS debugging process, and an OSS reliability model in which fault introduction obeys the GISS distribution can fully satisfy actual OSS reliability evaluation.
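As an illustration, the three limiting cases can be checked numerically. The sketch below assumes the GISS hazard form λ(t) = βd·t^(d−1)/(1 + ψ·exp(−βt^d)) (our notation: β scale, d shape, ψ inflection factor):

```python
import math

def giss_intro_rate(t, beta, d, psi):
    """Fault introduction rate (hazard) of an assumed GISS distribution
    F(t) = (1 - exp(-beta*t**d)) / (1 + psi*exp(-beta*t**d))."""
    e = math.exp(-beta * t ** d)
    return beta * d * t ** (d - 1) / (1 + psi * e)

beta, psi = 0.1, 100.0
# d < 1: the rate decays toward zero for large t.
print(giss_intro_rate(1e8, beta, 0.5, psi))
# d = 1: the rate approaches the constant beta.
print(giss_intro_rate(1e4, beta, 1.0, psi))
# d > 1: the rate grows without bound.
print(giss_intro_rate(1e3, beta, 2.0, psi))
```

The three printed values reproduce the decreasing, constant and unbounded regimes discussed above.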

Proposed model
The assumptions of the proposed model are as follows. From Assumptions 1 and 2, the following differential equation can be derived:

dm(t)/dt = b (a(t) − m(t)),

where m(t) and a(t) denote the mean value function and the fault content function, respectively, and b denotes the fault detection rate. From Assumptions 3, 4 and 5, the fault content function a(t) is obtained from the GISS fault introduction distribution, where λ(t) denotes the fault introduction rate function and a represents the expected number of initially detected faults.
Substituting (4) and (6) into (5), we can derive Eq. (11), which is the expression of the developed model. Please see the Appendix for the detailed derivation.
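Since the closed form of Eq. (11) is relegated to the Appendix, the model's structure can be sketched numerically. The code below is an illustrative instance only: it assumes a fault content function a(t) = a·(1 + α·F(t)), with F(t) an assumed GISS distribution and α a hypothetical introduced-fault fraction, and integrates dm/dt = b·(a(t) − m(t)) by Euler's method:

```python
import math

def giss_cdf(t, beta, d, psi):
    """Assumed GISS distribution F(t) = (1 - e^(-beta*t^d)) / (1 + psi*e^(-beta*t^d))."""
    e = math.exp(-beta * t ** d)
    return (1 - e) / (1 + psi * e)

def mean_value(T, a, b, alpha, beta, d, psi, dt=0.01):
    """Euler integration of dm/dt = b*(a*(1 + alpha*F(t)) - m(t)), m(0) = 0."""
    m, t = 0.0, 0.0
    while t < T:
        content = a * (1 + alpha * giss_cdf(t, beta, d, psi))  # fault content a(t)
        m += dt * b * (content - m)
        t += dt
    return m

m_T = mean_value(T=200, a=100, b=0.2, alpha=0.05, beta=0.1, d=1.0, psi=2.0)
print(m_T)  # approaches the total fault content a*(1 + alpha) = 105 for large T
```

The mean value function rises toward the total fault content, which exceeds the initial content a precisely because of the introduced faults.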

Illustrations of fault data sets, comparison models and model comparison criteria
In this paper, we collected three fault data sets from three Apache OSS projects (https://issues.apache.org). Each OSS fault data set includes three successive fault data subsets; see Table 1 for details. Note that detected faults of OSS are stored in the bug tracking system, where the resolution of a fault may be FIXED, INVALID, WONTFIX, DUPLICATE, etc. We remove faults whose resolutions are INVALID, WONTFIX or DUPLICATE, and collect the remaining faults into our OSS fault data sets.
To fully verify the performance of the developed model, we compared the proposed model with other models using six model comparison criteria [44]. In the following, m(t_i) represents the predicted cumulative number of detected faults at time t_i, m_i denotes the number of actually observed faults, and n is the sample size:

- MSE = (1/n) Σ (m(t_i) − m_i)^2. The smaller the MSE value, the better the model performs.
- R^2 = 1 − Σ (m(t_i) − m_i)^2 / Σ (m_i − m̄)^2, where m̄ is the mean of the observed values. The larger the R^2 value, the better the fitting power of the model.
- RMSE = √MSE. The smaller the RMSE value, the better the model performs.
- TS = 100 · √(Σ (m(t_i) − m_i)^2 / Σ m_i^2) %. The smaller the TS value, the better the model performs.

The comparison models are as follows:

- G-O model [45]: a and b represent the expected number of initially detected faults and the fault detection rate, respectively.
- Weibull distribution model [46]: a, b and c represent the expected number of initially detected faults, the fault detection rate and the shape parameter, respectively.
- Generalized inflection S-shaped (GISS) model [8]: a, b and c represent the expected number of initially detected faults, the fault detection rate and the shape parameter, respectively; ψ denotes the inflection factor.
- Li model [2]: a denotes the expected number of initially detected faults, α denotes the shape parameter, and N and β represent the scale parameters.
- Wang model [7]: a and b represent the expected number of initially detected faults and the fault detection rate, respectively; d and ψ denote the scale parameter and the inflection factor, respectively.
- Proposed model: a, b, β, ψ and d represent the expected number of initially detected faults, the fault detection rate, the fault introduction rate, the inflection factor and the shape parameter, respectively.

The least squares estimation (LSE) method minimizes S = Σ (m(t_i) − m_i)^2 over the model parameters; Eq. (12) sets the partial derivatives of S with respect to each parameter to zero. The maximum likelihood estimation (MLE) method maximizes the NHPP likelihood, where N(t_i) and n_i denote the counting process and the number of actually detected faults, respectively.
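The four fitting criteria above can be computed directly from predicted and observed cumulative fault counts; a small sketch (function and variable names are ours):

```python
import math

def fit_criteria(predicted, observed):
    """Return (MSE, R^2, RMSE, TS%) for predicted vs. observed cumulative faults."""
    n = len(observed)
    sq_err = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    mean_obs = sum(observed) / n
    mse = sq_err / n
    r2 = 1 - sq_err / sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(mse)
    ts = 100 * math.sqrt(sq_err / sum(o ** 2 for o in observed))  # Theil's statistic
    return mse, r2, rmse, ts

mse, r2, rmse, ts = fit_criteria([10.5, 19.0, 31.2], [10, 20, 30])
```

Lower MSE, RMSE and TS and higher R^2 indicate a better fit, exactly as described above.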
The model parameter values can be estimated by taking partial derivatives on both sides of Eq. (14). By solving Eqs. (9) and (11), the estimated parameter values (a*, b*, β*, ψ*, d*) of the developed model can be obtained using LSE and MLE, respectively.
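As a minimal LSE illustration, the sketch below fits the two-parameter G-O mean value function m(t) = a(1 − e^(−bt)) to synthetic, noise-free data by brute-force grid search (the data and the grids are toy choices of ours; in practice a numerical optimizer solves the normal equations of Eq. (12)):

```python
import math

def go_mean(t, a, b):
    """G-O model mean value function m(t) = a*(1 - exp(-b*t))."""
    return a * (1 - math.exp(-b * t))

# Synthetic noise-free cumulative fault data generated with a=100, b=0.1.
times = list(range(1, 31))
observed = [go_mean(t, 100, 0.1) for t in times]

def sse(a, b):
    """Sum of squared errors minimized by LSE."""
    return sum((go_mean(t, a, b) - o) ** 2 for t, o in zip(times, observed))

# Brute-force search over a small parameter grid.
grid = ((a, round(0.05 + 0.005 * k, 3)) for a in range(80, 121) for k in range(21))
best_a, best_b = min(grid, key=lambda ab: sse(*ab))
print(best_a, best_b)  # recovers the true parameters 100 and 0.1
```

With noise-free data the true parameter pair attains zero squared error and is recovered exactly.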

Model performance comparison for model parameters estimation using LSE
In terms of goodness of fit, Tables 4, 5, 6, 7, 8 and 9 show that the developed model fits best among all models. In Table 4, the MSE of the developed model is nearly 2.3 times smaller than that of the G-O model using 100% of the data for DS1-1. Table 5 shows that the RMSE of the Weibull distribution model is nearly 1.74 times larger than that of the developed model using 100% of the data for DS1-2. Table 6 indicates that the MSE of the developed model is about 1.44 times smaller than that of the Li model using 100% of the data for DS1-3. On the whole, the MSE, RMSE, TS and Bias values of the developed model are lower than those of the other models using 100% of DS1 and DS2, respectively, and its R^2 values are larger. Thus, the established model has better fitting power than the other models, as can also be seen clearly from Fig. 2a-f.
In terms of prediction, Tables 4, 5, 6, 7, 8 and 9 show that the predictive power of the established model is the best among all models. Table 4 shows that the predictive MSE of the developed model is nearly 6.69 times less than that of the GISS model using 90% of the data for DS1-1. From Table 5, the RMSE of the proposed model is approximately 1.29 times less than that of the Li model using 90% of the data for DS1-2. Table 6 shows that the TS of the GISS model is about 2.3 times larger than that of the developed model using 90% of the data for DS1-3. Table 7 shows that the RMSE of the G-O model is about 1.24 times as large as that of the developed model using 90% of the data for DS2-1. Table 8 shows that the predictive MSE of the Weibull distribution model is approximately 3.5 times larger than that of the developed model using 90% of the data for DS2-2. From Table 9, the TS of the developed model is about three times less than that of the GISS model using 95% of the data for DS2-3.
In Tables 4, 5, 6, 7, 8 and 9, the predictive MSE, RMSE, TS and Bias values of the developed model are less than those of the other models using 90% of DS1, DS2-1 and DS2-2, respectively; the same holds using 95% of DS2-3. Therefore, the developed model has better predictive power than the other models. From Fig. 3a-f, we can see clearly that the predictive power of the developed model is the best among the models used in this paper. Note that 95% of the fault data is used for DS2-3 mainly because randomly selecting the truncation point compares model power more fairly than selecting it subjectively.
Overall, the fitting and predictive performance of the Weibull distribution model and the GISS model is better than that of the other models except for the developed model. This also confirms that the CSS reliability model based on the Weibull distribution can be used to assess the reliability of OSS [10]. However, the OSS reliability models, such as the Wang model and the Li model, do not perform well. The proposed model is better suited to capturing the complexity of OSS, especially the different forms of fault introduction in different open source development environments. Because fault introduction in the developed model can take many forms, the proposed model matches the changes of OSS fault introduction well and has better adaptive ability than the other models used in this paper. Therefore, the developed model has good adaptability and robustness, and can assist developers in assessing actual OSS reliability during development and testing.

Model performance comparison for model parameters estimation using MLE
To compare AIC values, we use the third fault data set and estimate the model parameter values with the MLE method. From Table 10, the MSE and AIC values of the developed model are less than those of the other models, and its R^2 values are larger (all above 0.9). This shows that the fitting power of the developed model is better than that of the other models. Except for DS3-2, the MSE, R^2 and AIC values of the other models are very close. In DS3-2, the AIC value of the G-O model is close to that of the proposed model, and the R^2 value of the Weibull distribution model is close to that of the developed model. From Table 11, we can see that the MSE and AIC values of the proposed model are less than those of the other models. In DS3-1, the AIC value of the G-O model is close to that of the developed model. In DS3-2, the MSE value of the Li model is close to that of the developed model, and the AIC value of the Weibull distribution model is close to that of the developed model. In DS3-3, the MSE values of the other models are very close to one another and larger than that of the developed model.
In summary, the above comparison shows that the fitting and prediction performance of the developed model is better than that of the other models, whose fitting and prediction power is not stable. This shows that the developed model has good adaptability and flexibility in the complex environment of OSS testing and development, while the other models do not work well there. Finally, it should be noted that when using MLE to estimate the parameters of the Wang model, no maximum of the likelihood function exists, so that model is not compared in Tables 10 and 11.
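For reference, AIC = 2k − 2·ln L, where k is the number of model parameters and ln L the maximized log-likelihood; for grouped fault-count data the standard NHPP log-likelihood can be sketched as follows (the data values are toy numbers of ours):

```python
import math

def nhpp_log_likelihood(mean_values, counts):
    """Grouped-data NHPP log-likelihood: sum over intervals of
    dn*ln(dm) - dm - ln(dn!), with dm, dn the per-interval increments."""
    ll, prev_m, prev_n = 0.0, 0.0, 0
    for m, n in zip(mean_values, counts):
        dm, dn = m - prev_m, n - prev_n
        ll += dn * math.log(dm) - dm - math.lgamma(dn + 1)
        prev_m, prev_n = m, n
    return ll

def aic(num_params, log_lik):
    """Akaike information criterion: 2k - 2*lnL (smaller is better)."""
    return 2 * num_params - 2 * log_lik

m_fit = [9.5, 18.1, 25.9, 33.0]  # toy fitted mean values m(t_i)
n_obs = [10, 18, 26, 33]         # toy cumulative observed fault counts n_i
ll = nhpp_log_likelihood(m_fit, n_obs)
# At equal log-likelihood, the model with fewer parameters wins on AIC.
print(aic(2, ll) < aic(3, ll))
```

This makes explicit why AIC penalizes the extra parameters of more flexible models when their likelihoods are close.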

Sensitivity analysis
Sensitivity analysis is carried out by varying one parameter while fixing the other parameters of the proposed model. Figure 4 shows how the developed model responds to these parameter changes. The expected number of initially detected faults (a), the fault detection rate (b), the fault introduction rate (β) and the shape parameter (d) of the proposed model have a major impact.
There are mainly the following reasons: (1) the number of faults in OSS has an important influence on OSS reliability; therefore, estimating the total number of faults in OSS is the basis of establishing an OSS reliability model. In addition, parameter ψ, also called the inflection factor, is not an important parameter. Owing to the diversity and complexity of OSS testing and development environments, the cumulative number curve of introduced faults can show a variety of complex changes, not necessarily an S-shaped curve. From this point, it can be seen that the fault introduction behavior of OSS differs from that of CSS.
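A one-parameter-at-a-time sensitivity check can also be sketched numerically. The code below uses an illustrative mean value function of our own construction (Euler integration of dm/dt = b·(a·(1 + α·F(t)) − m), with F an assumed GISS distribution) and bumps one parameter by 10% at a time:

```python
import math

def giss_cdf(t, beta, d, psi):
    """Assumed GISS distribution F(t)."""
    e = math.exp(-beta * t ** d)
    return (1 - e) / (1 + psi * e)

def mean_value(T, a, b, alpha, beta, d, psi, dt=0.01):
    """Euler integration of dm/dt = b*(a*(1 + alpha*F(t)) - m(t)), m(0) = 0."""
    m, t = 0.0, 0.0
    while t < T:
        m += dt * b * (a * (1 + alpha * giss_cdf(t, beta, d, psi)) - m)
        t += dt
    return m

base = dict(a=100.0, b=0.2, alpha=0.05, beta=0.1, d=1.0, psi=2.0)
m0 = mean_value(50, **base)

def rel_change(name):
    """Relative change in m(T) when one parameter is increased by 10%."""
    p = dict(base)
    p[name] *= 1.1
    return abs(mean_value(50, **p) - m0) / m0

# a scales m(t) linearly, so a 10% bump in a moves m(T) by exactly 10%;
# the inflection factor psi has only a mild effect by comparison.
print(rel_change('a'), rel_change('psi'))
```

In this toy setting the output mirrors the analysis above: m(T) is highly sensitive to a, and only weakly sensitive to ψ.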

Implication of the study
This study has the following implications.

Threats to validity
The threats to the validity of the developed model involve two kinds of factors, external and internal. External factors: to evaluate the performance of an OSS reliability model, more kinds of fault data sets should ideally be selected. However, we selected three OSS fault data sets to fully evaluate the power of the software reliability models, which meets the basic experimental requirements. Internal factors: we used the Taylor formula to simplify the equation and give an approximate solution, and, to simplify the calculation, we assumed that the fault detection rate is constant. Owing to the complexity of OSS reliability modeling, it is beneficial to simplify the calculation properly; OSS reliability modeling is also a trade-off between complex modeling and practical, simplified use.

Conclusions
In this paper, we developed an OSS reliability model considering that fault introduction obeys the GISS distribution. Because fault introduction obeys the GISS distribution, the fault introduction rate can represent complex nonlinear changes, such as decreasing over time, or increasing first and then decreasing. Therefore, the developed model can accommodate a variety of different OSS reliability evaluations. To verify the adaptability and robustness of the developed model, we used three OSS fault data sets, six model comparison criteria and five comparison models for extensive experiments. The experimental results show that the developed model has the best fitting and prediction performance among all models. We also carried out a sensitivity analysis of the model parameters; the results show that the expected number of initially detected faults (a), the fault detection rate (b), the fault introduction rate (β) and the shape parameter (d) have an important impact. The developed model can be used to assess the reliability of OSS and can assist developers in evaluating OSS reliability.
Due to the complex and dynamic changes of OSS testing, debugging and development, in future work we will consider various random changes of fault introduction to develop corresponding OSS reliability models.