Handling uncertainty issue in software defect prediction utilizing a hybrid of ANFIS and turbulent flow of water optimization algorithm

Elsabagh, M. A.; Emam, O. E.; Gafar, M. G.; Medhat, T.

doi:10.1007/s00521-023-09315-0


Original Article
Open access
Published: 14 December 2023

Volume 36, pages 4583–4602, (2024)

Neural Computing and Applications


M. A. Elsabagh ORCID: orcid.org/0000-0001-8704-3887¹,
O. E. Emam²,
M. G. Gafar¹ &
…
T. Medhat³


Abstract

During the development cycle of software projects, numerous defects and challenges arise, leading to prolonged project durations and escalated costs. As a result, both product delivery and defect tracking have become increasingly complex, expensive, and time-consuming. Because identifying every software defect is challenging, it is crucial to foresee potential consequences and strive for high-quality products. The goal of software defect prediction (SDP) is to identify problematic locations within software code. This study presents the first experimental investigation utilizing the turbulent flow of water optimization (TFWO) in conjunction with the adaptive neuro-fuzzy inference system (ANFIS) to enhance SDP. The TFWO_ANFIS model is designed to address the uncertainties present in software features and predict defects with feasible accuracy. Data are divided randomly at the beginning of the model into training and testing sets to avoid the local optima and over-fitting issues. The TFWO approach adjusts the ANFIS parameters during the SDP process. The proposed model, TFWO_ANFIS, outperforms standard ANFIS and ANFIS tuned with other optimization algorithms commonly used in SDP, such as particle swarm optimization (PSO), gray wolf optimization (GWO), differential evolution (DE), ant colony optimization (ACO), and genetic algorithm (GA). This superiority is demonstrated through various evaluation metrics for four datasets, including standard deviation (SD) scores (0.3307, 0.2885, 0.3205, and 0.2929), mean square error (MSE) scores (0.1091, 0.0770, 0.1026, and 0.0850), root-mean-square error (RMSE) scores (0.3303, 0.2776, 0.3203, and 0.2926), mean bias error (MBE) scores (0.1281, 0.0860, 0.0931, and 0.2310), and accuracy scores (87.3%, 90.2%, 85.8%, and 89.2%), respectively, for the datasets KC2, PC3, KC1, and PC4. These datasets, with different numbers of instances and features, are obtained from an open platform called OPENML. Additionally, multiple evaluation metrics such as precision, sensitivity, confusion matrices, and specificity are employed to assess the model’s performance.


1 Introduction

Defects are among the most significant problems in software projects, and forecasting them is difficult. The presence of a defect increases the likelihood that the project will fail, and it may reduce project quality while increasing time and cost. Finding these problems early in the software development life cycle (SDLC) therefore reduces both the time and the financial cost of the project as a whole. Defect prediction thus plays an essential role in the development and testing phases and contributes to the success of the entire project, so defects should be anticipated at the beginning of the SDLC. For this reason, a variety of SDP models have been developed to help professionals locate the modules that are identified early on as defective [1, 2]. To meet user goals in a constrained amount of time, software engineering requires excellent quality and stability. Quality assurance teams can use SDP models to allocate their limited resources efficiently when inspecting and testing software products [3, 4].

Initially, software businesses relied on manual testing, which consumed 27% of the project’s time and could not address all software defects. Typically, these businesses lack the resources and time to resolve every issue before product release, resulting in harm to their reputation and product value. SDP models provide a solution, allowing businesses to prioritize critical issues and allocate resources efficiently to the most defect-prone code [5].

Machine learning (ML) is one of the promising methods that are having a big impact on prediction. ML is concerned with the creation of algorithms that can recognize patterns in known data to create models and then use those models to predict outcomes from unknown data. This is especially true when combined with data mining methods [6, 7]. As a result, deep learning (DL) and ML approaches have been widely used in SDP to enhance its performance.

Various methods, including support vector machine (SVM) [8], bagging [9], Naïve Bayes (NB) [10], boosting [11], C4.5 [12], random forest (RF) [13], artificial neural network (ANN) [14], and K-nearest neighbor (KNN) [15], have been used in SDP. Although these individual nonlinear machine learning algorithms outperform conventional models in SDP, they struggle to handle uncertainty accurately and suffer from over-fitting and parameter optimization problems [7]. As a result, composite algorithms have been developed to improve prediction accuracy and address the shortcomings of single models [16,17,18]. Moreover, meta-heuristic algorithms have been used in SDP to enhance prediction accuracy because they can reduce the complexity of real-world problems, search globally, and find the best solution [7]. Every individual in the population represents a potential solution, and these algorithms are more popular than other traditional approaches currently in use because of their intricacy and efficiency [19, 20]. According to the no free lunch (NFL) theorem [21], no single meta-heuristic method can solve all optimization problems. In other words, a specific meta-heuristic algorithm may produce good results in some situations but perform poorly in others.

In the context of SDP, addressing uncertainty is crucial. ANFIS, a form of soft computation, combines ANN capabilities with fuzzy inference processes. ANFIS offers strong adaptation abilities and a rapid, precise learning process [22, 23]. However, a significant challenge in real-world applications is training ANFIS parameters. Researchers prioritize adjusting these parameters for improved precision and accuracy. Various training techniques have emerged, typically categorized as probabilistic and deterministic methods.

Least square estimator (LSE) and gradient descent (GD) [24,25,26] are two examples of deterministic methods; they are slow and occasionally fail to converge. Additionally, conventional ANFIS learning employs the GD algorithm, which computes the gradient through the chain rule at each step and is prone to becoming trapped in one of a large number of local optima. In contrast, this paper employs a novel optimization technique based on TFWO to optimize the ANFIS parameters. The technique is inspired by the random and natural behavior of vortices in oceans, rivers, and seas [27].

The contributions of the study include enhanced, more accurate handling of uncertainty in SDP through the proposed TFWO_ANFIS model. This model leverages the advantages of TFWO for adapting the ANFIS model’s parameters: the ANFIS training process uses the TFWO technique as its parameter adaptation method. The adaptive parameters are located in the fuzzification and defuzzification layers (the premise and consequent parameters). Four datasets were used with various evaluation criteria, such as RMSE, MSE, SD, and accuracy, to assess the effectiveness of the proposed TFWO algorithm for adapting ANFIS parameters. TFWO_ANFIS outperformed standard ANFIS and ANFIS combined with other optimization techniques [28], such as GA, DE [29], ACO, PSO [30], and GWO [31,32,33].

Given the rapid utilization of ML and artificial intelligence (AI)-based software-intensive systems in semi-autonomous automobiles, recommendation systems, and various real-world applications, there are concerns about the outcomes of their use, especially when these systems have the potential to affect the environment or people, as in the case of self-driving cars or the medical field. In such situations, addressing these uncertainties is crucial [34]. The developed model is used to predict defects in software with higher accuracy under uncertainty. The outcomes show that the recommended model TFWO_ANFIS outperformed the alternative optimization techniques in terms of the ANFIS’s training and testing error rates.

This research highlights the presence of uncertainty in software features, which leads to adverse outcomes in SDP, including low product quality, increased defects during the SDLC, and extended delivery time and costs. One way to address this issue is ANFIS, which combines the capabilities of an ANN with a fuzzy inference system. This research proposes an enhanced variant of ANFIS, TFWO_ANFIS, in which the turbulent flow of water optimization (TFWO) algorithm improves ANFIS’s overall optimization performance. The proposed upgrade trains the ANFIS parameters with a novel optimization technique instead of LSE and GD, which are time-consuming, prone to a large number of local optima, and sometimes fail to converge. The TFWO_ANFIS model aims to better manage software metric uncertainty and thereby predict defects with higher accuracy. The motivations for handling uncertainty in software defect prediction and obtaining higher accuracy with the suggested model are improving software performance, meeting customer needs in a short period of time, and helping quality control teams allocate their limited assets effectively during software system evaluation.

The following are the benefits of treating uncertainty in SDP:

1. Models become more dependable when uncertainty is considered during software development. Additionally, appropriate software model validation helps reduce uncertainty in later phases of development.
2. Applying software uncertainty modeling can improve decision-making during the development process.

The major contributions of this research can be summarized as follows:

(1) Utilizing four NASA datasets, named KC2, PC3, KC1, and PC4, with different numbers of instances and features. They are obtained from an open platform called OPENML.
(2) Proposing a novel model for predicting defects in software with higher accuracy in uncertain environments.
(3) Utilizing the TFWO algorithm, rather than traditional algorithms, to optimize ANFIS’s parameters.
(4) Comparing the suggested TFWO_ANFIS with conventional ANFIS, ACO_ANFIS, DE_ANFIS, PSO_ANFIS, GWO_ANFIS, and GA_ANFIS.
(5) Evaluating the suggested TFWO_ANFIS against some recent relevant metrics in SDP such as SD, MSE, RMSE, MBE, and accuracy.
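
The error metrics named above can be illustrated with a minimal sketch. The definitions used here are assumptions, since this part of the paper does not state its formulas: MBE is taken as the mean of (predicted − actual), SD as the population standard deviation of the errors, and accuracy as agreement after thresholding the continuous output at 0.5. The function name `error_metrics` and the toy data are hypothetical.

```python
import math

def error_metrics(actual, predicted, threshold=0.5):
    """Return MSE, RMSE, MBE, SD of the errors, and accuracy for 0/1 labels.

    Assumed definitions (not taken from the paper):
      error = predicted - actual
      MBE   = mean(error)
      SD    = population standard deviation of the errors
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)                      # RMSE is always sqrt(MSE)
    mbe = sum(errors) / n
    sd = math.sqrt(sum((e - mbe) ** 2 for e in errors) / n)
    # A module is predicted defective when the model output reaches the threshold.
    hits = sum(1 for a, p in zip(actual, predicted) if (p >= threshold) == (a == 1))
    return {"mse": mse, "rmse": rmse, "mbe": mbe, "sd": sd, "accuracy": hits / n}

# Toy example: 0 = clean module, 1 = defective module.
actual = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.8, 0.3]
m = error_metrics(actual, predicted)
print(m)
```

Under these definitions MSE = SD² + MBE², so the three values are not independent of one another.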

The rest of this paper is structured as follows: Sect. 2 presents the related works on software defect prediction, the optimization process of ANFIS, and uncertainty analysis. Section 3 shows methods and materials. Section 4 presents the results and discussion. Finally, Sect. 5 presents conclusions and future work.

2 Related works

The related literature is organized into three subsections to cover the essential topics of this research and present the latest findings in each field. This organization aims to provide a comprehensive understanding of the critical aspects of this research. First, software defect prediction is the process of identifying and rectifying flaws. In the development of embedded software, this task is particularly time-consuming and expensive because of the complex infrastructure, large scale, time constraints, and cost considerations. Measuring and achieving quality becomes a significant challenge, especially in automated systems. Second, the optimization process of ANFIS is reviewed. The ANFIS model offers the advantage of integrating linguistic and numerical expertise, and it harnesses the data categorization and pattern recognition capabilities of artificial neural networks (ANN). The ANFIS architecture is also less prone to memorization problems and is clearer to the user than the ANN.

As a result, ANFIS has a number of benefits, such as adaptability, nonlinearity, and quick learning [35, 36]. Third, uncertainty in SDP, especially in software features, is handled in this research by adapting the parameters of the ANFIS architecture. Each of these topics is therefore reviewed in detail in the following subsections.

2.1 Software defect prediction (SDP)

Software testing is a crucial phase in the software development life cycle, as it identifies defects in the system and ensures that the software passes input test cases. Testing is not only time-consuming but also costly. While some automated technologies can help reduce testing effort, their high maintenance costs often contribute to increased expenses. Early software defect prediction greatly decreases effort and budget without violating project constraints. It highlights the modules that are more prone to defects and need more thorough testing. The difficulties of dimensionality reduction and class imbalance in SDP call for a realistic and efficient defect prediction technique. Recently, ML has become a potent method for making decisions in this area [37]. SDP primarily relies on prediction models to anticipate software defects. Although various strategies and algorithms have been employed to enhance performance, the fundamental steps of SDP are illustrated in Fig. 1 [38]:

(1) Accumulate clean and flawed code samples from software systems; (2) extract characteristics (features) to create a dataset; (3) balance the source data if it is imbalanced; (4) train an SDP model on a set of data; (5) forecast the flawed parts of a dataset obtained from new software; and (6) assess the accuracy of the SDP model. The process is iterative.
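
The six steps can be sketched end to end in a few lines. This is an illustrative toy only: the data are synthetic, the single feature (a complexity score) is hypothetical, and the classifier is a one-threshold decision stump standing in for whatever SDP model is actually trained. Step 3 (balancing) is skipped here.

```python
import random

def train_test_split(rows, test_ratio=0.3, seed=0):
    """Randomly divide the rows into training and testing sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def fit_stump(train):
    """Stand-in model: pick the threshold that maximizes training accuracy."""
    best_t, best_acc = None, -1.0
    for t in sorted({x for x, _ in train}):
        acc = sum((x >= t) == bool(y) for x, y in train) / len(train)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def predict(threshold, xs):
    """Flag a module as defective (1) when its feature reaches the threshold."""
    return [int(x >= threshold) for x in xs]

# Steps 1-2: synthetic modules as (complexity, defective?) pairs.
rng = random.Random(42)
rows = []
for _ in range(200):
    complexity = rng.randint(1, 40)
    defective = int(complexity + rng.gauss(0, 5) > 25)  # noisy ground truth
    rows.append((complexity, defective))

train, test = train_test_split(rows)                      # random split
threshold = fit_stump(train)                              # step 4: train
preds = predict(threshold, [x for x, _ in test])          # step 5: forecast
accuracy = sum(p == y for p, (_, y) in zip(preds, test)) / len(test)  # step 6
print(f"threshold={threshold}, test accuracy={accuracy:.2f}")
```

The model is fitted on the training portion only, so the test accuracy estimates performance on unseen modules.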

The process begins with gathering samples of both clean and flawed code, as shown in Fig. 1. Software data are available in numerous formats, including commit messages, source code, defect files, and other software artifacts. Typically, these data are taken from repositories and archives.

The feature extraction (collect characteristics) phase of SDP is the next stage. Software artifacts, source code, messages, and commit logs, among others, are transformed into metrics at this phase and used as input data for training models. The feature extraction stage depends heavily on the type of input data, which can include McCabe metrics [39], Chidamber and Kemerer (CK) metrics [40], modification histories, assembly code, and source code. In addition to metric-based data, a number of DL algorithms now offer automatic feature extraction from more complicated, high-dimensional data. Defect data from well-known open defect repositories, such as the NASA [41] and PROMISE [42] databases, have been used in many studies in the literature.

The next stage is usually optional. Since defect datasets often include far fewer faulty parts than non-faulty ones, this phase entails balancing the data. This class imbalance issue affects the majority of SDP approaches, as it produces misleading results for various metrics used to assess SDP performance [43]. The problem can be resolved, and SDP performance improved, by a number of methods, such as oversampling.
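
As an illustration of the balancing step, the following is a minimal sketch of random oversampling, one simple form of the oversampling mentioned above. The helper name `random_oversample` and the 90/10 toy dataset are hypothetical; minority-class rows are duplicated at random until every class matches the size of the largest one.

```python
import random

def random_oversample(rows, label_of=lambda r: r[-1], seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for r in rows:
        by_class.setdefault(label_of(r), []).append(r)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)  # keep every original row
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    rng.shuffle(balanced)
    return balanced

# Toy dataset: 90 clean modules (label 0) and 10 defective modules (label 1).
rows = [(i, 0) for i in range(90)] + [(i, 1) for i in range(10)]
balanced = random_oversample(rows)
counts = {c: sum(1 for r in balanced if r[-1] == c) for c in (0, 1)}
print(counts)
```

Oversampling is normally applied to the training set only, so that duplicated rows do not leak into the test set and inflate the evaluation.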

The fourth phase in the SDP process involves building a model that finds defective software components. A key consideration at this stage is selecting a suitable technique, whether a DL topology such as a convolutional neural network or another type of ML, supervised or unsupervised. Additionally, it is crucial to determine the granularity of the defective sections to be identified, which may range from file and module levels to function, class, or even phrase levels.

The following phase involves utilizing the trained model from the previous stage to forecast the flawed portions of new (test) data. The final phase of the SDP steps uses the prediction made here as its input.

The final stage of the SDP process involves evaluating the created model. Two commonly used metrics for assessing the SDP model are the area under the curve and F-measure. These metrics are employed when evaluating prediction models and making comparisons with other relevant studies.
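
Both metrics can be computed directly from labels and model scores. The sketch below assumes that "area under the curve" means the area under the ROC curve, computed here through its rank interpretation (the probability that a randomly chosen defective module receives a higher score than a randomly chosen clean one). The function names and toy scores are hypothetical.

```python
def f_measure(actual, predicted):
    """Harmonic mean of precision and recall for the defective class (label 1)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(actual, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly; ties count half."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

actual = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1]      # hypothetical model outputs
labels = [int(s >= 0.5) for s in scores]     # hard predictions at a 0.5 cut-off

f1 = f_measure(actual, labels)
auc = roc_auc(actual, scores)
print(f"F-measure={f1:.3f}, AUC={auc:.3f}")
```

The F-measure depends on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is one reason the two are often reported together.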

Tang et al. [44] applied a swarm intelligence optimization technique to offer the model’s ideal parameters in an effort to enhance SDP. This study suggested an adaptive variable sparrow search algorithm (AVSSA) focused on different logarithmic spirals and variable hyper-parameters. This work conducted AVSSA investigations on eight benchmark functions and received positive results.

Elsabagh et al. [5, 45] suggested an innovative classifier based on the spotted hyena optimizer algorithm (SHO) to anticipate defects in both single and cross-projects. SHO acts as a classifier by identifying the most suitable rules among populations. To find the optimal classification criteria, confidence and support are used as a multi-objective fitness function. These classification criteria are applied to other projects with incomplete data or new projects to forecast faults. Four software datasets from NASA were used for experiments.

Kakkar et al. [46] proposed a novel approach that relies on ANFIS optimized by PSO. For improved performance, the PSOANFIS method integrates the adaptability of the ANFIS model with PSO’s capability for optimization. The presented model was tested on a dataset drawn from open-source Java projects of various sizes. Their PSOANFIS-based SDP model provides software engineers with the number of defects as an output. Engineers can then use this information to allocate their limited resources, such as time and labor, more effectively. The PSOANFIS findings were excellent, and it can also be inferred that the size of the projects may affect how well the PSOANFIS-based SDP model performs.

In response to the class imbalance issue, Somya Goyal [15] proposed the novel neighborhood under-sampling (N-US) approach. This work aims to demonstrate the effectiveness of the N-US approach in accurately predicting defective modules. N-US samples the dataset to enhance the visibility of minority data points while minimizing the removal of majority data points to avoid information loss.

Nasser et al. [47] proposed robust-tuned-KNN (RT-KNN), an ML method for SDP based on the K-nearest neighbors classifier. Their work can be summarized as follows: (1) tuning KNN and determining the value of k, in both the training and testing stages, that produces accurate prediction outcomes; and (2) rescaling the many independent inputs using the robust scaler.
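
The two ingredients described above, robust scaling and tuning k, can be sketched as follows. This is not the RT-KNN implementation of [47]; it is a minimal stand-in that assumes robust scaling means centering on the median and dividing by the interquartile range, and that k is chosen by leave-one-out accuracy. All names and data are hypothetical.

```python
import statistics

def robust_scale(column):
    """Center on the median and divide by the interquartile range (IQR)."""
    q1, _, q3 = statistics.quantiles(column, n=4)
    med = statistics.median(column)
    iqr = (q3 - q1) or 1.0  # guard against a zero IQR
    return [(v - med) / iqr for v in column]

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = sum(train_y[i] for i in order[:k])
    return int(votes * 2 > k)

def tune_k(X, y, candidates=(1, 3, 5)):
    """Return the candidate k with the best leave-one-out accuracy."""
    def loo_accuracy(k):
        hits = 0
        for i in range(len(X)):
            rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
            hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
        return hits / len(X)
    return max(candidates, key=loo_accuracy)

# Toy software metrics; note the outlier (5000 lines of code).
loc = [10, 12, 11, 13, 200, 210, 190, 5000]
complexity = [1, 2, 1, 2, 9, 8, 9, 10]
y = [0, 0, 0, 0, 1, 1, 1, 1]

scaled_loc = robust_scale(loc)
X = list(zip(scaled_loc, robust_scale(complexity)))
best_k = tune_k(X, y)
print(best_k, knn_predict(X, y, X[0], best_k))
```

Because the median and IQR are insensitive to extreme values, the 5000-line outlier does not compress the remaining values the way min-max scaling would.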

Lei Qiao et al. [48] proposed a new approach that uses DL methods to forecast the occurrence of defects. First, they refined an openly accessible dataset by performing data normalization and log transformation. Second, they carried out data modeling to build the input for the DL method. Third, they passed the generated data to a deep neural network-based algorithm created specifically to forecast the number of faults. Table 1 presents a comparative study of SDP, illustrating the contributions of the most common works in the literature and the future possibilities for improving the SDP field.

2.2 Optimization process of ANFIS

ANFIS offers all the advantages of fuzzy systems and neural networks. However, one of the major issues in real-world applications is learning the ANFIS parameters. Numerous prior studies have addressed the problem of ANFIS learning with methods based on various algorithms, including PSO, GWO, and GA.

Hasanipanah et al. [54] proposed a contemporary method for predicting rock fragmentation using the PSO method for parameter optimization in conjunction with ANFIS learning. Their model has shown efficacy when compared to SVM and multiple regression (MR) techniques.

Lin et al. [55] developed a method for learning ANFIS parameters based on PSO. The system applied quantum-behaved PSO (QPSO) to set the parameters of ANFIS: the premise parameters were adjusted using the QPSO algorithm, while the LSE was used to determine the consequent parameters.

Rahnama et al. [56] utilized ANFIS fuzzy c-means, ANFIS subtractive clustering, ANFIS grid partitioning, and radial basis function (RBF) to predict the sodium adsorption ratio of different areas in Iran. Asadollahfardi et al. [57] used the GA to find the optimal combination of water quality monitoring stations. Asadollahfardi et al. [58] applied three models, namely fuzzy regression analysis, ANFIS, and RBF, to predict the efficiency of a reactor in eliminating acid red 14.

For areas where only rain gauge data are available, Aghelpour et al. [59] developed an efficient ANFIS method for agricultural drought detection that uses a minimal number of variables. They applied ANFIS in conjunction with bio-inspired optimization methods, including ANFIS-PSO, ANFIS-GA, and ANFIS-ACO. Among these, GA and ACO proved to be the most effective algorithms for ANFIS optimization.

Considerable research has also examined the use of the GA for adjusting ANFIS parameters. To predict river rainfall, Panda et al. [60] presented and applied the MR and ANFIS methods, both of which were used as learning models to predict the outcome. The GA was then coupled with the MR training technique to obtain the hydrological parameter condition, with the GA providing the optimal control factor value of the goal function. Sarkheyli et al. [61] developed a novel modified GA that uses various population structures to improve the parameters of the fuzzy membership functions and rules of ANFIS.

Raftari et al. [31] calculated the friction strength ratio using a technique that employed two-parameter optimization methods, GA and PSO. Dehghani et al. [62] created a method for forecasting and simulating the short to long-term influence flow rate. To anticipate the quick, short, and long flow rates, ANFIS and GWO were combined. GWO optimized and modified each parameter of ANFIS.

Maroufpoor et al. [63] created a method that combined ANFIS with GWO. The method outperformed the SVM, neural network, and standard ANFIS methods. A strategy for compressive power forecasting of energy, expense, and timeframe was presented by Golafshani et al. [64]. They employed the GWO and ANFIS methodologies to modify the ANN’s initial weights and parameters. Bui et al. [65] proposed a method based on the whale optimization algorithm (WOA) for assessing the 28-day compressive strength of concrete. The WOA is used to optimize the computational parameters of a neural network (NN).

2.3 Uncertainty analysis

In risk evaluation, currently available information is gathered and used to inform judgments about the risk connected to a specific stressor, such as a physical, biological, or chemical factor. Risk assessment decisions are generally not made with complete clarity, which leads to confusion and uncertainty. Risk assessment includes a component called uncertainty analysis, which concentrates on the assessment’s uncertainties. The crucial elements of uncertainty analysis are the qualitative analysis that identifies the uncertainties, the quantitative analysis that examines how the uncertainties affect the decision-making process, and the communication of the uncertainty. How the uncertainty is analyzed depends on the problem [66]. The way a scientist views uncertainty frequently differs by field. A risk manager would frequently treat uncertainty as part of the decision-making process, assessing the costs and errors of actions. Uncertainty is otherwise often perceived as a bothersome element that impairs decisions.

Kläs et al. [34] proposed three effective categories for identifying the primary sources of uncertainty in practice: model fit, data quality, and scope compliance. They emphasize the significance of these categories in the context of AI and ML model development and testing by establishing connections with specific tasks and methods for assessing and addressing these uncertainties.

One of the hardest issues in medical image analysis is accurate automated analysis of medical images, covering both segmentation and classification. DL techniques have recently achieved success in the classification and segmentation of medical images, emerging as state-of-the-art techniques. However, most of these techniques are frequently overconfident and unable to offer uncertainty quantification (UQ) for their results, which can have severe effects. To solve this problem, Bayesian DL (BDL) techniques can be employed to quantify the uncertainty of conventional DL techniques. Abdar et al. [67] used three strategies for identifying uncertainty in the classification of skin cancer images: ensemble Monte Carlo (EMC) dropout, deep ensemble (DE), and Monte Carlo (MC) dropout. They offered a novel hybrid dynamic BDL method that accounts for uncertainty and relies on the three-way decision (TWD) theory to address the ambiguity or uncertainty that remains after using the MC, EMC, and DE approaches.

Walayat et al. [68] introduced a novel predictive model based on fuzzy time series, weighted averages (WA), and induced ordered weighted averages (IOWA).

A recent development in water engineering is the use of fuzzy logic, a soft computing approach in AI. It is an effective mathematical tool for dealing with system uncertainty caused by fuzziness or ambiguity. Bisht et al. [69] applied fuzzy logic modeling and ANFIS as soft computing methodologies. These systems start with some fundamental rules that define the procedure. To predict the elevation of the groundwater table, two methods using fuzzy rules and two methods using ANFIS were created. Of all the generated methods, ANFIS produced the best results based on the performance criteria [69].

Finally, based on the literature review, traditional techniques such as LSE and GD have been employed to modify the parameters of ANFIS [24,25,26] to handle uncertain environments. However, these techniques are often slow and may fail to converge. Furthermore, conventional ANFIS learning systems employ the GD algorithm, whose chain-rule gradient computation can leave training trapped in one of many local optima. Consequently, optimizing ANFIS parameters becomes a significant issue in real-world applications that must handle uncertainty and improve accuracy. Hence, there is a growing demand to learn ANFIS parameters in SDP and choose an appropriate optimization algorithm for their management. In this study, the TFWO algorithm is selected to fine-tune ANFIS parameters due to its stable architecture, enhanced convergence capability, and effectiveness in addressing the control parameter selection issue. TFWO is inspired by the random and natural behavior of vortices in oceans, rivers, and seas.

3 Methods and materials

In this research, methods and materials to handle uncertainty in SDP are organized into three subsections: (1) ANFIS that represents human reasoning to address uncertainty problems. Fuzzy logic is used by ANFIS to turn information connections and fully integrated components of NN inputs into the desired output. (2) The TFWO algorithm is used as an optimization algorithm for modifying the parameters of the ANFIS during the SDP process due to its efficiency and reliability. (3) Adaptation of ANFIS utilizing TFWO: This subsection demonstrates the configuration of ANFIS with TFWO. ANFIS system is trained using the TFWO algorithm to optimize its parameters. This adaptation is illustrated through the flowchart of TFWO in Fig. 5, algorithm 1, and the architecture of TFWO_ANFIS model in Fig. 6.

3.1 ANFIS: adaptive neuro-fuzzy inference system

Jang [70] introduced ANFIS, an AI technique that emulates human thought processes to address imprecision. ANFIS utilizes fuzzy logic to process inputs from integrated neural network components and information links to produce appropriate outputs. This method is a straightforward approach to data learning. ANFIS combines fuzzy logic and ANN, making it capable of handling complex nonlinear problems, imprecise data, and human cognitive uncertainty within a single structure [71]. ANFIS is a widely used universal approximator in which the relationship between the input and output dimensions of the problem is represented as a collection of if–then rules.

The Mamdani fuzzy technique and the Takagi–Sugeno (T–S) fuzzy technique are two popular fuzzy rule-based inference systems [71]. The Mamdani fuzzy technique has several benefits: (1) it is intuitive; (2) it is widely accepted; and (3) it is compatible with human cognition [72,73,74].

The T–S system ensures output surface continuity and performs well with linear techniques [75, 76]. However, it faces challenges in handling multi-parameter synthetic assessment and weighing each input while applying fuzzy rules. On the other hand, the Mamdani system is known for its readability and understandability to a broad audience. In this work, we employ the Mamdani system, which proves beneficial in output expression.

It is necessary to designate a function for each of the following operators to fully describe the behavior of a Mamdani system:

1. An AND operator (often a T-norm) for computing the firing strength of a rule with AND’ed premises.
2. An OR operator (often a T-conorm) for computing the firing strength of a rule with OR’ed premises.
3. An implication operator (often a T-norm) for computing qualified consequent membership functions (MFs) from the given firing strength.
4. An aggregate operator (typically a T-conorm) for combining the qualified consequent MFs into an overall output MF.
5. A defuzzification operator that converts an output MF into a single crisp output value.

The following theorem holds if the AND and implication operations are product, the aggregate operation is sum, and the defuzzification operation is centroid of area (COA) [77]. Implementing such composite inference has the benefit of keeping the processing differentiable, which allows the Mamdani ANFIS to learn (Table 1).
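
To make these operator choices concrete, the sketch below runs one inference step of a two-rule Mamdani system using product for AND and implication, sum for aggregation, and COA for defuzzification. It then compares the numerically integrated centroid with the standard closed form that these choices are known to yield: the crisp output is the average of the consequent MF centroids weighted by firing strength times consequent MF area. The rule base, the Gaussian premise MFs, and the triangular consequent MFs are all hypothetical and are not taken from the paper.

```python
import math

def gauss(x, center, sigma):
    """Gaussian premise MF; (center, sigma) are the premise parameters."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def tri(z, left, peak, right):
    """Triangular consequent MF; (left, peak, right) are the consequent parameters."""
    if z <= left or z >= right:
        return 0.0
    if z <= peak:
        return (z - left) / (peak - left)
    return (right - z) / (right - peak)

# Hypothetical rule base: ((premise MF for x1, premise MF for x2), consequent triangle).
RULES = [
    (((2.0, 3.0), (1.0, 2.0)), (0.0, 0.2, 0.5)),   # low x1 AND low x2   -> low risk
    (((8.0, 3.0), (6.0, 2.0)), (0.4, 0.8, 1.0)),   # high x1 AND high x2 -> high risk
]

def firing_strengths(x1, x2):
    """AND = product of the premise membership degrees."""
    return [gauss(x1, *a1) * gauss(x2, *a2) for (a1, a2), _ in RULES]

def coa_numeric(weights, steps=10_000):
    """Implication = product, aggregation = sum, then centroid over z in [0, 1]."""
    num = den = 0.0
    for k in range(steps):
        z = (k + 0.5) / steps
        mu = sum(w * tri(z, *cons) for w, (_, cons) in zip(weights, RULES))
        num += z * mu
        den += mu
    return num / den

def coa_closed_form(weights):
    """Average of consequent centroids weighted by firing strength times MF area."""
    num = den = 0.0
    for w, (_, (left, peak, right)) in zip(weights, RULES):
        area = (right - left) / 2
        centroid = (left + peak + right) / 3
        num += w * area * centroid
        den += w * area
    return num / den

w = firing_strengths(5.0, 4.0)
numeric, closed = coa_numeric(w), coa_closed_form(w)
print(f"numeric COA={numeric:.5f}, closed form={closed:.5f}")
```

Because the closed form is a smooth function of the premise and consequent parameters, those parameters can be tuned by gradient-based training or by a population-based optimizer.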

Table 1 Comparative study of SDP
