1 Introduction

Software defects remain one of the most significant problems in practice, and forecasting them is a difficult task. The presence of a defect increases the likelihood that a project will fail and may reduce project quality while increasing time and cost. Consequently, finding these problems early in the software development life cycle (SDLC) reduces both the time and financial costs of the project as a whole. Defect prediction therefore plays an essential role in the development and testing phases and contributes to the success of the entire project. Defects should be anticipated as early as possible in the SDLC. For this reason, a variety of software defect prediction (SDP) models have been developed to help practitioners locate the modules most likely to be defective [1, 2]. To meet user goals in a constrained amount of time, software engineering requires high quality and stability. Quality assurance teams can efficiently allocate their limited resources using SDP models to inspect and test software products [3, 4].

Initially, software businesses relied on manual testing, which consumed 27% of the project’s time and could not address all software defects. Typically, these businesses lack the resources and time to resolve every issue before product release, resulting in harm to their reputation and product value. SDP models provide a solution, allowing businesses to prioritize critical issues and allocate resources efficiently to the most defect-prone code [5].

Machine learning (ML) is one of the most promising approaches for prediction tasks. ML is concerned with creating algorithms that recognize patterns in known data to build models and then use those models to predict outcomes for unseen data, especially when combined with data mining methods [6, 7]. As a result, deep learning (DL) and ML approaches have been widely used in SDP to enhance its performance.

Various methods, including support vector machine (SVM) [8], bagging [9], Naïve Bayes (NB) [10], boosting [11], C4.5 [12], random forest (RF) [13], artificial neural network (ANN) [14], and K-nearest neighbor (KNN) [15], have been used in SDP. Although these individual nonlinear machine learning algorithms outperform conventional models in SDP, they have difficulty handling uncertainty accurately and suffer from over-fitting and parameter optimization issues [7]. As a result, composite algorithms have been developed to improve prediction accuracy and address the shortcomings of single models [16,17,18]. Moreover, meta-heuristic algorithms have been used in SDP to enhance prediction accuracy due to their ability to reduce the complexity of real-life problems, find good solutions, and search globally [7]. Every instance in the population offers a potential solution, and these algorithms are more popular than other traditional approaches currently in use because of their flexibility and efficiency [19, 20]. According to the no free lunch (NFL) theorem [21], no single meta-heuristic method can solve all optimization problems; a specific meta-heuristic algorithm may produce good results in some situations but perform poorly in others.
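As one concrete illustration of the individual learners listed above, a minimal KNN defect classifier can be sketched in a few lines. This is only a toy: the module metrics (lines of code, cyclomatic complexity) and labels below are hypothetical, and a real study would use the NASA datasets and proper train/test splits.

```python
import math

def knn_predict(train, labels, query, k=3):
    """Classify a module as defective (1) or clean (0) by majority vote
    among its k nearest neighbours in metric space (Euclidean distance)."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train, labels))
    votes = [y for _, y in dists[:k]]
    return 1 if sum(votes) > k / 2 else 0

# Toy feature vectors (lines of code, cyclomatic complexity) -- illustrative only.
X = [(120, 4), (90, 3), (400, 21), (350, 18), (60, 2), (500, 25)]
y = [0, 0, 1, 1, 0, 1]

print(knn_predict(X, y, (380, 20)))  # a large, complex module -> 1
```

Large, complex modules fall near the defective examples and are flagged; small, simple ones are not, which is exactly the intuition behind metric-based SDP learners.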

In the context of SDP, addressing uncertainty is crucial. ANFIS, a form of soft computation, combines ANN capabilities with fuzzy inference processes. ANFIS offers strong adaptation abilities and a rapid, precise learning process [22, 23]. However, a significant challenge in real-world applications is training ANFIS parameters. Researchers prioritize adjusting these parameters for improved precision and accuracy. Various training techniques have emerged, typically categorized as probabilistic and deterministic methods.

The least squares estimator (LSE) and gradient descent (GD) [24,25,26] are two examples of deterministic methods; both are slow and occasionally fail to converge. Moreover, conventional ANFIS learning systems employ the GD algorithm, whose chain-rule-based gradient computation at each step makes them prone to becoming trapped in local optima. In contrast, this paper utilizes a novel optimization technique based on TFWO, inspired by the random and natural behavior of vortices in oceans, rivers, and seas [27].

In this paper, we employ this TFWO-based technique to optimize the ANFIS parameters, exploiting the random and natural behavior of vortices to avoid the local optima that hamper GD-based training.

The main contribution of this study is improved handling of uncertainty, with greater accuracy, in SDP through the proposed TFWO_ANFIS model. This model leverages TFWO to adapt the ANFIS parameters: during training, TFWO tunes the adaptive parameters located in the fuzzification and defuzzification layers (the premise and consequent parameters, respectively). Four datasets and several evaluation criteria, including RMSE, MSE, SD, and accuracy, were used to assess the effectiveness of the proposed TFWO algorithm for adapting ANFIS parameters. TFWO_ANFIS outperformed standard ANFIS as well as ANFIS combined with other optimization techniques [28], such as GA, DE [29], ACO, PSO [30], and GWO [31,32,33].

Given the rapid utilization of ML and artificial intelligence (AI)-based software-intensive systems in semi-autonomous automobiles, recommendation systems, and various real-world applications, there are concerns about the outcomes of their use, especially when these systems have the potential to affect the environment or people, as in the case of self-driving cars or the medical field. In such situations, addressing these uncertainties is crucial [34]. The developed model is used to predict defects in software with higher accuracy under uncertainty. The outcomes show that the recommended model TFWO_ANFIS outperformed the alternative optimization techniques in terms of the ANFIS’s training and testing error rates.

This research highlights the presence of uncertainty in software features, which leads to adverse outcomes in SDP, including low product quality, increased defects during the SDLC, and extended delivery time and costs. To address this issue, we combine the capabilities of an ANN with a fuzzy inference system, known as ANFIS. The research proposes an enhanced variant of ANFIS trained with the turbulent flow of water optimization algorithm (TFWO), which increases ANFIS's overall optimization performance. The proposed enhancement trains the ANFIS parameters with a novel optimization technique instead of LSE and GD, which are time-consuming, prone to local optima, and sometimes fail to converge. The TFWO_ANFIS model aims to better manage software metric uncertainty and predict defects with higher accuracy. Improving software performance, meeting customer needs in a short period of time, and helping quality control teams allocate their limited resources effectively during software system evaluation are the motivations for handling uncertainty in SDP and pursuing higher accuracy in the suggested model.

The following are the benefits of treating uncertainty in SDP:

  1. Models become more dependable when uncertainty is considered during software development. Additionally, appropriate software model validation helps reduce uncertainty in later phases of development.

  2. Applying software uncertainty modeling can improve decision-making during the development process.

The major contribution of this research can be summarized as follows:

  1. Four datasets from NASA, named KC2, PC3, KC1, and PC4, with different numbers of instances and features, are utilized; they are obtained from the open platform OPENML.

  2. A novel model is proposed for predicting defects in software with higher accuracy in uncertain environments.

  3. The TFWO algorithm is utilized to adapt ANFIS's parameters instead of traditional training algorithms.

  4. The suggested TFWO_ANFIS is compared with conventional ANFIS, ACO_ANFIS, DE_ANFIS, PSO_ANFIS, GWO_ANFIS, and GA_ANFIS.

  5. The suggested TFWO_ANFIS is evaluated using relevant metrics in SDP, such as SD, MSE, RMSE, MBE, and accuracy.

The rest of this paper is structured as follows: Sect. 2 presents the related works on software defect prediction, the optimization process of ANFIS, and uncertainty analysis. Section 3 shows methods and materials. Section 4 presents the results and discussion. Finally, Sect. 5 presents conclusions and future work.

2 Related works

The related literature is organized into three subsections to precisely cover the essential topics of this research and present the latest findings in each field. First, software defect prediction, the process of identifying and rectifying flaws: in embedded software development, this task is particularly time-consuming and expensive due to the complex infrastructure, large scale, time constraints, and cost considerations, and measuring and achieving quality becomes a significant challenge, especially in automated systems. Second, the optimization process of ANFIS: the ANFIS model offers the advantage of integrating linguistic and numerical expertise and harnesses the data categorization and pattern recognition capabilities of artificial neural networks (ANNs). The ANFIS architecture is less prone to memorization problems and is clearer to the user than an ANN. This organization aims to provide a comprehensive understanding of the critical aspects of this research.

As a result, ANFIS has a number of benefits, such as adaptability, nonlinearity, and fast learning [35, 36]. Third, uncertainty analysis in SDP: uncertainty, especially in software features, is handled in this research by adapting the parameters of the ANFIS architecture. It is therefore important to review the related work of each of these subsections in detail.

2.1 Software defect prediction (SDP)

Software testing is a crucial phase in the software development life cycle, as it identifies defects in the system and ensures that the software passes the input test cases. Testing is not only time-consuming but also costly. While some automated technologies can help reduce testing effort, their high maintenance costs often increase overall expenses. Early software defect prediction greatly decreases effort and budget without compromising constraints, and it highlights the modules that are more prone to defects and need more thorough testing. The difficulties of dimensionality reduction and class imbalance in SDP demand a realistic and efficient defect prediction technique. Recently, ML has become a potent method for decision-making in this area [37]. SDP primarily relies on prediction models to anticipate software defects. Although various strategies and algorithms have been employed to enhance performance, the fundamental processes of SDP are illustrated in Fig. 1 [38]:

Fig. 1

SDP process

(1) Accumulate clean and flawed code sample data from software systems; (2) collect characteristics to create a dataset; (3) adjust the source data if it is unstable; (4) train an SDP model on a set of data; (5) forecast the flawed parts for a dataset obtained from new software; and (6) assess the accuracy of the SDP model. This process involves iterations.
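The six steps above can be sketched end-to-end on toy data. All names and values here are illustrative assumptions: a 1-nearest-neighbour rule stands in for a real learner, and step 3 (balancing) is skipped because the toy classes are already even.

```python
import math

# Toy metric vectors (LOC, cyclomatic complexity) and defect labels -- illustrative only.
def collect_data():                     # steps 1-2: gather samples, extract features
    X = [(120, 4), (90, 3), (400, 21), (350, 18), (60, 2), (500, 25)]
    y = [0, 0, 1, 1, 0, 1]
    return X, y

def train_1nn(X, y):                    # step 4: 1-NN stands in for a real learner
    return lambda q: min(zip(X, y), key=lambda p: math.dist(p[0], q))[1]

X, y = collect_data()                   # step 3 (balancing) skipped: classes already even
model = train_1nn(X, y)
print(model((420, 20)))                 # step 5: forecast a new module -> 1 (defective)
acc = sum(model(x) == t for x, t in zip(X, y)) / len(y)   # step 6: assess the model
print(acc)
```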

The process begins with gathering samples of both clean and flawed codes, as shown in Fig. 1. There are numerous formats in which software data are available, including commit messages, source codes, defect files, and other software artifacts. Typically, these data are taken from repositories and archives.

The feature extraction (collect characteristics) phase of SDP is the next stage. Software artifacts, source codes, messages, and commit logs, among others, are transformed into metrics at this phase and used as input data for training models. The feature extraction stage depends heavily on the type of input data, which can include McCabe metrics [39], Chidamber and Kemerer (CK) metrics [40], modification histories, assembly code, and source code. In addition to metric-based data, a number of DL algorithms today offer automatic feature extraction from more complicated, high-dimensional data. Defect data from well-known open defect repositories, such as the NASA [41] and PROMISE [42] databases, have been used in many studies in the literature.

The next stage is usually optional. Since defect datasets often include far fewer faulty parts than non-faulty ones, this phase entails balancing the data. This class imbalance issue affects the majority of SDP approaches, as it distorts various metrics used to assess SDP performance [43]. The problem can be resolved, and SDP performance improved, by a number of methods, such as oversampling.
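The simplest form of the oversampling mentioned above is random duplication of minority-class samples until the classes are even. A minimal sketch on hypothetical data:

```python
import random

def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples until both classes are the same size."""
    rng = random.Random(seed)
    pos = [(x, t) for x, t in zip(X, y) if t == 1]
    neg = [(x, t) for x, t in zip(X, y) if t == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    minority = minority + [rng.choice(minority)
                           for _ in range(len(majority) - len(minority))]
    data = majority + minority
    return [x for x, _ in data], [t for _, t in data]

# 5 clean modules vs. 1 defective one:
X = [(1,), (2,), (3,), (4,), (5,), (40,)]
y = [0, 0, 0, 0, 0, 1]
Xb, yb = random_oversample(X, y)
print(sum(yb), len(yb) - sum(yb))  # -> 5 5
```

More sophisticated variants (e.g. SMOTE-style synthetic samples) interpolate new minority points rather than duplicating existing ones.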

The fourth phase in the SDP process involves finding defective software components. A key consideration at this stage is identifying suitable DL techniques, which can encompass various topologies such as convolutional neural networks, and ML types, whether supervised or not. Additionally, it is crucial to determine the granularity of the defective sections to be identified, which may range from file and module levels down to function, class, or even statement levels.

The following phase involves utilizing the trained model from the previous stage to forecast the flawed portions of new (test) data. The final phase of the SDP steps uses the prediction made here as its input.

The final stage of the SDP process involves evaluating the created model. Two commonly used metrics for assessing the SDP model are the area under the curve and F-measure. These metrics are employed when evaluating prediction models and making comparisons with other relevant studies.
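The two metrics named above can be computed from scratch; the sketch below uses the rank (Mann-Whitney) formulation of AUC and the usual F1 definition, on illustrative values only.

```python
def f_measure(y_true, y_pred):
    """F1 = harmonic mean of precision and recall on the defective class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def auc(y_true, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    defective module outscores a randomly chosen clean one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(f_measure([1, 1, 0, 0], [1, 0, 0, 0]))   # precision 1.0, recall 0.5
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```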

Tang et al. [44] applied a swarm intelligence optimization technique to provide the model's ideal parameters in an effort to enhance SDP. The study suggested an adaptive variable sparrow search algorithm (AVSSA) based on different logarithmic spirals and variable hyper-parameters, evaluated AVSSA on eight benchmark functions, and obtained positive results.

Elsabagh et al. [5, 45] suggested an innovative classifier based on the spotted hyena optimizer algorithm (SHO) to anticipate defects in both single and cross-projects. SHO acts as a classifier by identifying the most suitable rules among populations. To find the optimal classification criteria, confidence and support are used as a multi-objective fitness function. These classification criteria are applied to other projects with incomplete data or new projects to forecast faults. Four software datasets from NASA were used for experiments.

Kakkar et al. [46] proposed a novel approach based on ANFIS optimized by PSO. The PSOANFIS method integrates the adaptability of the ANFIS model with PSO's optimization capability for improved performance. A dataset from open-source Java projects of various sizes is used to test the presented model. The PSOANFIS-based SDP model provides software engineers with the expected number of defects as output, which engineers can then use to allocate limited resources, such as time and labor, more effectively. The PSOANFIS findings were excellent, and the results also suggest that project size may affect how well the PSOANFIS-based SDP model performs.

In response to the class imbalance issue, Somya Goyal [15] proposed the novel neighborhood under-sampling (N-US) approach. This work aims to demonstrate the effectiveness of the N-US approach in accurately predicting damaged modules. N-US samples the dataset to enhance the visibility of minority data points while minimizing the removal of majority data points to avoid information loss.

Nasser et al. [47] offered robust-tuned-KNN (RT-KNN), an ML method for SDP based on the K-nearest neighbors classifier. Their work is summarized as follows: (1) tuning KNN and determining the ideal value of k in both the training and testing stages to produce accurate prediction outcomes, and (2) rescaling the independent inputs using a robust scaler.

Lei Qiao et al. [48] proposed a new strategy that uses DL methods to forecast the occurrence of defects. First, they refine a publicly accessible dataset by performing data normalization and log transformation. Next, they model the data to build the input for the DL method. Third, they feed the generated data to a deep neural network-based algorithm specifically created to forecast the number of faults. The following table presents a comparative study of SDP, summarizing the contributions of the most common literature and the future possibilities for improving the SDP field.

2.2 Optimization process of ANFIS

ANFIS offers all the advantages of fuzzy systems and neural networks. However, when used for real-world applications, one of the major issues is learning ANFIS parameters. The problem of ANFIS learning has been addressed in numerous prior research using methods based on various algorithms, including the PSO, GWO, and GA.

Hasanipanah et al. [54] proposed a contemporary method for predicting rock fragmentation using the PSO method for parameter optimization in conjunction with ANFIS learning. Their model has shown efficacy when compared to SVM and multiple regression (MR) techniques.

Lin et al. [55] developed a method for learning ANFIS parameters based on the PSO. The system concentrated on applying quantum-behaved PSO (QPSO) to set the parameters of ANFIS: the premise parameters were adjusted using the QPSO algorithm, while the LSE was used to determine the consequent parameters.

Rahnama et al. [56] utilized ANFIS fuzzy c-means, ANFIS subtractive clustering, ANFIS grid partitioning, and radial basis function (RBF) to anticipate the sodium adsorption rate of different areas in Iran. Also, Asadollahfardi et al. [57] used the GA algorithm to detect the optimal combination for optimizing the tracking stations of water quality. Asadollahfardi et al. [58] applied three models: fuzzy regression analysis, ANFIS, and RBF to predict the reactor efficiency of eliminating acid red 14.

In areas with rain gauge data only, Aghelpour et al. [59] developed an efficient ANFIS method for agricultural drought detection, utilizing a minimal number of variables. They applied ANFIS in conjunction with bio-inspired optimization methods, including ANFIS-PSO, ANFIS-GA, and ANFIS-ACO. Among these, GA and ACO proved to be the most effective algorithms for ANFIS optimization.

On the other hand, considerable research has described how the GA can be used to adjust ANFIS parameters. For predicting rainfall over a river basin, Panda et al. [60] presented and applied the MR and ANFIS methods, both used as learning models to predict the outcome. The GA is then coupled with the MR training technique to obtain the hydrological parameter condition, and the optimal control factor value of the objective function is obtained via the GA. A novel modified GA was developed by Sarkheyli et al. [61] using various population structures to improve the parameters of the fuzzy membership functions and rules of ANFIS.

Raftari et al. [31] calculated the friction strength ratio using a technique that employed two parameter-optimization methods, GA and PSO. Dehghani et al. [62] created a method for forecasting and simulating short- to long-term inflow rates; ANFIS and GWO were combined to predict rapid, short-, and long-term flow rates, with GWO optimizing and modifying each ANFIS parameter.

Maroufpoor et al. [63] created a method that combines ANFIS with the GWO; it outperformed the SVM, neural network, and standard ANFIS methods. Golafshani et al. [64] presented a strategy for forecasting compressive strength together with energy, cost, and timeframe, employing the GWO and ANFIS methodologies to tune the ANN's initial weights and parameters. Bui et al. [65] proposed a whale optimization algorithm (WOA)-based method for assessing the 28-day compressive strength of concrete, in which the WOA optimizes the computational parameters of a neural network (NN).

2.3 Uncertainty analysis

In risk evaluation, currently available information is gathered and used to inform judgments about the risk connected to a specific stressor, such as a physical, biological, or chemical factor. Risk assessment decisions are generally not made with complete clarity, which leads to confusion and uncertainty. Risk assessment therefore includes uncertainty analysis, which concentrates on the assessment's uncertainties. Its crucial elements are the qualitative analysis that identifies the uncertainties, the quantitative analysis that examines how the uncertainties affect the decision-making process, and the communication of the uncertainty. How the uncertainty is analyzed depends on the problem [66]. How a scientist views uncertainty frequently differs by field: a risk manager often treats uncertainty as part of the decision-making process, weighing the costs and errors of actions, and perceives it as a bothersome element that impairs decisions.

Kläs et al. [34] proposed three effective categories for identifying the primary sources of uncertainty in practice: model fit, data quality, and scope compliance. They emphasize the significance of these categories in the context of AI and ML model development and testing by establishing connections with specific tasks and methods for assessing and addressing these uncertainties.

One of the hardest issues in medical image analysis is accurate automated processing of medical images, covering both segmentation and classification. DL techniques have recently achieved success in these tasks, emerging as state-of-the-art approaches. However, most of these techniques are frequently overconfident and unable to offer uncertainty quantification (UQ) for their results, which can have severe effects. To solve this problem, Bayesian DL (BDL) techniques can be employed to quantify the uncertainty of conventional DL techniques. Abdar et al. [67] use three strategies to address uncertainty in the classification of skin cancer images: ensemble Monte Carlo (EMC) dropout, deep ensemble (DE), and Monte Carlo (MC) dropout. They offer a novel hybrid dynamic BDL method that accounts for uncertainty and relies on three-way decision (TWD) theory to address the ambiguity that remains after applying the MC, EMC, and DE approaches.

Walayat et al. [68] introduced a novel predictive model based on fuzzy time series, weighted averages (WA), and induced ordered weighted averages (IOWA).

A recent development in water engineering is fuzzy logic, a soft computing approach of AI. It is a fantastic mathematical tool for dealing with system uncertainty brought on by fuzziness or ambiguity. Bisht et al. [69] applied fuzzy logic modeling and ANFIS as soft computing methodologies. These systems start with some fundamental guidelines that define the procedure. To predict the elevation of the ground water table, two methods using fuzzy rules and two methods using ANFIS have been created. Out of all the generated methods, ANFIS produced the best results based on performance criteria [69].

Finally, based on the literature review, traditional techniques such as LSE and GD have been employed to adjust the parameters of ANFIS [24,25,26] in uncertain environments. However, these techniques are often slow and may fail to converge. Furthermore, the chain rule used in conventional GD-based ANFIS learning can lead to many local optima. Consequently, optimizing ANFIS parameters becomes a significant issue in real-world applications for handling uncertainty and improving accuracy. Hence, there is a growing demand to learn ANFIS parameters in SDP and to choose an appropriate optimization algorithm for managing them. In this study, the TFWO algorithm is selected to fine-tune ANFIS parameters due to its stable architecture, strong convergence capability, and effectiveness in addressing the control parameter selection issue. TFWO is inspired by the random and natural behavior of vortices in oceans, rivers, and seas.

3 Methods and materials

In this research, the methods and materials used to handle uncertainty in SDP are organized into three subsections: (1) ANFIS, which represents human reasoning to address uncertainty problems; fuzzy logic is used by ANFIS to turn information connections and fully integrated components of NN inputs into the desired output. (2) The TFWO algorithm, used to modify the parameters of the ANFIS during the SDP process due to its efficiency and reliability. (3) Adaptation of ANFIS utilizing TFWO: this subsection demonstrates the configuration of ANFIS with TFWO. The ANFIS is trained using the TFWO algorithm to optimize its parameters, as illustrated by the TFWO flowchart in Fig. 5, Algorithm 1, and the architecture of the TFWO_ANFIS model in Fig. 6.

3.1 ANFIS: adaptive neuro-fuzzy inference system

Jang [70] introduced ANFIS, an AI technique that emulates human reasoning to address imprecision. ANFIS utilizes fuzzy logic to process inputs from integrated neural network components and information links to produce appropriate outputs, offering a straightforward approach to learning from data. By combining fuzzy logic and ANN, it can handle complex nonlinear problems, imprecise data, and human cognitive uncertainty within a single structure [71]. ANFIS is a widely used universal approximator in which the relationship between the input and output dimensions of a problem is represented as a collection of if–then rules.

The Mamdani fuzzy technique and the Takagi–Sugeno (T–S) fuzzy technique are two popular fuzzy rule-based inference systems [71]. The Mamdani technique has several benefits: it is intuitive, widely accepted, and well suited to human cognition [72,73,74].

The T–S system ensures output surface continuity and performs well with linear techniques [75, 76]. However, it faces challenges in handling multi-parameter synthetic assessment and weighing each input while applying fuzzy rules. On the other hand, the Mamdani system is known for its readability and understandability to a broad audience. In this work, we employ the Mamdani system, which proves beneficial in output expression.

It is necessary to designate a function for each of the following operators to fully describe the behavior of a Mamdani system:

  1. 1.

    An AND operator (often a T-norm) for computing the firing strength of a rule with AND'ed premises.

  2. 2.

    An OR operator (often a T-conorm) for estimating the firing strength of a rule with OR'ed premises.

  3. 3.

    An implication operator (often a T-norm) for computing qualified consequent membership functions (MFs) based on the given firing strength.

  4. 4.

    An aggregation operator (typically a T-conorm) for combining qualified consequent MFs to produce an overall output MF.

  5. 5.

    A defuzzification operator that extracts a single crisp output value from the overall output MF.

The following theorem is derived if the AND and implication operations are the product, the aggregation operation is the sum, and the defuzzification operation is the centroid of area (COA) [77]. The benefit of implementing such composite inference is that the Mamdani ANFIS remains differentiable during processing and can therefore learn (Table 1).

Table 1 Comparative study of SDP

The following theorem [78] is provided by the sum-product composition; see Eqs. 1 and 2. When centroid defuzzification is used, the final crisp result equals the weighted average of the centroids of the consequent MFs, where:

$$ \psi \left( {r_{i} } \right) = \omega \left( {r_{i} } \right) \times a $$

where \(\psi \left({r}_{i}\right)\) is the factor weight of rule \({r}_{i}\); \(\omega \left({r}_{i}\right)\) is the firing strength of rule \({r}_{i}\); and \(a\) is the area of the MFs in the consequent part of rule \({r}_{i}\).

$$ Z_{{{\text{COA}}}} = \frac{{\int_{Z} {\mu_{{C^{\prime}}} \left( z \right)\,z\,{\text{d}}z} }}{{\int_{Z} {\mu_{{C^{\prime}}} \left( z \right)\,{\text{d}}z} }} = \frac{{\omega_{1} a_{1} z_{1} + \omega_{2} a_{2} z_{2} }}{{\omega_{1} a_{1} + \omega_{2} a_{2} }} = \omega ^{\prime}_{1} a_{1} \cdot z_{1} + \omega ^{\prime}_{2} a_{2} \cdot z_{2} $$

where \({a}_{i}\) is the area and \({z}_{i}\) is the center of the consequent MF \({\mu }_{{C}_{i}}(z)\). Using Eqs. 1 and 2, we obtain the corresponding Mamdani ANFIS shown in Fig. 2.
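A quick numeric check of the centroid-of-area result for two rules: under sum-product composition, the crisp output is the average of the consequent centroids \(z_i\) weighted by firing strength times area, \(\omega_i a_i\). The values below are illustrative.

```python
def z_coa(w, a, z):
    """Crisp COA output for rules with firing strengths w, consequent
    MF areas a, and consequent MF centroids z (sum-product composition)."""
    num = sum(wi * ai * zi for wi, ai, zi in zip(w, a, z))
    den = sum(wi * ai for wi, ai in zip(w, a))
    return num / den

w = [0.8, 0.2]   # rule firing strengths
a = [1.0, 2.0]   # areas of the consequent MFs
z = [3.0, 7.0]   # centroids of the consequent MFs
print(z_coa(w, a, z))  # (0.8*1*3 + 0.2*2*7) / (0.8*1 + 0.2*2) = 5.2/1.2 ≈ 4.333
```

Note how the second rule's larger area partially offsets its smaller firing strength, pulling the output toward \(z_2\).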

Fig. 2

ANFIS model of Mamdani

Rule (1): If \(x\) is \({A}_{1}\) and \(y\) is \({B}_{1}\) then \({f}_{1}={\omega {\prime}}_{1}{a}_{1}\cdot {z}_{1}\)

Rule (2): If \(x\) is \({A}_{2}\) and \(y\) is \({B}_{2}\) then \({f}_{2}={\omega {\prime}}_{2}{a}_{2}\cdot {z}_{2}\)

where \({A}_{1}\) and \({A}_{2}\) are sets of fuzzy for input \(x\); \({B}_{1}\) and \({B}_{2}\) are sets of fuzzy for input \(y\).

The outcome of each layer in the five-layer Mamdani ANFIS design is as follows [64, 71, 79,80,81].

Layer (1) Create the membership degrees \({\mu }_{A},{\mu }_{B}\)

$${O}_{1,i}={\mu }_{{A}_{i}}\left(x\right),\quad i=\mathrm{1,2}$$
$$ O_{1,i} = \mu_{B_{i-2}}\left( y \right),\quad i=3,4 $$

The MF is the generalized Gaussian function which is described by two parameters (d,\(\sigma \)):

$${\mu }_{{A}_{i}}\left(x\right)= {e}^{-\frac{1}{2}{(\frac{x-d}{\sigma })}^{2}}$$

Although the Gaussian MF is governed by center d and width \(\sigma \), these are usually referred to as the premise parameters.

Layer (2)

$${O}_{2,i}={\omega }_{i}={\mu }_{{A}_{i}}\left(x\right)\times {\mu }_{{B}_{i}}\left(y\right),\quad i=\mathrm{1,2}$$

The product approach generates the firing strength \({\omega }_{i}\).

Layer (3)

$${O}_{3,i}={\omega {\prime}}_{i}=\frac{{\omega }_{i}}{{\omega }_{1}+{\omega }_{2}},\quad i=\mathrm{1,2}$$

Layer (4)

$$ O_{4,i} = f_{i} = \omega ^{\prime}_{i} a_{i} \cdot z_{i} ,\quad i = 1,2 $$

where the consequent parameters, \({ a}_{i}\) and \( {z}_{i}\), are, respectively, the area and center of the resulting MFs.

Layer (5)

$$ O_{5,i} = \sum f_{i} = \sum \omega ^{\prime}_{i} a_{i} \cdot z_{i} ,\quad i = 1,2 $$
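The five layers above can be traced numerically. The sketch below implements the forward pass of the two-rule Mamdani ANFIS with Gaussian premise MFs; the parameter values are illustrative, not fitted.

```python
import math

def gaussian_mf(x, d, sigma):
    """Layer 1: generalized Gaussian membership with center d and width sigma."""
    return math.exp(-0.5 * ((x - d) / sigma) ** 2)

def mamdani_anfis_forward(x, y, premise, consequent):
    """Forward pass through the five layers of the two-rule network.
    `premise` holds (d, sigma) for A1, A2, B1, B2; `consequent` holds (a_i, z_i)."""
    (dA1, sA1), (dA2, sA2), (dB1, sB1), (dB2, sB2) = premise
    muA = [gaussian_mf(x, dA1, sA1), gaussian_mf(x, dA2, sA2)]    # layer 1
    muB = [gaussian_mf(y, dB1, sB1), gaussian_mf(y, dB2, sB2)]
    w = [muA[0] * muB[0], muA[1] * muB[1]]                        # layer 2: product
    wn = [wi / sum(w) for wi in w]                                # layer 3: normalize
    f = [wni * ai * zi for wni, (ai, zi) in zip(wn, consequent)]  # layer 4
    return sum(f)                                                 # layer 5: sum

premise = [(0.0, 1.0), (2.0, 1.0), (0.0, 1.0), (2.0, 1.0)]  # illustrative values
consequent = [(1.0, 1.0), (1.0, 5.0)]                        # areas a_i and centers z_i
print(mamdani_anfis_forward(0.0, 0.0, premise, consequent))  # near rule 1 -> close to 1
```

At (0, 0) the first rule fires almost fully, so the output stays close to its consequent center \(z_1 = 1\).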

As shown in Fig. 3, a general M-ANFIS system can be generated.

Fig. 3

General M-ANFIS system

Rule (1): If \(x\) is \({A}_{1}\) and \(y\) is \({B}_{1}\) then \(Z={C}_{1}\)

Rule (2): If \(x\) is \({A}_{2}\) and \(y\) is \({B}_{2}\) then \(Z={C}_{2}\)

The outcome of each layer in the five layers of general M-ANFIS design is as follows.

Layer (1) Layer of fuzzification

$${O}_{1,i}={\mu }_{{A}_{i}}\left(x\right),\quad i=1, 2$$
$${O}_{1,i}={\mu }_{{B}_{i-2}}\left(y\right),\quad i=3, 4$$

The MF is the generalized Gaussian function which is described by two parameters (d,\(\sigma \)):

$${\mu }_{{A}_{i}}\left(x\right)= {e}^{-\frac{1}{2}{(\frac{x-d}{\sigma })}^{2}}$$

Layer (2) Layer of rules

$${O}_{2,i}={\omega }_{i}={\mu }_{{A}_{i}}\left(x\right)\times {\mu }_{{B}_{i}}\left(y\right),\quad i=\mathrm{1,2}$$

The product approach generates the firing strength \({\omega }_{i}\).

Layer (3)

$${O}_{3,i}={\omega }_{i}\circ {C}_{i},\quad i=1, 2$$

Product is the implication operator.

Layer (4) Layer of aggregation

$${O}_{4}=\sum {\omega }_{i}\circ {C}_{i},\quad i=1, 2$$

Sum is the aggregation operator, and \({C}_{i}\) denotes the consequential parameters.

Layer (5) Layer of defuzzification

$${O}_{5}=f=D\circ {O}_{4}$$

The center-of-area (COA) defuzzification approach yields a crisp (sharp) output.
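For layer 5, the COA method can be sketched as a discrete centroid computation over a sampled output fuzzy set (an illustrative approximation, assuming a uniformly sampled universe):

```python
import numpy as np

def coa_defuzzify(z, mu):
    """Center-of-area defuzzification on a uniformly sampled universe z
    with aggregated membership values mu: crisp output = sum(z*mu)/sum(mu)."""
    return float(np.sum(z * mu) / np.sum(mu))

z = np.linspace(0.0, 10.0, 1001)
mu = np.exp(-0.5 * ((z - 4.0) / 1.0) ** 2)  # symmetric aggregated set centered at 4
print(round(coa_defuzzify(z, mu), 2))  # 4.0
```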

The ANFIS training process employs both forward and backward passes to update its parameters. ANFIS tunes its parameters to reduce the error between predicted and target outcomes by using a hybrid estimator that combines gradient descent (GD) and least squares estimation (LSE), as shown in Table 2.

In the forward pass of the learning method, node outputs progressed from layers 1 to 4, and the consequential parameters were chosen and updated using the LSE. In the backward pass, GD updated the premise parameters as error signals propagated backward from the output to the input. The NN learned and trained to select parameter values that best fit the training data.
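A compact single-input sketch of this hybrid scheme follows; the helper names, toy target, and step sizes are my own illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def hybrid_step(X, t, premise, lr=0.01):
    """One hybrid training step: LSE for the consequent parameters in the
    forward pass, numerical GD on the premise parameters in the backward pass.

    X: (N, 1) inputs; t: (N,) targets; premise: (R, 2) array of Gaussian
    (center, width) pairs, one per rule (single-input system for brevity).
    """
    def norm_weights(p):
        w = np.exp(-0.5 * ((X - p[:, 0]) / p[:, 1]) ** 2)  # (N, R) firing strengths
        return w / w.sum(axis=1, keepdims=True)            # normalized (layer 3)

    # Forward pass: with the premise fixed, the output y = sum_i w'_i * c_i is
    # linear in the consequents c, so c has a closed-form least-squares solution.
    W = norm_weights(premise)
    c, *_ = np.linalg.lstsq(W, t, rcond=None)

    # Backward pass: gradient descent on the premise parameters, with the
    # gradient approximated by forward differences.
    def mse(p):
        return float(np.mean((norm_weights(p) @ c - t) ** 2))

    base = mse(premise)
    grad = np.zeros_like(premise)
    eps = 1e-6
    for idx in np.ndindex(*premise.shape):
        p = premise.copy()
        p[idx] += eps
        grad[idx] = (mse(p) - base) / eps
    return premise - lr * grad, c, base

# Toy run: fit t = x^2 with two rules; the training MSE should not increase.
X = np.linspace(-1.0, 1.0, 50)[:, None]
t = X.ravel() ** 2
p = np.array([[-1.0, 0.5], [1.0, 0.5]])
errs = []
for _ in range(5):
    p, c, e = hybrid_step(X, t, p)
    errs.append(e)
```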

3.2 TFWO: turbulent flow of water optimization

This paper utilizes the turbulent flow of water optimization (TFWO), a novel and effective optimization technique inspired by the random, natural behavior of vortices in oceans, rivers, and seas. TFWO is selected because its stable structure strengthens convergence and removes the burden of tuning control parameters, and it is able to locate global solutions in various dimensions [27]. In addition, TFWO has been applied to two real-world engineering optimization challenges: reliability–redundancy allocation for the overspeed protection mechanism of a gas turbine and several kinds of nonlinear economic load dispatch problems in energy systems. The outcomes demonstrate the TFWO algorithm's superiority and reliability in comparison with other meta-heuristic optimization techniques.

3.2.1 The whirlpool concept: an introduction to turbulent water flow

A whirlpool forms when water moves turbulently in a narrow, circular path, typically around a submerged obstacle like a rock. The gravitational force influences this circular motion, causing the water to follow a downward-spiraling pattern. As the water spirals, it accelerates, creating a small hole at its center, which further increases the flow speed. The formation of a whirlpool occurs as water is drawn into this central hole, causing a spinning motion [27].

3.2.2 TFWO algorithm

Seas, rivers, and oceans all have whirlpools as a random act of nature. The middle of a whirlpool functions as a sucking hole, pulling the surrounding particles and objects toward its core and interior, i.e., applying centripetal force on them. In reality, a whirlpool is a body of moving water that is mostly caused by ocean tides. Whirlpools can emerge where there are a few little ridges next to one another on a streamlet's surface. These ridges bump into the rushing water, which then circles back around itself. This causes the water to progressively amalgamate around this circuit and form a funnel as it passes in a restricted path around the ridges. Centrifugal force is what causes the water to flow in this way. Sometimes nearby whirlpools interact with one another, in addition to affecting the particles and objects in their immediate surroundings, as shown in the next subsections [27].

The impacts of whirlpools on their sets of objects and on other whirlpools

The initial population \({X}^{o}\) of the technique, consisting of \({N}_{p}\) members, is distributed equally among \({N}_{wh}\) whirlpool sets. The strongest object of every whirlpool set (the member with the best objective value \(f\)) is taken as the whirlpool that pulls the remaining \({N}_{p}-{N}_{wh}\) objects \((X)\).

Every whirlpool \((wh)\) functions as a sucking hole or well and, by exerting centripetal force on the objects in its set \((X)\), tends to pull their locations toward the well's central position. Accordingly, the \(j\)th whirlpool tends to make the position of the \(i\)th object \(({X}_{i})\) equal to its own position, i.e., \({X}_{i}={wh}_{j}\). However, the other whirlpools induce certain deviations \((\Delta {X}_{i})\), depending on their distances and objective function values \((f)\). The updated position of the \(i\)th object is then \({X}_{i}^{new}={wh}_{j}-{\Delta X}_{i}\). The objects \((X)\) rotate around the center of their whirlpool at their own angle \((\theta )\), which is updated at every iteration of the algorithm, as shown in Fig. 4.

Fig. 4 Whirlpool optimization

$${\theta }_{i}^{new}={\theta }_{i}+{rand}_{1}*{rand}_{2}*\pi $$

To compute \(\Delta {X}_{i}\), the farthest and closest whirlpools, i.e., the whirlpools with the highest and lowest weighted distances from the object, are determined as in Eq. (16); \(\Delta {X}_{i}\) is then computed as in Eq. (17), and the object's position is updated using Eq. (18).

$${\Delta }_{t}={f(wh}_{t})*{{abs(wh}_{t}-sum({X}_{i}))}^{0.5}$$
$${\Delta {\text{X}}}_{i}=({\text{cos}}({\theta }_{i}^{new})*rand\left(1,D\right)*\left({wh}_{f}-{X}_{i}\right)-{\text{sin}}({\theta }_{i}^{new})*rand(1,D)*\left({wh}_{w}-{X}_{i}\right))*(1+abs({\text{cos}}({\theta }_{i}^{new})-{\text{sin}}({\theta }_{i}^{new})))$$
$${X}_{i}^{new}={wh}_{j}-{\Delta X}_{i}$$

where \({wh}_{w}\) and \({wh}_{f}\) are the whirlpools with the highest and lowest values of \({\Delta }_{t}\), respectively, and \({\theta }_{i}\) is the \(i\)th object's angle.

Centrifugal force \(({\mathbf{F}\mathbf{E}}_{\mathbf{i}})\)

While centripetal force attracts moving objects toward the whirlpool’s center, centrifugal force pushes them away from that center, as represented in Eq. (19). If this force exceeds a randomly generated number between 0 and 1, the centrifugal operation is performed on the randomly chosen dimension according to Eq. (20).

$${FE}_{i}={({({\text{cos}}({\theta }_{i}^{new}))}^{2}*{({\text{sin}}({\theta }_{i}^{new}))}^{2})}^{2}$$
The whirlpools' interactions

Whirlpools interact with and move around one another, in a manner similar to a whirlpool's effect on the objects in its surroundings, as shown in Eqs. 21, 22, and 23.

$${\Delta }_{t}={f(wh}_{t})*abs({wh}_{t}-sum({wh}_{j}))$$
$${\Delta wh}_{j}=rand\left(1,D\right)*abs({\text{cos}}{(\theta }_{j}^{new})+{\text{sin}}{(\theta }_{j}^{new}))*({wh}_{f}-{wh}_{j})$$
$${wh}_{j}^{new}= {wh}_{f}-{\Delta wh}_{j}$$

where \({\theta }_{j}\) is the angle of the \(j\)th whirlpool opening.

Finally, a member of a whirlpool's set of new particles is selected as the new whirlpool for the following iteration if it is stronger than its associated whirlpool (i.e., its objective function value is lower). All the previous steps are summarized in Fig. 5.
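The object update of Eqs. (15)–(18) can be sketched as follows. This is a sketch under the paper's notation; the per-dimension sums in the weighted distance and the toy objective are my own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Toy objective; lower values mean a stronger whirlpool."""
    return float(np.sum(x ** 2))

def update_object(X_i, theta_i, whirlpools, j, f):
    """One TFWO object update around its whirlpool wh_j (Eqs. 15-18)."""
    D = X_i.size
    theta_new = theta_i + rng.random() * rng.random() * np.pi          # Eq. 15
    # Weighted distance delta_t of every whirlpool from the object (Eq. 16)
    delta = np.array([f(wh) * np.abs(np.sum(wh) - np.sum(X_i)) ** 0.5
                      for wh in whirlpools])
    wh_f = whirlpools[np.argmin(delta)]  # lowest delta_t
    wh_w = whirlpools[np.argmax(delta)]  # highest delta_t
    dX = ((np.cos(theta_new) * rng.random(D) * (wh_f - X_i)
           - np.sin(theta_new) * rng.random(D) * (wh_w - X_i))
          * (1 + abs(np.cos(theta_new) - np.sin(theta_new))))          # Eq. 17
    return whirlpools[j] - dX, theta_new                               # Eq. 18

whirlpools = rng.normal(size=(3, 2))  # three whirlpool centers in 2-D
X_new, theta = update_object(rng.normal(size=2), 0.3, whirlpools, j=0, f=sphere)
print(X_new.shape)  # (2,)
```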

Fig. 5 Flowchart of TFWO

3.3 Adaptation of ANFIS utilizing TFWO

In this study, both the premise (antecedent) and consequential parameters of the ANFIS model are adjusted using the TFWO algorithm. Conventional ANFIS training employs the hybrid optimization technique GD_LSE, which combines GD and LSE: LSE modifies the consequential parameter values in the forward pass, and GD modifies the membership-function parameters in the backward pass, similar to back propagation (as shown in Table 2). Table 2 is updated in accordance with the proposed model, as shown in Table 3.

Table 2 Training general ANFIS
Table 3 Training ANFIS with TFWO

Traditional mathematical programming methods often fail to provide optimal solutions for real-world optimization problems due to the large number of parameters involved [27, 82]. GD and LSE are deterministic methods that are slow and occasionally fail to converge, and a major criticism of GD is that it tends to get stuck in local minima, which TFWO avoids. In comparison with GD, TFWO learns the ANFIS parameters more quickly and flexibly since it is computationally less expensive. The total number of adjustable ANFIS parameters is a crucial element in the development of an ANFIS network because of the processing effort required for the adaptation process; therefore, the membership functions should be chosen with care. The Gaussian function is preferable to other membership functions because it requires only the two parameters center and width, as illustrated in Eq. 4.

The complete TFWO cycles with ANFIS are depicted in Fig. 6 and Algorithm 1. They outline the steps of the proposed TFWO_ANFIS as follows:

  1. Data are divided at the beginning of the model into training (70% of each dataset) and testing sets. Both sets are chosen at random to avoid local optima and over-fitting issues; this maintains the proper level of population diversity and increases the global search capability.

  2. Create the initial ANFIS model utilizing fuzzy C-means clustering (FCM) to find the degrees of membership. The ANFIS model contains a set of premise and consequential parameters that describe the membership functions in the two parts of the if–then rules, and it is created using the layer equations (layers 1 to 5) in Sect. 3.1.

  3. Feed the ANFIS parameters (premise and consequential) to the TFWO algorithm together with the training data.

  4. Initialization: create the initial population randomly, assess the fitness (MSE) of each member, and split the population into \({N}_{wh}\) whirlpool sets.

  5. The TFWO algorithm iterates toward its best whirlpool, modifying the ANFIS parameters based on the MSE fitness of each whirlpool.

  6. After MaxDecades iterations, the best ANFIS model is returned and its result is computed on the training data.

  7. Evaluate the best ANFIS on the remaining testing data (30%).

  8. Performance is evaluated using SD, RMSE, MSE, MBE, and accuracy.
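The steps above can be sketched end to end. In this illustrative skeleton the ANFIS forward pass is replaced by a stand-in linear model and the whirlpool mechanics are abstracted as a greedy move toward the current best, so the sketch stays short and runnable; these simplifications are my own and are not from the paper:

```python
import numpy as np

rng = np.random.default_rng(42)

def model_output(params, X):
    """Stand-in for the ANFIS forward pass (hypothetical linear model)."""
    return X @ params[:-1] + params[-1]

def mse_fitness(params, X, t):
    """Steps 4-5 fitness: MSE between model output and targets."""
    return float(np.mean((model_output(params, X) - t) ** 2))

def tfwo_anfis_skeleton(X, t, dim, iters=50, n_wh=3, n_obw=10):
    # Step 1: random 70/30 train/test split
    idx = rng.permutation(len(X))
    cut = int(0.7 * len(X))
    Xtr, ttr = X[idx[:cut]], t[idx[:cut]]
    Xte, tte = X[idx[cut:]], t[idx[cut:]]
    # Steps 3-4: initial population of parameter vectors in n_wh whirlpool sets
    pop = rng.normal(size=(n_wh * (1 + n_obw), dim))
    fit = np.array([mse_fitness(p, Xtr, ttr) for p in pop])
    # Step 5: iterate; candidates move toward the current best and are
    # accepted only when their fitness improves
    for _ in range(iters):
        best = pop[np.argmin(fit)]
        cand = pop + rng.random(pop.shape) * (best - pop)
        cfit = np.array([mse_fitness(p, Xtr, ttr) for p in cand])
        better = cfit < fit
        pop[better], fit[better] = cand[better], cfit[better]
    # Steps 6-8: report the best model's training and testing error
    best = pop[np.argmin(fit)]
    return mse_fitness(best, Xtr, ttr), mse_fitness(best, Xte, tte)

X = rng.normal(size=(100, 3))
t = X @ np.array([1.0, -2.0, 0.5]) + 0.3
train_err, test_err = tfwo_anfis_skeleton(X, t, dim=4)
```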

Fig. 6 The architecture of TFWO_ANFIS model

Algorithm 1 The steps of the proposed TFWO_ANFIS

When comparing the target to the actual output, the fitness function was measured as the mean square error (MSE):

$$ {\text{MSE}} = \frac{\sum_{m = 1}^{K} {\left( {out}_{m} - {\widehat{out}}_{m} \right)}^{2}}{K} $$

where \({out}_{m}\) is the target (desired outcome), \({\widehat{out}}_{m}\) is the predicted outcome, and \(K\) is the volume of data.

As depicted in Fig. 6, the initial stage of the model randomly splits the data into training and testing sets to avoid issues such as local optima and over-fitting; this random selection maintains population diversity and enhances global search capability. The ANFIS system design utilizes fuzzy C-means clustering (FCM) to identify the degrees of membership, and the model includes a set of premise and consequential parameters that describe the membership functions in the two parts of the if–then rules. The ANFIS parameters are used as input for the TFWO algorithm, which creates an initial population randomly, assesses the fitness of the initialized population by MSE, and splits the population into \({N}_{wh}\) sets of whirlpools. These processes are repeated until the maximum number of iterations is reached. Afterwards, the system evaluates the best ANFIS on the testing data, and performance is assessed using SD, RMSE, MSE, and accuracy.

4 Results and discussion

This section details the evaluation of TFWO_ANFIS efficiency. The experiment assesses the effectiveness and efficiency of the TFWO_ANFIS model in addressing uncertainty in the SDP field with higher accuracy and achieving the lowest error on four datasets obtained from OPENML [83]. This experiment marks the first use of TFWO with ANFIS to enhance SDP. The TFWO_ANFIS model is designed to better manage software metrics’ uncertainty and predict defects with higher accuracy. We compare TFWO_ANFIS with conventional ANFIS, ACO_ANFIS, DE_ANFIS, PSO_ANFIS, GWO_ANFIS, and GA_ANFIS. The evaluation of TFWO_ANFIS against recent relevant studies in SDP demonstrates its superior performance over all other techniques.

4.1 Evaluation performance

To evaluate the effectiveness of the recommended TFWO_ANFIS technique and the performance of the results, various metrics are employed. These metrics are as follows:

  1. Mean square error (MSE):

     $$ {\text{MSE}} = \frac{\sum_{m = 1}^{K} {\left( {out}_{m} - {\widehat{out}}_{m} \right)}^{2}}{K} $$

     where \({out}_{m}\) is the target (desired output), \({\widehat{out}}_{m}\) is the predicted output, and \(K\) is the size of the data.

  2. Root-mean-square error (RMSE):

     $$ {\text{RMSE}} = \sqrt{\frac{\sum_{m = 1}^{K} {\left( {out}_{m} - {\widehat{out}}_{m} \right)}^{2}}{K}} $$

  3. Standard deviation (SD):

     $$ {\text{SD}} = \sqrt{\frac{\sum_{m = 1}^{K} {\left( {X}_{m} - \mu \right)}^{2}}{m}} $$

     where \({X}_{m}\) is each value from the population, \(\mu \) is the mean, and \(m\) is the size.

  4. Mean bias error (MBE):

     $$ {\text{MBE}} = \frac{1}{K}\sum_{m = 1}^{K} \left| \frac{{out}_{m} - {\widehat{out}}_{m}}{{\widehat{out}}_{m}} \right| $$

  5. Accuracy (ACC):

     $$ {\text{ACC}} = ({\text{TP}} + {\text{TN}})/({\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}) $$

  6. Specificity (SP):

     $$ {\text{SP}} = {\text{TN}}/({\text{TN}} + {\text{FP}}) $$

  7. Sensitivity (S):

     $$ S = {\text{TP}}/({\text{TP}} + {\text{FN}}) $$

  8. Precision (P):

     $$ P = {\text{TP}}/({\text{TP}} + {\text{FP}}) $$

The model is considered suitable for training when MBE equals zero. A negative MBE suggests an underestimated model, while a positive MBE indicates overestimations during the training phase [58].

Here, TP, TN, FP, and FN are as shown in the confusion matrix, Table 4.

Table 4 Confusion matrix

A common way to display the efficiency of a classification technique is by using a confusion matrix [84]. This matrix includes both the predicted class value and its corresponding actual class. These values are employed to assess the classifier’s performance, as shown in Table 4.
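The classification metrics above translate directly to code. A small sketch follows; the example counts are invented for illustration:

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, specificity, sensitivity, and precision from TP/TN/FP/FN."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sp = tn / (tn + fp)
    s = tp / (tp + fn)
    p = tp / (tp + fp)
    return acc, sp, s, p

acc, sp, s, p = confusion_metrics(tp=40, tn=50, fp=5, fn=5)
print(acc)  # 0.9
```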

4.2 TFWO_ANFIS evaluation

4.2.1 Tools and environment

This subsection includes four software defect datasets obtained from OPENML [83]. These datasets are used to evaluate the effectiveness and efficiency of the proposed technique (TFWO_ANFIS) in addressing uncertainty issues in the field of software defect prediction (SDP). They were selected for their variation in sample sizes, features, and numbers of defects, which reflects the diversity needed for the study's accuracy. The datasets include essential information for SDP and were made publicly available to support the development of reliable, measurable, and improvable software prediction models. They originate from the McCabe and Halstead source-code feature extractors, which define code attributes related to software quality, such as lines of code, cyclomatic complexity, volume, Halstead's line count, and unique operators and operands. Detailed characteristics of these datasets are presented in the following table (Table 5).

Table 5 Datasets descriptions

In this experiment, the TFWO_ANFIS model is tested against various meta-heuristic methods, including ACO, PSO, GWO, standard ANFIS, DE [85], and GA. The dataset is split into 70% for training and 30% for testing. Parameters for each algorithm are found in Table 6. The experiments were conducted on a system running Windows 10 Pro (64-bit) with an Intel(R) Core(TM) i5 CPU and 4 GB of RAM. MATLAB (R2016a) [86] was used for all implementations.

Table 6 Parameters configuration of different models

All optimization techniques use the following parameters: maximum decades (iterations) = 100 and population size = 93, according to the TFWO relation \({N}_{pop}={N}_{wh}+{N}_{wh}\times {N}_{obw}\) with \({N}_{wh}=3\) and \({N}_{obw}=30\); the upper and lower bounds are 10 and − 10, respectively (Fig. 7).
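As a quick check of the population-size relation:

```python
# Population size from the TFWO relation N_pop = N_wh + N_wh * N_obw
n_wh, n_obw = 3, 30
n_pop = n_wh + n_wh * n_obw
print(n_pop)  # 93: 3 whirlpools plus 3 sets of 30 objects each
```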

Fig. 7 Convergence rate of TFWO

4.2.2 Output of experiment

The experiment used common metrics such as accuracy, RMSE, precision, SD, specificity, sensitivity, and MSE to evaluate the TFWO_ANFIS model's performance in optimizing ANFIS parameters. Average results for the ten experiments are presented in Tables 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, and 16 and Figs. 8, 9, 10, and 11. These figures and tables demonstrate that TFWO_ANFIS outperforms the other algorithms across all four datasets. In particular, Tables 8, 10, 12, 14, and 16 show that TFWO_ANFIS achieved the highest accuracy across all datasets used in this experiment, which validates the effectiveness and efficiency of our recommended model for enhancing ANFIS parameter tuning. Additionally, the convergence rate is a typical metric for optimization techniques [87]: convergence describes how quickly a solution progresses through the iterations to an acceptable point. As shown in Fig. 7, the TFWO converges within about 3% of the total number of iterations, so the best-fit individual is reached more quickly.

Table 7 MSE in testing
Table 8 RMSE testing
Table 9 SD in testing
Table 10 Confusion matrix for testing KC2
Table 11 Comparative between TFWO_ANFIS and others for KC2
Table 12 Confusion matrix for testing PC3
Table 13 Comparative between TFWO_ANFIS and others for PC3
Table 14 Confusion matrix for testing KC1
Table 15 Comparative between TFWO_ANFIS and others for KC1
Table 16 Confusion matrix for testing PC4
Fig. 8 TFWO_ANFIS and other models in terms of accuracy

Fig. 9 TFWO_ANFIS and other models in terms of MSE

Fig. 10 TFWO_ANFIS and other models in terms of RMSE

Fig. 11 TFWO_ANFIS and other models in terms of SD

Table 7 and Fig. 9 show the MSE metric, calculated according to Eq. 25. They compare our proposed TFWO_ANFIS with common meta-heuristic optimization techniques from the literature, such as PSO, GA, GWO, DE, ACO, and standard ANFIS. The MSE scores of the proposed TFWO_ANFIS are 0.1091, 0.0770, 0.1026, and 0.0850 for the KC2, PC3, KC1, and PC4 datasets, respectively. Table 8 and Fig. 10 present the RMSE metric, computed according to Eq. 26; our proposed model (TFWO_ANFIS) achieves the lowest RMSE results, 0.3303, 0.2776, 0.3203, and 0.2926 for the KC2, PC3, KC1, and PC4 datasets, respectively. The SD metric, presented in Table 9 and Fig. 11, is calculated as shown in Eq. 27 and is also used to compare the proposed TFWO_ANFIS with the other techniques; the SD of the proposed model is 0.3307, 0.2885, 0.3205, and 0.2929, respectively. From Tables 7, 8, and 9 and Figs. 9, 10, and 11, the MSE, RMSE, and SD of TFWO_ANFIS are the lowest, so the proposed model has better performance.

Table 10 displays the confusion matrix results for the TFWO_ANFIS applied to the KC2 dataset. From this table, evaluation metrics such as P, SP, S, and ACC can be calculated using Eqs. 29, 30, 31, and 32. Accuracy is one of the most important metrics in this study, and the proposed TFWO_ANFIS achieves the highest accuracy of 87.3%, outperforming other techniques.

Tables 12 and 13 present the confusion matrix and the comparison between the proposed TFWO_ANFIS and other meta-heuristic techniques on the PC3 dataset. TFWO_ANFIS achieves the best accuracy score of 90.2%.

Table 15, derived from Table 14, provides a comparison between TFWO_ANFIS and other techniques using the KC1 dataset. The confusion matrix for the tested KC1 dataset is presented in Table 14. When TFWO_ANFIS is applied to the test data, it achieves the highest accuracy among all techniques, scoring 85.8%.

Finally, TFWO_ANFIS is applied to the PC4 dataset. The results are shown in Tables 16 and 17: Table 16 presents the confusion matrix from applying TFWO_ANFIS to the tested PC4 dataset, and Table 17 presents the comparative analysis between the proposed TFWO_ANFIS and other techniques. Table 17 shows that TFWO_ANFIS has better accuracy than the others, with a score of 89.2%.

Table 17 Comparative between TFWO_ANFIS and others for PC4

Table 18 presents the most common metrics for evaluating the model, such as MSE, MBE, RMSE, and SD. It lists the different datasets utilized in the proposed research (KC2, PC3, KC1, and PC4) with their numbers of instances and features, respectively, and confirms the outperformance of TFWO_ANFIS.

Table 18 Various metrics for estimating TFWO_ANFIS efficiency

4.2.3 Result discussion

The research results offer several advantages in the field of software defect prediction (SDP) and related areas. Firstly, when compared to optimization algorithms such as PSO, GWO, DE, ACO, standard ANFIS, and GA, the TFWO_ANFIS model demonstrates superior accuracy in predicting software defects. This enhanced accuracy is valuable for software development teams and organizations as it enables them to identify and address potential issues early, thereby improving software quality and reliability. Secondly, thanks to the underlying TFWO algorithm, the TFWO_ANFIS model provides stability and convergence power, ensuring consistent performance across various datasets and instances. This stability makes it a reliable choice for real-world applications. Furthermore, the proposed TFWO_ANFIS model effectively handles uncertainty in software features, a common issue in real-world software engineering. The TFWO_ANFIS model solves this problem by offering a more accurate defect prediction, enabling quality assurance teams to allocate their resources and efforts. Also, the research findings’ practical usefulness is improved by the use of publicly available datasets from platforms such as OPENML. The model’s performance and accuracy may be verified and extended to other software development scenarios and contexts by using real-world datasets.

Four datasets, described in Table 5, namely KC2, PC3, KC1, and PC4, with different instances and features, are used in our experiment to examine and assess the effectiveness and efficiency of the proposed TFWO_ANFIS in handling uncertainty in software features. On every tested dataset, TFWO_ANFIS produced good results.

Case KC2 TFWO_ANFIS results in 87.3%, 0.1091, 0.1281, 0.3303, and 0.3307 in terms of accuracy, MSE, MBE, RMSE, and SD, respectively.

Case PC3 TFWO_ANFIS achieves 90.2%, 0.0770, 0.0860, 0.2776, and 0.2885 in terms of accuracy, MSE, MBE, RMSE, and SD, respectively.

Case KC1 TFWO_ANFIS fulfills 85.8%, 0.1026, 0.0931, 0.3203, and 0.3205 in terms of accuracy, MSE, MBE, RMSE, and SD, respectively.

Case PC4 TFWO_ANFIS obtains 89.2%, 0.0850, 0.2310, 0.2926, and 0.2929 in terms of accuracy, MSE, MBE, RMSE, and SD, respectively.

These cases show that TFWO_ANFIS outperformed the traditional ANFIS model and other meta-heuristic optimization techniques, such as GA, PSO, GWO, ACO, and DE, in terms of training and testing accuracy, MSE, SD, and RMSE, while also having the lowest error rate.

This study has significant theoretical and practical implications. Theoretical implications arise from addressing the limitations of conventional methods such as LSE and GD when optimizing ANFIS parameters in uncertain scenarios. The research enhances optimization strategies for handling uncertainty and improving software defect prediction (SDP) accuracy through the introduction of the TFWO algorithm.

In practical terms, the TFWO_ANFIS model offers valuable applications. Its improved convergence power and stable architecture allow for efficient adjustment of ANFIS parameters, resulting in enhanced SDP accuracy. The model proves its utility in practice by outperforming alternative optimization algorithms across various evaluation measures. The study also emphasizes the importance of effective algorithm selection and parameter optimization in SDP. However, it is crucial to be aware of practical considerations, such as the additional time required for configuration and the complexity of implementing the suggested algorithm. These insights provide valuable guidance for those considering the use of the TFWO_ANFIS model in software defect prediction and related fields. This research contributes to the fields of software engineering and optimization, highlighting both theoretical advancements and their practical applications. It holds value for both researchers and professionals in the industry.

To summarize, the characteristics of the ANFIS, which is used to anticipate software defects, are the primary parameters the proposed research attempted to enhance in the suggested study. The ANFIS system combined the interpretability of fuzzy logic with the learning powers of NNs. To achieve accurate predictions in conventional ANFIS learning systems, characteristics such as membership function shapes, the number of fuzzy rules, and consequent parameters are essential. Nevertheless, there is a big issue in optimizing these parameters when there is uncertainty and when SDP is involved. The proposed research aimed to use TFWO to enhance the parameters of ANFIS in SDP, increasing accuracy and handling uncertainty.

The no free lunch theorem asserts that no single optimization algorithm can effectively address every optimization problem. Therefore, it is important to recognize that the TFWO algorithm may not be suitable for all optimization issues. Additionally, the proposed model may require additional iterations during the training process. While TFWO with ANFIS demonstrates efficiency, it is worth noting that implementing the proposed algorithm can be quite complex, and configuring it may take more time.

5 Conclusions and future work

This study introduces a model called TFWO_ANFIS to address uncertainty in SDP with improved accuracy. Unlike traditional methods such as GD and LSE, TFWO_ANFIS leverages the turbulent flow of water optimization (TFWO) to optimize parameters in the adaptive neuro-fuzzy inference system (ANFIS), including membership function shapes, fuzzy rule numbers, and consequent parameters.

The proposed TFWO_ANFIS outperformed other optimization algorithms and recent SDP literature, including particle swarm optimization (PSO), gray wolf optimization (GWO), differential evolution (DE), ant colony optimization (ACO), standard ANFIS, and the genetic algorithm (GA), in terms of standard deviation (SD), mean square error (MSE), mean bias error (MBE), root-mean-square error (RMSE), and accuracy. Four datasets with different instances and features from OPENML, an open platform for publishing datasets, are utilized. The proposed TFWO_ANFIS achieved accuracies of 87.3%, 90.2%, 85.8%, and 89.2%, respectively, for the datasets KC2, PC3, KC1, and PC4. Moreover, many evaluation metrics are utilized, such as precision, sensitivity, confusion matrices, and specificity.

The results indicate that TFWO_ANFIS outperformed the previous algorithms across all four datasets in terms of accuracy and the other evaluation metrics. Finally, this experiment validates the effectiveness and efficiency of the recommended model, which can be used to enhance ANFIS parameter tuning to handle uncertainty in SDP with higher accuracy.

Future research is expected to enhance the described TFWO_ANFIS model by incorporating additional real-world fields and datasets. Addressing software feature uncertainty in SDP with alternative methods is also considered a critical challenge.