Automatic program bug fixing by focusing on finding the shortest sequence of changes

Yousofvand, Leila; Soleimani, Seyfollah; Rafe, Vahid; Esfandyari, Sajad

doi:10.1007/s10462-023-10686-y

Open access
Published: 07 February 2024

Volume 57, article number 39, (2024)

Artificial Intelligence Review

Leila Yousofvand^1,2,
Seyfollah Soleimani²,
Vahid Rafe^2,3 &
…
Sajad Esfandyari⁴

Abstract

Automatic bug repair, the last step in program repair, has attracted a lot of research attention, and various ideas and techniques have been presented in this field. Recent bug fixing techniques use machine learning and graphs to generate fixes. Despite the promising results of recent approaches, maintaining high speed and accuracy while covering a wide range of errors may still be a problem. In this paper, a new approach to automatic bug fixing based on graphs and model checking is presented. We use a graph transformation system and model checking to create sequences of edits and produce fixes. Then, using meta-heuristic algorithms, we select the best fix from the generated solutions. We use the graphs extracted from buggy JavaScript code and from the corresponding bug-free code. To evaluate the effectiveness of the proposed method, we implement it in GROOVE, a toolset for designing and checking graph transformation systems. Experimental results on the same dataset demonstrate that the proposed method outperforms other related methods in generating fixes. The method also covers a wider range of bugs than previous methods.

1 Introduction

Automatic program repair includes detecting, locating and fixing bugs in the source code. A software bug is a flaw or error in the source code that produces unexpected results. The time at which the first software bug is reported, as well as the time needed to correct it, has a major impact on the reliability of the software (Yousofvand et al. 2023). In producing a fix for a discovered bug, the goal is to find a fix that involves the minimum number of changes. In recent years, many software debugging methods have been developed, each based on a specific technique: genetic programming (Forrest et al. 2009; Weimer et al. 2010; Goues et al. 2012; Oliveira et al. 2016), formal program logic (Fischer et al. 2009; Kalvala and Warburton 2011; Failed 2021), search-based techniques (Failed 2018; Trujillo et al. 2021) and machine learning (Tufano et al. 2018; Chen et al. 2019; Liva et al. 2019; Dinella et al. 2020; Yuan et al. 2022). However, most of the existing methods cover only specific types of bugs, such as variable misuse or variable naming bugs, or they operate on specific codebases such as Facebook's, whereas an ideal tool should cover a wide range of bugs. Recently, techniques have been presented that address this challenge and achieve promising results (Dinella et al. 2020; Berabi et al. 2021). These methods are based on machine learning and graphs. Graphs are very useful data structures that can be used to model real-world phenomena. Programs written in high-level languages have useful structure, and researchers have presented graph-based representations to extract it (Dinella et al. 2020; Allamanis et al. 2018). They extract a graph for each program and represent it as a Program Dependency Diagram (PDG) to capture syntactic and semantic information.
Since many program bugs are related to syntactic and grammatical errors, graphs can be used to detect and correct these types of bugs and to model the relationships between the components of a program well. In this article, to improve the accuracy of existing bug fixing approaches, a new approach is proposed that uses graphs and a graph transformation system (GTS). The goal is to find a fix that requires the fewest corrections and changes. As far as we know, the existing approaches have not addressed this issue. A number of rules must be applied to fix a buggy graph. Model checking systems are relevant for bug fixing because they take an initial state and apply a number of rules to create the next states. In this approach, meta-heuristic algorithms are applied to graphs to find the shortest and best editing sequence for the detected bug. We first map the input code into a graph, start from the node where the bug occurred, and apply the rules to that node and its children; these are the rules defined in the GTS. By applying the rules, the state space of the problem is created and different solutions are produced. Then, using meta-heuristic algorithms, we find the best path in this state space to fix the detected bug. The main contributions of our approach are the use of reachability testing and model checking to find the best repair solution and the shortest sequence of changes. Model checking is a well-known technique in which, given an initial configuration, all reachable states of a system are generated (Baier and Katoen 2008; Pira et al. 2018). This set of generated states is called the state space. Because model checking requires the system to be described in a formal language, a formal language is needed to implement the proposed method. There are many formal languages, of which the Graph Transformation System (GTS) is an example. GTS (Rozenberg 1997) is used for modeling the dynamic behavior of different systems (Pira et al. 2018). We use GTS as a test bed to implement the proposed approach. Among the various tools for GTS modeling and verification, we use GROOVE (Kastenberg and Rensink 2006). To improve the speed of the approach, we used PSO, BOA and GA for the reachability test. The results showed that BOA was the fastest of the three.

We have also used code available on GitHub, which is not tied to a specific codebase and covers a wider range of bugs. Our approach works well for more complex bugs that require more than one fix.

The rest of this article is organized as follows. Section 2 presents the necessary background, including information about bugs in JavaScript code, abstract syntax trees, model checking, graph transformation systems and a number of meta-heuristic algorithms. Section 3 provides a complete explanation of the proposed method. Section 4 tests the proposed method and presents the results. Finally, Sect. 5 presents conclusions and suggestions.

2 Preliminaries

2.1 Bug in JavaScript program

JavaScript is a dynamic, object-based programming language that is used on both the client side and the server side of web applications. This has led to the creation of complete web applications in this language. Despite its great popularity, inherent characteristics of JavaScript such as weak typing and runtime evaluation make it prone to programming errors. One type of bug in this language is related to capitalization. JavaScript is case sensitive, which means that a variable must be written with exactly the same capitalization every time it is used, and that function names must be correctly capitalized for the program to work correctly. Figure 1a shows an example of incorrect capitalization. Another type of bug in JavaScript code is the undefined property, which occurs when a script tries to access an object property that does not exist. Figure 1b shows an example of this type: the username property has not been defined for user, whereas the Username property exists.

2.2 Abstract syntax tree

An abstract syntax tree (AST) is a tree that shows the syntactic structure of the source code. It is similar to a parse tree, except that it preserves the important syntactic structure of the program while removing non-terminals that are not necessary to understand that structure (Meyers 2001). To extract an abstract syntax tree from the parse tree, first the parentheses, which act as separators and priority indicators, are removed. Then each node that has a single child is replaced by that child. Finally, the remaining intermediate nodes are replaced with the operators that are their children.

2.3 Model checking

Model checking is one of the most successful approaches developed for checking requirements (Esfandyari and Rafe 2021). A model checker takes the requirements or system design (the model) and a property (the specification). It then automatically checks whether the property holds through an exhaustive exploration of all possible states of the model and produces a counterexample or a witness as output. A counterexample explains in detail why the model does not satisfy the specification. More precisely, a witness (counterexample) is a finite path that starts in the initial state and ends in a target state in which the given property is satisfied (violated). To generate the state space, the model checker uses graph search algorithms such as depth-first search and breadth-first search.

To perform model checking, the generated state space must be converted to a Kripke structure. A Kripke structure is a tuple M = <AP, S, s₀, R, I>, where AP is a set of atomic propositions, S is a set of states, s₀ ∈ S is the initial state of the system, R ⊆ S × S is a transition relation from one state to another, and I: S → 2^AP is a labeling function that maps each state s to the subset of atomic propositions that hold in s. In model checking, temporal logic is used to describe the properties to be checked. Linear Temporal Logic (LTL) and Computation Tree Logic (CTL) are examples of temporal logics. In LTL, each state has only one next state, while in CTL each state may have several successor states (Esfandyari and Rafe 2020). In CTL, a formula is composed of atomic propositions, temporal operators and path quantifiers. Suppose g is an atomic proposition. The path quantifiers are A and E. A g: g holds on all paths starting from the current state. E g: g holds on at least one path starting from the current state. The operators used in this logic are X, <>, [], U and !. For example, [] g means that g must hold in all states of the path.
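The tuple definition of a Kripke structure translates directly into a small data type. The following sketch is illustrative only: the class name `Kripke`, its field names and the three-state example are assumptions, not part of the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class Kripke:
    ap: set            # AP: atomic propositions
    states: set        # S: states
    initial: str       # s0: initial state
    transitions: set   # R, a subset of S x S, stored as (source, target) pairs
    labels: dict       # I: maps each state to a subset of AP

    def is_well_formed(self):
        """Check s0 in S, R within S x S, and I(s) within AP for every state."""
        return (self.initial in self.states
                and all(s in self.states and t in self.states
                        for s, t in self.transitions)
                and all(self.labels.get(s, set()) <= self.ap
                        for s in self.states))

    def successors(self, state):
        return {t for (s, t) in self.transitions if s == state}

# Hypothetical example: two buggy states followed by a fixed one.
M = Kripke(ap={"buggy", "fixed"},
           states={"s0", "s1", "s2"},
           initial="s0",
           transitions={("s0", "s1"), ("s1", "s2"), ("s2", "s2")},
           labels={"s0": {"buggy"}, "s1": {"buggy"}, "s2": {"fixed"}})
```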

In CTL, we have two types of formulas: state formula and path formula. State formulas are combinations of atomic statements that can be evaluated for a state without examining the behavior of the system. But path formulas are defined along the path. Each state in the model state space is a graph. The property g is a graph by which a desirable and specific configuration of the given system is described. These formulas can express three important properties in the software systems checking as follows:

1. Reachability: This property asserts that there is a state in the state space where property g is satisfied (Pira et al. 2018). In other words, there is a path in the computation tree that starts from the initial state and ends in this state. In CTL, this property is expressed as E <> g (Fig. 2a).
2. Safety: This property asserts that something bad never happens in the system or, equivalently, that something good always holds. The safety property g asserts that g holds in every state on every path starting from the initial state, that is, no reachable state violates g. In CTL, this property is expressed as A [] g (Fig. 2b).
3. Liveness: This property asserts that something good should eventually happen in the system. The liveness property g asserts that on every path starting from the initial state there is at least one state in which g holds. In CTL, this property is expressed as A <> g (Fig. 2c).
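The three properties can be checked on a small explicit state space as follows. This is a minimal sketch that assumes a finite state space in which every state has at least one successor; the function names and the example graph are hypothetical.

```python
def reachable(succ, s0):
    """All states reachable from s0 (depth-first search)."""
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        for t in succ[s]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def check_EF(succ, labels, s0, g):
    """Reachability, E <> g: some reachable state satisfies g."""
    return any(g in labels[s] for s in reachable(succ, s0))

def check_AG(succ, labels, s0, g):
    """Safety, A [] g: every reachable state satisfies g."""
    return all(g in labels[s] for s in reachable(succ, s0))

def check_AF(succ, labels, s0, g):
    """Liveness, A <> g: every path from s0 eventually hits a g-state.
    Computed as a least fixed point: start with the g-states and keep
    adding states whose successors are all already in the set."""
    sat = {s for s in succ if g in labels[s]}
    changed = True
    while changed:
        changed = False
        for s in succ:
            if s not in sat and succ[s] and all(t in sat for t in succ[s]):
                sat.add(s)
                changed = True
    return s0 in sat

# Hypothetical state space: from s0 the system may loop forever in s1
# (never "done") or move to s2 (where "done" holds).
succ = {"s0": {"s1", "s2"}, "s1": {"s1"}, "s2": {"s2"}}
labels = {"s0": {"ok"}, "s1": {"ok"}, "s2": {"ok", "done"}}
```

In this example "done" is reachable but not guaranteed, because the path that stays in s1 never reaches it, while "ok" holds in every reachable state.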

2.4 Graph transformation system

In model checking, the system to be checked must be described with a formal modeling language. The formal languages used to describe models include textual languages and graphical languages. A graph transformation system is a graphical language that describes the states and behaviors of a system using graphs and graph transformations (Pira 2021).

A graph transformation system (GTS) (Rozenberg 1997) is defined by a 3-tuple GTS = (TG, HG, R), where TG is the type graph, HG is the host graph and R is a set of rules (Baier and Katoen 2008). A type graph is a general representation of a system and is also known as a metamodel. The host graph represents the initial state of the system. The key concept of graph transformation is transformation based on graph rules. Rules determine the changes that can be made to the smallest parts of a graph (nodes or edges), namely the addition or deletion of such elements. A transformation rule consists of two graphs: the left-hand side (LHS) and the right-hand side (RHS), which partially overlaps the LHS. Executing a rule produces a new graph, which is the result of the transformation. The most important requirement for a rule to be executed is that a match of the LHS exists in the host graph. All nodes and edges that appear on both sides remain after the rule is executed. Nodes and edges that appear in the LHS but not in the RHS are removed by executing the rule, and nodes and edges that appear in the RHS but not in the LHS are added. A rule may also have negative application conditions (NACs) that restrict its execution. A NAC is a graph that partially overlaps the LHS of the rule. When a match of a rule exists in the host graph, the rule becomes a candidate for execution, but if even one of its NACs is not satisfied, the rule cannot be executed. A NAC is satisfied when it has no match in the current host graph. The complete behavior of a model is determined by its state space, which is generated by repeatedly applying all rules to the initial state of the model.
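The interplay of LHS, RHS and NAC can be sketched on a toy graph encoding. The encoding below is an assumption for illustration and is far simpler than what a tool such as GROOVE does: nodes are a dict from id to type, edges are (source, label, target) triples, state flags are stored as self-loops, and a rule may only add or delete edges among the nodes of its LHS. The rule `get_left_sketch` is a hypothetical simplification, not the rule used in the paper.

```python
from itertools import permutations

def matches(pattern_nodes, pattern_edges, host_nodes, host_edges):
    """Yield injective maps pattern node -> host node preserving types and edges."""
    names = list(pattern_nodes)
    for combo in permutations(host_nodes, len(names)):
        m = dict(zip(names, combo))
        if (all(pattern_nodes[n] == host_nodes[m[n]] for n in names)
                and all((m[a], lbl, m[b]) in host_edges
                        for a, lbl, b in pattern_edges)):
            yield m

def apply_rule(rule, host_nodes, host_edges):
    """Apply the rule at the first match that no NAC edge blocks.
    Returns the new edge set, or None if the rule is not applicable."""
    for m in matches(rule["lhs_nodes"], rule["lhs_edges"], host_nodes, host_edges):
        if any((m[a], lbl, m[b]) in host_edges for a, lbl, b in rule["nac_edges"]):
            continue  # a forbidden edge is present, so this match is rejected
        deleted = {(m[a], l, m[b]) for a, l, b in rule["lhs_edges"] - rule["rhs_edges"]}
        added = {(m[a], l, m[b]) for a, l, b in rule["rhs_edges"] - rule["lhs_edges"]}
        return (host_edges - deleted) | added
    return None

# Hypothetical rule: a hungry philosopher picks up a left fork that is not held.
get_left_sketch = {
    "lhs_nodes": {"p": "Phil", "f": "Fork"},
    "lhs_edges": {("p", "left", "f"), ("p", "hungry", "p")},
    "rhs_edges": {("p", "left", "f"), ("p", "hasLeft", "p"),
                  ("p", "hold", "f"), ("f", "held", "f")},
    "nac_edges": {("f", "held", "f")},
}
host_nodes = {"p1": "Phil", "p2": "Phil", "f1": "Fork", "f2": "Fork"}
host_edges = {("p1", "left", "f1"), ("p1", "right", "f2"),
              ("p2", "left", "f2"), ("p2", "right", "f1"),
              ("p1", "hungry", "p1"), ("p2", "thinking", "p2")}
after = apply_rule(get_left_sketch, host_nodes, host_edges)
```

Edges present only in the LHS are deleted, edges present only in the RHS are created, and edges on both sides are preserved, which mirrors the description above.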

There are various GTS-based tools for modeling and analyzing systems, such as AToM3 (Lara and Vangheluwe 2002), VIATRA2 (Varró and Balogh 2007), GROOVE (Kastenberg and Rensink 2006) and AGG (Taentzer 2003). We use GROOVE for our implementation. GROOVE is a collection of tools for modeling the design-time, compile-time and run-time structure of object-oriented systems and their transformations by means of graph transformation, with a focus on using model checking to determine the correctness of systems. The main applications of this tool are providing a model transformation formalism and verifying properties to confirm the correctness of software systems. To create the system state space in this tool, a graph is used as the initial state together with a set of graph transformation rules. For example, we have modeled the dining philosophers problem with two philosophers. Figure 3 shows the host graph, which has four nodes: two forks and two philosophers. Each fork is to the right of one philosopher and to the left of the other. Whether a fork is on the left or the right is indicated by the edge labels in Fig. 3. The initial values are also written inside each node.

After defining the host graph, rules must be defined. In GROOVE tool for a rule, the RHS, LHS and NAC graphs are merged together and each component is identified by color coding. If edges and nodes belong to both RHS and LHS graphs, they are marked with a thin black border. Nodes and edges with dotted blue borders belong to the LHS graph. The elements that must be removed from the graph after running delete rule on the host graph are added to the node values with a “−” sign and blue color. The green edges and nodes are in the RHS graph and after running insert rule, they must be created in the host graph and are added to the node along with the “+” sign and green color. The elements that belong to the NAC graph are marked with a dotted red border that prevents the execution of the rule. According to Fig. 4, the get_left rule can be applied when the current state (host graph) has a node with the label Hungry and another philosopher does not hold the right fork, and by applying this rule to this state, a new edge with the label Hold is created and the status of the node is changed to HasLeft.

Figure 5 shows the generated state space for the dining philosophers problem with two philosophers. The rules are go_hungry, get_right, get_left, release_left and release_right. In the philosophers example of Fig. 5, only the go_hungry rule is applicable at first; applying it changes the status of a philosopher from thinking to hungry, which in turn makes the get_left rule applicable.

2.5 Meta-heuristic techniques

In recent years, a new kind of approximation algorithm has emerged, which basically aims to combine heuristic methods in higher-level frameworks to explore the search space more effectively (Pira et al. 2018). Meta-heuristic algorithms use exploration strategies to find an optimal solution among a number of solutions (the solution space). These algorithms are problem-independent and iteratively try to improve one or more selected solutions in a reasonable time; they can potentially use information about previously explored solutions. Therefore, they only need to know how to evaluate a selected solution. There are many meta-heuristic algorithms, such as the Genetic Algorithm (GA), the Bayesian Optimization Algorithm (BOA) and Particle Swarm Optimization (PSO).

2.5.1 Genetic algorithm

Today, the genetic algorithm is widely used for solving optimization problems. In nature, better generations emerge from the combination of suitable chromosomes, and occasional mutations in the chromosomes may make the next generation better. The genetic algorithm solves problems using the same idea. Figure 6 shows the steps of solving an optimization problem with a genetic algorithm. First, an initial population (initial generation) of solutions (chromosomes) is randomly generated. At each stage, a number of the fittest solutions are preserved (selected) and the rest are discarded (Esfandyari and Rafe 2018). Then, the next generation is produced from the selected solutions by applying crossover and mutation operations to them. This process is repeated until one of the generated solutions is optimal or the maximum number of iterations is reached.
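The loop described above can be sketched in a few lines. This is a generic illustration under stated assumptions, not the configuration used in the experiments: chromosomes are lists of integers, the fittest half is kept unchanged, children come from single-point crossover plus per-gene mutation, and all parameter names and default values are hypothetical. It requires chromosomes of length at least 2.

```python
import random

def genetic_search(fitness, length, alphabet, pop_size=20, generations=40,
                   keep=0.5, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)  # seeded so that runs are repeatable
    pop = [[rng.randrange(alphabet) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fittest chromosomes as parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[:max(2, int(keep * pop_size))]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)        # single-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(alphabet) if rng.random() < mutation_rate else g
                     for g in child]              # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy fitness: number of positions that agree with a hidden target sequence.
target = [2, 1, 0, 1]
def toy_fitness(chromosome):
    return sum(a == b for a, b in zip(chromosome, target))

best = genetic_search(toy_fitness, length=4, alphabet=3)
```

Because the parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next.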

2.5.2 Bayesian optimization algorithm

The Bayesian Optimization Algorithm (BOA) relies on Bayesian networks to model promising solutions. In BOA, the initial population is randomly generated and then the better solutions are selected from the current population. Afterwards, a Bayesian network is constructed from these selected solutions. Using a set of good solutions together with prior information about the problem can improve convergence and the quality of the estimated distribution. The constructed network produces new solutions by sampling the joint distribution that it encodes (Pira et al. 2017). The new solutions are added to the population and replace some old solutions. This process is repeated until one of the stopping conditions of the algorithm is fulfilled, such as the presence of an optimal solution in the population or an upper bound on the number of iterations.
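The select, model, sample and replace cycle can be sketched as follows. Note the deliberate simplification: instead of learning a Bayesian network, this sketch models each chromosome position independently with smoothed value frequencies (a univariate estimation-of-distribution model), so it ignores the dependencies between positions that BOA is designed to capture. Function and parameter names are hypothetical.

```python
import random
from collections import Counter

def eda_search(fitness, length, alphabet, pop_size=30, iterations=30,
               selection_rate=0.3, replacement_rate=0.3, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(alphabet) for _ in range(length)]
           for _ in range(pop_size)]
    n_sel = max(1, int(selection_rate * pop_size))
    n_rep = max(1, int(replacement_rate * pop_size))
    for _ in range(iterations):
        pop.sort(key=fitness, reverse=True)
        selected = pop[:n_sel]                      # promising solutions
        # Model: per-position value counts with add-one smoothing
        # (stand-in for the Bayesian network of BOA).
        weights = []
        for i in range(length):
            counts = Counter(ch[i] for ch in selected)
            weights.append([counts[v] + 1 for v in range(alphabet)])
        # Sampling: draw new chromosomes from the model.
        new = [[rng.choices(range(alphabet), weights=w)[0] for w in weights]
               for _ in range(n_rep)]
        pop[-n_rep:] = new                          # replace the worst
    return max(pop, key=fitness)

hidden = [2, 1, 0, 1]
def eda_fitness(chromosome):
    return sum(a == b for a, b in zip(chromosome, hidden))

eda_best = eda_search(eda_fitness, length=4, alphabet=3)
```

Only the worst chromosomes are overwritten, so the best solution found so far always stays in the population.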

2.5.3 Particle Swarm Optimization algorithm

Particle Swarm Optimization (PSO) (Kennedy and Eberhart 1995) is a population-based optimization method modeled on the collective behavior of flocks of birds or schools of fish. In this algorithm, the population consists of a number of particles, each of which represents a candidate solution to the problem. The search space is the solution space of the problem, so every position in the search space is a candidate solution. Particles interact to find the best position in the search space, where the quality of a position is defined by a cost function. The algorithm starts with a set of potential solutions (particles) and seeks the optimal point by updating the state of the particles. Each particle has a memory that stores the best position it has achieved so far and the best position of its neighbors in each iteration. In each iteration, each particle adjusts its velocity vector based on its own best position and the best position of its neighbors and moves in the search space.
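A minimal continuous PSO can be sketched as below. It assumes that every particle's neighborhood is the whole swarm (the global-best variant) and uses commonly seen coefficient values; the function name, the parameters and the sphere cost function are illustrative assumptions, not the settings of the experiments.

```python
import random

def pso(cost, dim, n_particles=15, iterations=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # each particle's own best position
    gbest = min(pbest, key=cost)[:]        # best position of the whole swarm
    history = [cost(gbest)]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to own best + pull to swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if cost(pos[i]) < cost(pbest[i]):
                pbest[i] = pos[i][:]
                if cost(pbest[i]) < cost(gbest):
                    gbest = pbest[i][:]
        history.append(cost(gbest))
    return gbest, history

def sphere(x):
    """Toy cost function with its minimum at the origin."""
    return sum(v * v for v in x)

gbest, history = pso(sphere, dim=2)
```

The recorded `history` of swarm-best costs is non-increasing, because the swarm best is only ever replaced by a strictly better position.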

2.6 Using meta-heuristic techniques in model checking

In software systems formally characterized through graph transformations, meta-heuristic/evolutionary techniques have been used to analyze and check reachability properties (Pira et al. 2019). In these methods, to estimate the quality of the current solution (particle in PSO and chromosome in GA and BOA algorithms) in reaching the target state a fitness function is used. At each stage of model checking, the most promising states will be stored.

3 The proposed method

To generate a fix for the detected bug, we transform the problem into a path-finding problem that starts from the initial state and ends at the final state of the system graph (i.e., the state space). Using model checking and graph transformation specifications, we generate candidate fixes, with the aim of extracting the changes needed to fix the bug. The main idea is to use reachability testing to find the best debugging solution.

The block diagram of the proposed method is given in Fig. 7. The steps in the proposed method are as follows:

1. Mapping the program source code to a graph.
2. Finding the buggy nodes.
3. Considering the buggy graph as the host graph.
4. Applying the rules to the buggy nodes and their children to build the state space of the problem.
5. Finding the most similar buggy graph by comparing the input buggy graph with the graphs of the training dataset.
6. Considering the bug-free graph corresponding to the buggy graph found in the previous step as the target state.
7. Searching for the target state in the state space of the problem with reachability testing.
8. Selecting the paths that start from the initial state and end in the target state as candidate paths (each candidate path is a sequence of changes).
9. Using meta-heuristic algorithms to select the best and shortest path from the candidate paths.

In the following, the details of the proposed method are described. First, the source code of the program is mapped to a graph based on its AST. Then, using this graph together with the bug location obtained by the technique of Yousofvand et al. (2023), we specify the buggy code in GTS. In this problem, the buggy graph is considered as the host graph. We start from the node where the bug occurred and apply the rules to it and its children. These are the rules defined in the graph transformation system. There are four categories of rules that can be applied: rules for deleting nodes, inserting nodes, replacing node values and replacing node types. By applying the rules to the host graph, the state space of the problem is constructed. In this paper, we use a dataset that contains buggy code and the corresponding bug-free code. To generate the fix, we search the training dataset for the graph most similar to the input buggy graph by comparing the graphs. After finding the most similar buggy graph, we consider its bug-free counterpart as the target state. We then search for this state in the state space of the problem with the reachability test, and if the target state is found, the paths from the initial state to this state are selected as candidate paths. These candidate paths, which are fix solutions, are sequences of changes that can be applied to the buggy graph. By applying the consecutive changes to the buggy graph, the bug-free graph is obtained. Then, using meta-heuristic algorithms, we select the best path from these candidate paths. In this paper, we use the BOA, GA and PSO meta-heuristic algorithms to verify the reachability property in the buggy code specified in GTS. In the following, the details of mapping the code representation to the graph, and the important parameters, as well as the GA-based approach, are described.
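The idea of reaching a target state through the shortest sequence of changes can be illustrated on a deliberately small stand-in. In the sketch below a state is a flat tuple of tokens rather than a program graph, only delete and replace-value rules are modeled, and an exhaustive breadth-first search replaces the meta-heuristic search; the names `shortest_fix`, `vocabulary` and `max_depth` are hypothetical.

```python
from collections import deque

def shortest_fix(buggy, target, vocabulary, max_depth=4):
    """Return a shortest list of edits turning `buggy` into `target`,
    or None if the target is not reachable within `max_depth` edits."""
    start, goal = tuple(buggy), tuple(target)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, edits = queue.popleft()
        if state == goal:
            return edits
        if len(edits) == max_depth:
            continue  # depth bound keeps the state space finite and small
        for i in range(len(state)):
            # Rule "delete": remove the token at position i.
            candidates = [(("delete", i), state[:i] + state[i + 1:])]
            # Rule "replace": change the value at position i.
            candidates += [(("replace", i, v), state[:i] + (v,) + state[i + 1:])
                           for v in vocabulary if v != state[i]]
            for edit, nxt in candidates:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, edits + [edit]))
    return None

# Toy bug needing two changes: wrong capitalization and a duplicated token.
buggy_tokens = ["user", ".", "username", ";", ";"]
fixed_tokens = ["user", ".", "Username", ";"]
edits = shortest_fix(buggy_tokens, fixed_tokens, vocabulary=["Username"])
```

Breadth-first search returns a shortest edit sequence by construction, but it enumerates the whole state space level by level, which is exactly what becomes infeasible on large graphs and motivates the meta-heuristic search.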

3.1 Mapping program source code to a graph

In this paper, a graph-based approach that generates a symbolic representation of the code is used. The representation combines the AST with additional edges that represent data flow and control flow (Dinella et al. 2020). As we can see in Fig. 8, syntactic nodes from the program grammar are labeled with non-terminal names, while syntactic tokens are labeled with the string they represent. ASTedge edges connect nodes according to the abstract syntax tree. Since this does not impose an order on the children of a syntactic node, SuccToken edges are added that connect each syntactic token to its successor. Value nodes that store the actual content of the leaf nodes are also added, and each leaf node is connected to its value node with a ValueLink edge.

3.2 Structure of chromosomes

In the graph search problem (GSP), the output is a path that starts from the initial state and leads to the target state. Each chromosome represents a sequence of transitions along a path that starts from the initial state in the state space. Therefore, different paths starting from the initial state give different chromosomes. In this paper, we use value encoding, in which every chromosome is a string of numbers. Each number is a value between 0 and the maximum number of outgoing transitions of a state in the state space. Figure 9 shows an example with the chromosome “2101” and the encoded path “r2r1r0r1” in the state space. Here, r2, r1 and r0 are the applied rules. This path is shown with colored states and edges.
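Decoding a value-encoded chromosome into a path can be sketched as follows. The dictionary-based state space and the handling of out-of-range genes (wrapping around with a modulo) are assumptions made for this sketch; the paper does not specify how such genes are treated.

```python
def decode(chromosome, state_space, initial):
    """Follow one outgoing transition per gene, starting from `initial`.
    `state_space` maps a state to an ordered list of (rule, next_state)."""
    state, path = initial, []
    for gene in chromosome:
        outgoing = state_space.get(state, [])
        if not outgoing:
            break  # dead end: no rule is applicable in this state
        # Assumption: a gene larger than the number of outgoing
        # transitions wraps around instead of being rejected.
        rule, state = outgoing[int(gene) % len(outgoing)]
        path.append(rule)
    return path, state

# Hypothetical state space fragment in which "2101" encodes r2 r1 r0 r1.
state_space = {
    "s0": [("r0", "s1"), ("r1", "s2"), ("r2", "s3")],
    "s3": [("r0", "s4"), ("r1", "s5")],
    "s5": [("r0", "s6")],
    "s6": [("r0", "s7"), ("r1", "s8")],
}
path, last_state = decode("2101", state_space, "s0")
```

The last state of the decoded path is the one that the fitness function compares with the target.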

As mentioned, each chromosome encodes a path in the state space. If the last state of a path is similar to the given reachability property, that path can be considered as a promising path. So, the similarity between the last state of a path encoded by a chromosome and a given property can be used as a measure of fitness. We have modeled the system with GTS. Therefore, each state in the model state space and each feature are modeled using graphs whose edges and nodes have labels. To check which state is promising, the similarity of that state with a state property should be considered. In verifying reachability property EF t assuming that t is a state property, the similarity between t and all states is measured. If the similarity of a state is more than other states of this level, the state is promising. The similarity is measured by considering the value of each node and the output nodes and edges of each node.

Algorithm 1 shows the fitness function. First, all pairs of nodes (tn ∈ Gt, pn ∈ Gp) with identical nodes (node value and node type) are found. Then, identical tags are checked. After finding these nodes, the fitness of the given chromosome is obtained from the total number of identical labels. The output of the similarity function is denoted by ToT_Eq. Any state with a larger ToT_Eq is more promising. After calculating the similarity of states, some states with maximum similarity are selected as promising states. Now, by selecting promising states, we save all paths that lead to a promising state.

Algorithm 1 Fitness function
Input: t, a reachability property to be checked; p, a chromosome
Output: the fitness value of p
1. Graph Gt = the graph of t;
2. Graph Gp = the graph of the last state of the path encoded by p;
3. Allpairs = all node pairs (tn ∈ Gt, pn ∈ Gp) with identical node values and node labels;
4. Find all pairs (tn, pn) in Allpairs with identical labels (eq-nodes);
5. ToT_Eq = the total number of identical labels for eq-nodes;
6. return ToT_Eq
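One possible reading of Algorithm 1 is sketched below. The graph encoding (a dict with `nodes` mapping an id to a (type, value) pair and `edges` as (source, label, target) triples) and the decision to compare outgoing edges by their label together with the target node's type and value are interpretations made for this sketch, not details confirmed by the paper.

```python
from collections import Counter

def out_signature(graph, node):
    """Multiset of (edge label, target node attributes) leaving `node`."""
    return Counter((label, graph["nodes"][dst])
                   for src, label, dst in graph["edges"] if src == node)

def similarity(target, state):
    """ToT_Eq: for every pair of nodes with identical type and value,
    count the outgoing edges that the two nodes have in common."""
    total = 0
    for tn, t_attr in target["nodes"].items():
        for pn, p_attr in state["nodes"].items():
            if t_attr == p_attr:
                common = out_signature(target, tn) & out_signature(state, pn)
                total += sum(common.values())
    return total

# Hypothetical graphs for `user.Username` (bug-free) and `user.username` (buggy).
edges = [(1, "ASTedge", 2), (1, "ASTedge", 3), (2, "SuccToken", 3)]
bug_free = {"nodes": {1: ("MemberExpr", ""), 2: ("Identifier", "user"),
                      3: ("Identifier", "Username")}, "edges": edges}
buggy_graph = {"nodes": {1: ("MemberExpr", ""), 2: ("Identifier", "user"),
                         3: ("Identifier", "username")}, "edges": edges}
```

Under this reading a state that is identical to the target scores highest, and every mismatching node lowers the score of the nodes that point to it.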

After calculating the fitness values for all chromosomes, a subset of them is selected to learn a Bayesian network. There are different selection methods, including Tournament, Roulette-wheel and Truncation. In this article, we use the Truncation selection method. In this method, the chromosomes are sorted based on the fitness values and then according to the threshold value T (selection rate), some of the best chromosomes are selected. T is a measure that determines the percentage of selection of the current population as parents for the next generation chromosomes.

After selecting the best chromosomes, a Bayesian network is learned using them and sampling (Pira et al. 2019) is done from the network to generate new chromosomes. Then the worst chromosomes should be replaced in the current population. The number of replaced chromosomes in each iteration of the Bayesian optimization algorithm is determined by the replacement rate. The Bayesian optimization algorithm is repeated until one of the end conditions, such as finding a target state or an upper limit for the number of iterations, is fulfilled.
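Truncation selection and the replacement step are simple enough to state directly. In this sketch the function names are hypothetical, rates are fractions between 0 and 1, and the number of new chromosomes is assumed not to exceed the population size.

```python
def truncation_select(population, fitness, selection_rate):
    """Sort by fitness and keep the best `selection_rate` share (at least one)."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[:max(1, int(len(ranked) * selection_rate))]

def replace_worst(population, newcomers, fitness):
    """Keep the fittest chromosomes and let `newcomers` take the place of
    the worst ones, so that the population size stays constant."""
    ranked = sorted(population, key=fitness, reverse=True)
    keep = max(0, len(ranked) - len(newcomers))
    return ranked[:keep] + list(newcomers)

# Toy population where the fitness of a chromosome is the sum of its genes.
population = [[1, 1, 1], [0, 0, 0], [1, 0, 0], [1, 1, 0]]
parents = truncation_select(population, sum, selection_rate=0.5)
next_population = replace_worst(population, [[0, 1, 1]], sum)
```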

4 Experimental results

4.1 Metrics

In this article, we used three criteria: accuracy, average execution time and the length of the shortest path. We also examined the effect of parameters such as the replacement rate and the selection rate. We used accuracy to compare the proposed method with previous bug fixing methods for JavaScript code. This metric is the ratio of the number of bugs that have been completely fixed to the total number of buggy files. To find out which meta-heuristic algorithm gives the answer faster, we also used the average execution time. Moreover, because each path specifies the sequence of edits that converts the buggy code to bug-free code, and a shorter path means fewer edits, we also measure the length of the shortest path.
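The three criteria can be written down as small helper functions. The input formats are assumptions for this sketch: a list of booleans (one per buggy file), a list of run times in which `None` marks an unsuccessful run, and a list of candidate paths. The numbers in the example are made up and are not results from the paper.

```python
def accuracy(fixed_flags):
    """Share of buggy files that were completely fixed."""
    return sum(fixed_flags) / len(fixed_flags)

def average_time(run_times):
    """Mean run time over successful runs only (None marks a failed run)."""
    successful = [t for t in run_times if t is not None]
    return sum(successful) / len(successful) if successful else None

def shortest_path_length(candidate_paths):
    """Number of edits in the shortest candidate path, or None if there is none."""
    return min((len(p) for p in candidate_paths), default=None)

# Made-up illustration values.
example_accuracy = accuracy([True, True, False, True])
example_time = average_time([1.0, None, 3.0])
example_length = shortest_path_length([["delete", "delete"], ["delete"]])
```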

We implemented the proposed method in this research with Java programming language and tested it on the dataset of JavaScript codes. In the evaluation of this method, the JavaScript codes provided in Dinella et al. (2020), (https://github.com/AI-nstein/hoppity) were used, which include the buggy and corresponding bug-free codes. Also, we have used the GROOVE tool as an environment for implementing and evaluating the proposed technique. In this section, we have evaluated the effectiveness of the proposed technique for bug fixing and finding the best sequence of changes with EF t reachability property where t is a bug-free code. For this purpose, BOA algorithm presented in Pira et al. (2019), GA algorithm and PSO algorithm have been used. We have changed the similarity function which is the fitness function in these algorithms and used the modified version. Also, in this article, we have focused on the delete rule. The values of the required parameters for the implementation of this algorithm are given in Table 1. Selection rate and Replacement rate parameters have been obtained by experiment. The parameters related to GA and PSO are selected according to reference (Pira et al. 2019).

Table 1 Fixed parameters along with their values for the implementation of the technique


The selection rate is the proportion of the fittest chromosomes that are selected from the population. In GA, the selected chromosomes are randomly paired and recombined with crossover and mutation operations, whereas in BOA they are used to learn a Bayesian network (BN). The GA uses single-point crossover, and its mutation randomly replaces some genes of a chromosome with new random values. Only successful runs are considered. The term "Not found" appears in a time cell of the results table when an approach cannot detect any target state in a model, either because it reached the maximum number of iterations or because it consumed all available memory.
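The two GA operators named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a chromosome is assumed to be a fixed-length list of integers, each gene is assumed to index one rule application, and `n_rules` and `rate` are hypothetical parameter names.

```python
import random
from typing import List, Tuple


def single_point_crossover(
    a: List[int], b: List[int], rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Swap the tails of two equal-length parents at one random cut point."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("parents must have the same length, at least 2")
    cut = rng.randrange(1, len(a))  # cut lies strictly inside the chromosome
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(
    chromosome: List[int], n_rules: int, rate: float, rng: random.Random
) -> List[int]:
    """Replace each gene with a new random rule index with probability `rate`."""
    return [
        rng.randrange(n_rules) if rng.random() < rate else gene
        for gene in chromosome
    ]
```

Passing an explicit `random.Random` instance keeps runs reproducible, which matters when run times are averaged over repeated executions.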

Let the state property q be “the graph corresponding to the bug-free JavaScript code”. Table 2 shows the results of the BOA, genetic algorithm (GA), and PSO algorithms for verifying EF q on several samples of the dataset. Each reported time is the average over 30 runs. In most cases, the goal state and the shortest path were found. In the remaining cases, the target state was not found because of state space explosion, which occurred for graphs with a large number of nodes. All algorithms have acceptable run times, but BOA discovers the best path fastest, while GA is the slowest in most cases. Because an important goal of these techniques is to find shorter paths, the table also gives the length of the shortest path found by the BOA technique.
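To make the meaning of EF q concrete, the sketch below checks reachability by plain breadth-first search and returns the shortest witness path. It is deliberately not the meta-heuristic search evaluated here: it enumerates states exhaustively, which is exactly what becomes infeasible on large graphs. The state encoding (a frozenset of node labels) and the toy delete rule are assumptions made for illustration.

```python
from collections import deque
from typing import Callable, FrozenSet, Iterable, List, Optional, Tuple

State = FrozenSet[str]


def shortest_witness(
    initial: State,
    is_target: Callable[[State], bool],
    successors: Callable[[State], Iterable[Tuple[str, State]]],
) -> Optional[List[str]]:
    """Return the shortest rule sequence reaching a target state, or None."""
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if is_target(state):
            return path  # BFS visits states in order of path length
        for rule, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [rule]))
    return None


def delete_successors(state: State) -> Iterable[Tuple[str, State]]:
    """Toy delete rule: remove any single node from the state."""
    for node in sorted(state):
        yield f"delete({node})", state - {node}
```

With only this delete rule, a graph of n nodes already has up to 2^n reachable states, which gives a sense of why an exhaustive search runs out of memory and a guided search is used instead.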

Table 2 Comparing the results of implementing different techniques to verify the reachability property in some examples of JavaScript codes in the dataset


To investigate the effect of the replacement and selection rates on the BOA-based technique, we ran 3 problems with different values for these parameters. If the replacement rate is set too high (for example, 80% or higher), some good chromosomes are also replaced. If it is set too low (e.g., 20%), most of the bad chromosomes in the current population are retained in the next population. Similarly, if the selection rate is set too high (for example, 80%), some bad chromosomes are selected along with the good ones, and if it is set too low (for example, 20%), too few chromosomes are selected for BN learning. A value between 30% and 60% therefore appears appropriate for both parameters. Figure 10 shows the results of these tests on the first 3 problems from Table 2. The minimum average run time is obtained approximately when both the replacement rate and the selection rate are 30%.
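The roles of the two rates in one generation can be sketched as below. Two simplifications are assumed and should be read as such: the worst-ranked fraction of the population is the part that gets replaced, and the Bayesian network of BOA is swapped for independent per-position sampling (`sample_from_marginals`), which ignores dependencies between genes. All names are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Chromosome = List[int]


def sample_from_marginals(selected: Sequence[Chromosome], rng: random.Random) -> Chromosome:
    """Stand-in for BN sampling: draw each gene independently from the
    values seen at that position among the selected chromosomes."""
    return [rng.choice(column) for column in zip(*selected)]


def next_generation(
    population: Sequence[Chromosome],
    fitness: Callable[[Chromosome], float],
    selection_rate: float,
    replacement_rate: float,
    sample_offspring: Callable[[Sequence[Chromosome], random.Random], Chromosome],
    rng: random.Random,
) -> List[Chromosome]:
    """Keep the best chromosomes and replace the worst with sampled offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    n_selected = max(1, round(selection_rate * len(ranked)))
    n_replaced = round(replacement_rate * len(ranked))
    selected = ranked[:n_selected]  # used to build the model
    offspring = [sample_offspring(selected, rng) for _ in range(n_replaced)]
    survivors = ranked[: len(ranked) - n_replaced]
    return survivors + offspring
```

In this form the trade-off is visible directly: a large `replacement_rate` shrinks `survivors` until good chromosomes are dropped, and a small `selection_rate` leaves very few rows from which to estimate the model.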

Table 3 compares the proposed bug fixing method with the Hoppity (Dinella et al. 2020) and TFix (Berabi et al. 2021) approaches. We ran the proposed method on a system with 12 GB of RAM, a Core i7 CPU, and a GeForce 920MX GPU, so memory is limited relative to the large number of states created in the state space. Because of this limitation, 10,874 graphs were selected from the dataset described in the previous sections. As mentioned earlier, an undefined property indicates that a variable has not been assigned a value or has not been declared at all. The proposed method identifies this type of bug well because the considered features include the parent of each node, which can be identified by following the sequence of nodes. The results show that the proposed approach performs acceptably relative to the other two methods.

Table 3 Comparison of the proposed bug fixing method with Hoppity and TFix


Table 4 shows the accuracy of the proposed method versus the number of edits. For this comparison, 2000 buggy codes were selected from the dataset. The accuracy of the Hoppity method decreases significantly as the number of edits increases, because its search space becomes very large and searching the entire space degrades its performance. The proposed method is less affected because it explores fewer states in the state space and uses heuristic algorithms.
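An accuracy-per-edit-count breakdown of this kind can be computed with a simple grouping step. The sketch below assumes each sample is recorded as a pair (number of edits, whether it was fixed); the record layout and the function name are assumptions, and the values in any call are toy data rather than results from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_edit_count(results: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Map each edit count to the fraction of its samples that were fixed."""
    totals: Dict[int, int] = defaultdict(int)
    fixed_counts: Dict[int, int] = defaultdict(int)
    for n_edits, fixed in results:
        totals[n_edits] += 1
        fixed_counts[n_edits] += int(fixed)
    return {n: fixed_counts[n] / totals[n] for n in sorted(totals)}
```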

Table 4 Comparison of the proposed method with the Hoppity method (accuracy/ #edits)


For a simple runtime comparison of the approaches, we ran tests on the 10,874 samples and recorded the runtime. Training and testing the model in the Hoppity method takes approximately 8 h in total. TFix uses a text-to-text approach built on a powerful Transformer-based model, and its runtime is about 3.5 h. Our approach searches the state space intelligently and explores fewer states, so it takes less time to find the edits. Fig. 11 presents the runtimes on a logarithmic scale.

5 Conclusion and future works

This article presented a new method that uses graphs, a graph transformation system, and model checking to fix bugs automatically and to find the shortest sequence of changes. The problem of finding a fix can be represented as a GSP, in which a path (also called a witness) is sought that starts from an initial node and leads to a target node where the reachability property (target state = bug-free graph) is satisfied. As shown in this article, meta-heuristic algorithms are very effective in discovering this shortest sequence of changes. The proposed method suffers from state space explosion on graphs with a large number of nodes, and our system is limited in the size of the graphs it can handle. In the future, the algorithm can be extended with other rules, such as inserting a node and changing the value of a node.

References

Allamanis M, Brockschmidt M, Khademi M (2018) Learning to represent programs with graphs. In: International conference on learning representations (ICLR)
Baier C, Katoen J-P (2008) Principles of Model Checking. MIT Press, New York
Berabi B, He J, Raychev V, Vechev M (2021) TFix: learning to fix coding errors with a text-to-text transformer. In: International conference on machine learning
Chen Z, Kommrusch S, Tufano M, Pouchet L-N, Poshyvanyk D, Monperrus M (2019) SequenceR: sequence-to-sequence learning for end-to-end program repair. IEEE Trans Softw Eng 47(9):1943–1959
de Lara J, Vangheluwe H (2002) AToM3: a tool for multi-formalism and meta-modelling. In: Fundamental approaches to software engineering (FASE)
Dinella E, Dai H, Li Z, Naik M, Song L, Wang K (2020) Hoppity: learning graph transformations to detect and fix bugs in programs. In: International conference on learning representations (ICLR)
Esfandyari S, Rafe V (2018) A tuned version of genetic algorithm for efficient test suite generation in interactive t-way testing strategy. Inf Softw Technol 94:165–185
Esfandyari S, Rafe V (2020) Extracting combinatorial test parameters and their values using model checking and evolutionary algorithms. Appl Soft Comput 91:106219
Esfandyari S, Rafe V (2021) GALP: a hybrid artificial intelligence algorithm for generating covering array. Soft Comput 25(11):7673–7689
Fischer B, Saabas A, Uustalu T (2009) Program repair as sound optimization of broken programs. In: 2009 third IEEE international symposium on theoretical aspects of software engineering
Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: Proceedings of the 11th annual conference on genetic and evolutionary computation
https://github.com/AI-nstein/hoppity
Kalvala S, Warburton R (2011) A formal approach to fixing bugs. In: Formal methods, foundations and applications
Kastenberg H, Rensink A (2006) Model checking dynamic states in GROOVE. In: International SPIN workshop on model checking of software
Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of ICNN'95—international conference on neural networks
Le Goues C, Nguyen T, Forrest S, Weimer W (2012) GenProg: a generic method for automatic software repair. IEEE Trans Softw Eng 38(1):54–72
Liva G, Taimoor Khan M, Pinzger M, Spegni F, Spalazzi L (2019) Automatic repair of timestamp comparisons. IEEE Trans Softw Eng 47(11):2369–2381
Mahajan S, Alameer A, McMinn P, Halfond WGJ (2018) Automated repair of internationalization presentation failures in web pages using style similarity clustering and search-based techniques. In: International conference on software testing, verification and validation
Meyers RA (2001) Encyclopedia of physical science and technology, 3rd edn. Academic Press, New York
Nilizadeh A, Leavens GT, Le X-BD, Păsăreanu CS, Cok DR (2021) Exploring true test overfitting in dynamic automated program repair using formal methods. In: 2021 14th IEEE conference on software testing, verification and validation (ICST)
Oliveira VPL, Souza E, Le Goues C, Camilo-Junior C (2016) Improved crossover operators for genetic programming for program repair. In: International symposium on search based software engineering
Pira E (2021) Using knowledge discovery to propose a two-phase model checking for safety analysis of graph transformations. Softw Qual J 30(1):37–64
Pira E, Rafe V, Nikanjam A (2017) Deadlock detection in complex software systems specified through graph transformation using Bayesian optimization algorithm. J Syst Softw 131:181–200
Pira E, Rafe V, Nikanjam A (2018) Searching for violation of safety and liveness properties using knowledge discovery in complex systems specified through graph transformations. Inf Softw Technol 97:110–134
Pira E, Rafe V, Nikanjam A (2019) Using evolutionary algorithms for reachability analysis of complex software systems specified through graph transformation. Reliab Eng Syst Saf 191:106577
Rozenberg G (1997) Handbook of Graph Grammars and Computing by Graph Transformation. World Scientific, Singapore
Taentzer G (2003) AGG: a graph transformation environment for modeling and validation of software. In: International workshop on applications of graph transformations with industrial relevance
Trujillo L, Villanueva OM, Eduardo Hernandez D (2021) A novel approach for search-based program repair. IEEE Softw 38(4):36–42
Tufano M, Watson C, Bavota G, Di Penta M, White M, Poshyvanyk D (2018) An empirical investigation into learning bug-fixing patches in the wild via neural machine translation. In: Proceedings of the 33rd ACM/IEEE international conference on automated software engineering
Varró D, Balogh A (2007) The model transformation language of the VIATRA2 framework. Sci Comput Programm 68:214–234
Weimer W, Forrest S, Le Goues C, Nguyen T (2010) Automatic program repair with evolutionary computation. Commun ACM 53(5):109–116
Yousofvand L, Soleimani S, Rafe V (2023) Automatic bug localization using a combination of deep learning and model transformation through node classification. Softw Qual J
Yuan W, Zhang Q, He T, Fang C, Viet Hung NQ, Hao X, Yin H (2022) CIRCLE: continual repair across programming languages. In: Proceedings of the 31st ACM SIGSOFT international symposium on software testing and analysis


Author information

Authors and Affiliations

Department of Computer Engineering, Faculty of Engineering, Lorestan University, Khorramabad, Iran
Leila Yousofvand
Department of Computer Engineering, Faculty of Engineering, Arak University, Arak, 38156-8-8349, Iran
Leila Yousofvand, Seyfollah Soleimani & Vahid Rafe
Department of Computing, Goldsmiths University of London, London, UK
Vahid Rafe
Department of Computer Engineering, Malayer University, Malayer, Iran
Sajad Esfandyari


Contributions

The initial idea was from VR. The design, provision of resources, and data collection were performed by LY. Data analysis was done by LY, SS, and VR. The manuscript was written and revised by LY, SS, and SE.

Corresponding author

Correspondence to Seyfollah Soleimani.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1

The following steps generate a fix for a buggy graph (01-10-2019_03_570_ SHIFT_1index_buggy.json.gps in Table 2) in GROOVE. The buggy graph is defined as the host graph (Fig. 12), and the delete rules are defined as rules (Figs. 13, 14, 15, 16, 17). The state space is then created by applying the rules to the host graph (Fig. 18). The bug-free graph is defined as the reachability property. This graph is obtained by searching the training dataset for the buggy graph most similar to the input buggy graph, which is done by comparing the graphs; the bug-free counterpart of that most similar buggy graph is taken as the target state. With the generated state space and the reachability property (Fig. 19), the best path from the initial state to the target state is found using the Bayesian algorithm (Fig. 20).
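The target-selection step (find the most similar buggy graph in the training set, then take its bug-free counterpart) can be sketched as a nearest-neighbour lookup. The appendix does not specify how graphs are compared, so two assumptions are made here purely for illustration: a graph is encoded as a set of (parent type, child type) edges, and similarity is the Jaccard index of those sets. The function names are hypothetical.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]
Graph = Set[Edge]


def jaccard(a: Graph, b: Graph) -> float:
    """Jaccard similarity of two edge sets; two empty graphs count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pick_target(buggy: Graph, training_pairs: List[Tuple[Graph, Graph]]) -> Graph:
    """Return the bug-free graph paired with the most similar buggy training graph."""
    if not training_pairs:
        raise ValueError("training set is empty")
    _, best_fixed = max(training_pairs, key=lambda pair: jaccard(buggy, pair[0]))
    return best_fixed
```

The returned graph then plays the role of the target state in the reachability property; any other graph similarity measure could be substituted for `jaccard` without changing the lookup.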

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Yousofvand, L., Soleimani, S., Rafe, V. et al. Automatic program bug fixing by focusing on finding the shortest sequence of changes. Artif Intell Rev 57, 39 (2024). https://doi.org/10.1007/s10462-023-10686-y


Published: 07 February 2024
DOI: https://doi.org/10.1007/s10462-023-10686-y
