Evolutionary neighborhood discovery algorithm for agricultural routing planning in multiple fields

In recent years, operations research in agriculture has improved the harvested yield, reduced the cost and time required for field operations, and maintained economic and environmental sustainability. The heuristics method, named Evolutionary neighborhood discovery algorithm (ENDA), is applied to minimize the inter-field and intra-field distance of the routing planning of machines in multiple agricultural fields. The problem is an extended version of the Agricultural Routing Planning (ARP) that takes into consideration the different capacity of the machines and multiple agricultural fields. This research also describes the mathematical model to represent the proposed problem formulated as an integer program. The experimental results show that ENDA successfully solves ARP instances, giving the best results and the fastest running time compared to those obtained by Genetic Algorithms and Tabu Search. The results also show that ENDA can save an average of 11.72% of the distance traveled by the machines outside the working path (when making maneuvers, going to or from the entrances and going from and returning to the Depot).


Introduction
Operations research in the domain of agriculture has become increasingly important since it can optimize the whole supply chain process and facilitate decision making (Zhang et al., 1990). Instances in this area include, among others, optimization of forest harvest scheduling (Neto et al., 2017;St. John and Tóth, 2013;Alonso-Ayuso et al., 2011), sugarcane harvest planning (Florentino et al., 2018;Sethanan and Neungmatcha, 2016), and crop cultivation (Aliano Filho et al., 2019). The domain also includes the management of farm and machinery as well as the routing of the machine(s) inside the field (Plà et al., 2014).  (Utamima et al., 2018) Agricultural Routing Planning (ARP) in farm management is intended to design or schedule the movements of machines inside agricultural fields. A good design can minimize the distance of the machine's tours, thereby leading to cost savings. Hence, it is essential to have an optimized plan for the routing of the machines being used for the agricultural tasks in the field .
The focus of this research is on the routing optimization of machines inside multiple agricultural fields. Figure 1 illustrates an agricultural field while Fig. 2 shows the possible graph representation (right) of an ARP problem (left). The field has several established tracks with symmetrically-planted crops. These tracks can be traversed by both agricultural machine(s) and harvesters. The headland area to the North and South of the field is the nonworking area (crop-free) where machines perform maneuvers to go to the next track. The farmer needs to determine which sequence of tracks will cover the shortest distance, thereby reducing the overall cost of harvesting.
To date, most research on ARP has considered only single-field configurations Conesa-Muñoz et al., 2016;Seyyedhasani and Dvorak, 2018a). Most of the ARP research also utilized a single machine without a capacity constraint (Hameed et al., 2011;Edwards et al., 2017). Therefore, ARP with multiple fields and capacitated machines is still a potential research area. ARP belongs to the class of NP-complete problems; therefore, an exact optimization is too time-consuming and complex to be applied, and a heuristic algorithm is needed (Marinakis et al., 2017). Hence, this research develops the evolutionary neighborhood discovery algorithm (ENDA) to deal with ARP.
Most of the current research focuses on real-world scenarios and utilizes general algorithms rather than addressing unique ARP configurations and algorithm improvements. This study is a continuation of our recent work (Utamima et al., 2019) by extending the ARP problem. This study makes three significant contributions to the literature. First, the mathematical model for the extended ARP is formulated, taking into account multiple fields, barriers, and multiple machines with heterogeneous capacities. Second, seven problem sets for the extended ARP are generated and presented. Third, a hybrid heuristic and a set of This section is followed by a review of the literature pertaining to ARP and the algorithms that have been proposed to solve related problems (Sect. 2). The definition of the problem, the mathematical model, and the proposed algorithm, ENDA, are given in Sect. 3. Section 4 provides the experimental results, including descriptions of the problem set, the parameter settings, numerical results, and the analysis. Finally, Sect. 5 offers suggestions for future research in ARP, and concludes the paper.

Agricultural routing planning
The ARP problem involves minimizing the distance traveled by machines when performing field operations inside an agricultural field (Utamima et al., 2018). This problem has been altered and extended in terms of the targets [e.g., improvement of time (Seyyedhasani and Dvorak, 2018b), minimization of the headland distance (Bochtis and Vougioukas, 2008)], specific field operations [e.g. herbicide application (Conesa-Muñoz et al., 2016), potato cultivation , or orchard operation ], and limitations [e.g., restricted machine limit (Bakhtiari et al., 2013) and obstacles (Zhou et al., 2014)].
The optimization of the non-working distance, first introduced by (Bochtis and Vougioukas, 2008), is intended to minimize the distance in the headland area. The machine is not performing an agricultural operation while making the turning maneuvers in the headland area. Jensen et al. (2015) extend the problem by using capacity constraints. The model was adapted by Bochtis et al. (2015) for orchard operations, Conesa-Muñoz et al. (2016) for a weed-killing problem and by Seyyedhasani and Dvorak (2018a) for reducing the field work time. Barrientos et al. (2011) and Valente et al. (2013) also applied the same concept to aerial coverage planning. Utamima et al. (2019) gathered several published ARP datasets from previous publications and presented a mathematical model to generalize the problem. All the datasets are singlefield configuration. The collected datasets can be used as validation data for a new method when solving a new ARP problem. The work also conducts a comparative study of several metaheuristic algorithms in order to deal with the collected dataset.
Many of the recent ARP studies focus on a single field. Seyyedhasani and Dvorak (2017), Conesa-Muñoz et al. (2016), Zhou et al. (2014), Bakhtiari et al. (2013), Valente et al. (2013) and Hameed et al. (2011) consider both rectangular and non-convex fields. The rectangular field is the primary type that most researchers use in ARP. Whereas, the non-convex field is closer to actual cases and most farm situations (Zhou et al., 2014).
Currently, research on multiple crop fields is found in sugarcane harvesting (Kittilertpaisan and Pathumnakul, 2017;Sethanan and Neungmatcha, 2016). Sethanan and Neungmatcha (2016) focused on sugarcane field operations that minimized the traveling distance in multiple fields, while Kittilertpaisan and Pathumnakul (2017) focused on improving the planning of the cultivation of a new crop. However, neither study considered the tracks inside the fields and the maneuver of the machines. Hence, this study considers the tracks inside the fields and the machines' maneuvers to better address the harvesting problem.

Heuristic algorithms for agricultural routing planning
Previous studies on ARP focused mostly on real cases and solved the related problems with several different algorithms. The most frequently used algorithm is the Genetic Algorithm (GA). GA is an optimisation method based on the concept of genetics and natural selection. GA uses a population formed of individuals to evolve to a state that maximises fitness under the defined selection procedure (Haupt and Haupt, 2004;Karatas et al., 2021). In general, GA simulates a natural evolution process that generates individuals by selecting the best candidates from the current generation to be applied crossovers and mutations.
GA has been adapted for machine routing to decrease the total distance traveled in biomass transportation (Gracia et al., 2014). GA was also used for track sequence optimization in a field, and its solutions have proven to be more efficient compared to the conventional fieldwork patterns (Hameed et al., 2011). Recently, GA has been used in optimizing the scheduling of crop cultivation (Aliano Filho et al., 2019). GA has also been utilized to optimize the sugarcane harvesting plan in order to make the harvesting point as close as possible to the ideal (Florentino et al., 2018).
TS is an algorithm that traverses the solution space by moving iteratively from the current solution to a (best) solution in its neighborhood. TS improves the search by not re-evaluating points already visited in the search space. Such a point is considered 'tabu' to be rechecked (for several iterations).This method makes use of flexible memory structures with restrictions and aspiration criteria that exploit the search spaces (Glover and Laguna, 1997).
TS was applied to reduce the fieldwork time in agricultural operations (Seyyedhasani and Dvorak, 2018b). Dvorak (2017, 2018a) utilized TS to improve field efficiency and the dynamic directing of several machines in farming tasks. The results show that TS can optimize the field work time more effectively than do the Clarke-Wright heuristics (Seyyedhasani and Dvorak, 2017). The TS approach was also applied to solve the location problem of a forest harvesting machine (Weintraub, 2007). Sethanan and Neungmatcha (2016) used Particle Swarm Optimization (PSO) with a different structure for route planning in sugarcane fields, while Valente et al. (2013) employed Harmony Search to optimize coverage path planning in vineyard parcels. A hybrid Simu- To conclude, there are several research gaps, as indicated earlier. First, there is very limited research on ARP involving multiple fields, and most of the ARP research utilized a single machine without capacity constraints. Hence, this study is the first to address several ARP constraints simultaneously. Second, the results of experiments conducted in this study show that the solutions given for several datasets in previous research can still be improved. Therefore, an improved, more efficient algorithm needs to be developed to enhance the quality and management of the solution. Figure 3 illustrates the problem with three fields. The blue lines indicate the tracks while the red lines are the field border. Each track is differentiated by a number. Field 1 has 18 tracks (West to East, 1 to 18), Field 2 has 22 tracks (West to East, 19 to 40), and Field 3 has 16 tracks (South to North, 41 to 56). The barriers in the problem are the field borders and the rocks, Fig. 4 Illustration of a machine's tour and its maneuvers which cannot be traversed by the machine. The proposed algorithm addresses the barriers by limiting the connection between different fields. Therefore, the machines can enter or exit a field or move from one field to another only from a specific location/entrance. In Fig. 3, the machines can enter the field only through the field's entrance (green square). The machine's tour starts from and ends at the Depot (red square). The goal is to discover the minimum length of the non-working distance in all tours for a set of machines to harvest all tracks. The non-working distance in this study refers to the distance traveled by the machines outside the tracks when making maneuvers, going to or from the entrances and going from and returning to the Depot. This research assumes that there are several available machines with different capacities to harvest several fields. The optimization will assign machines to the given routes, taking their capacities into consideration. If a machine reaches the maximum amount of crop it can hold, it needs to return to the Depot to unload the harvested crop. This study assumes a uniform amount of harvest per traveled distance. Hence, the machine capacity can be described as the maximum distance it can travel on tracks on one trip. Figure 4 depicts a machine's trip in two fields and its maneuvers. There are two adjacent fields in Fig. 4 where the field in the North can be entered only through the field in the South. The non-working distance is visualized with green lines. The numbers (near the green lines in Fig. 4) indicate the sequence of tracks that the machine uses. The non-working distances can be differentiated by the maneuvers (3 to 6 and 9) and movements between Depot and entrances or the tracks to the entrances (1, 2, 7, 8, 10, 11, 12 and 13).

Problem description
The machine starts from the Depot and traverses the south field's entrance first (1) before harvests several tracks in the southern field (2-6). The machine then goes to the northern field's entrance (7), harvests two tracks in the northern field (8-9), and returns to the entrance (10). Next, the machine harvests a track in the southern field (11) before going back to the Depot via the southern field's entrance (12). Figure 4 also depicts the four kinds of maneuvers that are considered in this study. The assumption in Fig. 4 is that the machine needs to skip at least one track to make a flat turn. Hence, if a machine's next track is directly besides the current track, then the bulb or bulb turns will be made. In Fig. 4, the maneuvers are flat (4, 6), bulb (9), flat (3) and bulb (5). The flat and bulb turns are made when the current and next track are aligned with an acute angle θ(θ < 90 • ), while the flat and bulb turns are performed when the current and the next track are aligned with an acute angle θ(θ = 90 • ) and have the same Y-axis coordinate in the maneuver's area. The conditions of these maneuvers are listed in Constraint 1 in Sect. 3.2 and adapted from Utamima et al. (2019) and Jin and Tang (2010).

Mathematical model
The mathematical model used to solve the extended ARP is formulated as an integer programming model. The group of tracks in the fields are assumed to be a set of vertices and edges in Graph G. Table 1 describes the parameters used in the mathematical model. The decision variables are listed in Table 2.
Equations 1-5, adapted from Jin and Tang (2010) and Utamima et al. (2019), are used to compute the distance the machine travels in order to make a turning maneuver in the headland area. The description of each notations is given in Table 1.
The objective function is given in Eq. 6 which is to minimize the sum of all included edges in the solution, the distance of the tour between fields, the distance traveled by each machine to and from the Depot, and the distance between the entrance to the first and last track to be harvested. Meanwhile, Constraints 7-12 are the restrictions related to the tour between tracks inside a field. Constraint 7 and 8 specify that each route should start and end at the Depot (Vertex 0). Constraints 9 and 10 ensure that each node is visited only once.
The distance between two fields (from entrance field a The distance from field f to Depot The degree of tilt between track i and track j and driving direction ψ, note that 0 The flat type of maneuver turn executed by a machine in the headland area The bulb type of maneuver turn executed by a machine in the headland area  Constraint 11 ensures that when a machine enters a vertex, it will also leave that vertex. Constraint 12 is a sub-tour limitation that eliminates any disjoint sub-tours from a solution. Then, Constraint 13 ensures that the amount of harvested crop does not exceed the capacity of a machine. The distance traveled inside the track has the same dimension as the capacity of a machine. The assumption is that when a machine traverses a track, it completely harvests that track, so the maximum distance inside the track will depend on the maximum capacity of a machine. Finally, Constraints 14 and 15 ensure that each field is visited only once.
subject to: k∈K j∈V k∈K i∈V i∈S j∈S The total distance traveled for harvesting the crop fields is found with Eq. 16 which is the sum of the distance between the fields, the length of each track in each field, and the objective function.

Proposed algorithm
This study builds a new hybrid algorithm, named ENDA, to solve ARP. Algorithm 1 lists the ENDA pseudocode. ENDA has two main stages: the exploration stage that searches the solution among the individuals in a population, and the exploitation stage that searches for the best solution found so far. ENDA is a hybrid of three kinds of heuristics, namely distribution search, inter-neighborhood discovery, and intra-neighborhood discovery. The exploration stage consists of distribution search and inter-neighborhood discovery, while the exploitation stage contains the intra-neighborhood discovery procedure. The individuals are the candidate solution to the problem; the population refers to the group of individuals in the algorithm. The representation of an individual with 8 tracks is illustrated in Fig. 5.

Algorithm 1
The pseudocode of ENDA 1: procedure ENDA() 2: Initialization_Read_Data() 3: for g=1 to max_generation do 4: OF ← Calculate_ObjectiveFunction() 5: if in DistributionSearch_rate then 6: Selected_Individuals ← Choose_BetterObjectives(OF) 7: NewPopulation ← DistributionSearch(Selected_Individuals) 8: else 9: for i=1 to discovery_iteration do 10: S ← RouletteWheel() 11: [P] ← RandSelect_two_points() 12: c ← Random (3)   First, the parameters of the ENDA are initialized. The parameter settings are explained in more detail in Sect. 4.3. The details of the machines (number of machines, their various capacities, and turning radius) are also initialized at this point. Then, the algorithm reads the field data that contains the track coordinates, the distance between each entrance and the Depot, and the length of each track. The iterations start with the computation of the objective function of every individual. The exploration stage of ENDA will perform the distribution search according to either the rate or the neighborhood discovery.
The distribution search (Algorithm 1, Line 5-7) will build the new population according to the probability of the order of tracks for the selected individuals. The distribution search will run if the DistributionSearch_rate (the probability rate to execute the procedure) is met. The selected individuals are the group of individuals with a better objective function (H is set of selected individuals) that are labeled h 1 , h 2 , h 3 , . . . , h n (h ∈ H ). Equation 17 calculates the sum of every track being placed in an order. The graphical illustration of the calculation of is shown in Fig. 6. In Eq. 17, h tp is set to 1 if track t is visited at order p, otherwise, it is set to 0 (t ∈ T , p ∈ P ; P is set of tracks' order). The size of T and P is equal to the number of tracks in the fields. The distribution search function will update each individual in the population by selecting the order of track proportionally according to .
Next, the inter-neighborhood discovery procedure is listed in Line 9-20 of Algorithm 1. The procedure is adapted from Hansen et al. (2010) and Utamima et al. (2019). The procedure runs for discovery_iteration times. At the beginning of every iteration, the roulette wheel selection is performed, and an individual is selected (S). The roulette wheel selection expects that the higher the fitness of an individual implies the greater is the chance of its survival. The fitness function that is used in the Roulette Wheel selection is listed in Eq. 18 which is the inverse of the objective function (mentioned in Eq. 1). The chance of selection in this approach is proportional to the fitness of an individual.  After selection, the algorithm chooses two points randomly from S and performs between flip, guided swap, or inversion operators. The details of the operators are listed in Table 3. Next, the new population is updated from the results of the inter-neighborhood discovery.
The objective function calculation evaluates the fitness of the new population and finds the current best solution in the population (Algorithm 1, Line 23). This solution is checked to determine whether it can replace the global best solution (the best solution found so far among generations). If the current best solution is not better than the global best solution (globalBest), then the second stage of ENDA is implemented (Algorithm 1, Line 28-33). In this stage, the intra-neighborhood discovery searches inside the globalBest to find a better solution. The procedure is adapted from the local search in Song et al. (2019). In this stage, the interchange and reverse operators are employed until no further improvement can be made. The operators are described in Table 3.
If the solution of the intra-neighborhood discovery(ndSol) is better than globalBest, then ndSol will become the globalBest. Next, the Elitism procedure is run to retain ten percent of the best individuals through generations in order to protect prospective solutions (Nouri and Ladhari, 2018). The recorded individuals are copied to the new population and are sent to the next generation.

Genetic algorithm and Tabu search
This study also re-coded GA and TS to solve the extended ARP problem set. GA and TS are chosen because both algorithms have been implemented recently in optimization in the agriculture domain (Florentino et al., 2018;Seyyedhasani and Dvorak, 2018a). The input and the fitness calculation are configured to meet the standard procedure of the algorithms. The number of generation and the size of population in GA and TS are set to be the same as those in ENDA.
Generally, GA starts from the generation of a new population and continues with the selection of chromosomes with a better fitness value. The higher the fit of a chromosome, the higher its chances to be selected. Crossover and mutation are applied to the chosen chromosomes with a given probability rate. The crossover process cuts two chromosomes at one point, and the halves are spliced together to create new chromosomes. Meanwhile, the mutation process changes the value of genes in a small group of chromosomes. After crossover and mutation, the fitness evaluation detects the best solution, which is recorded. The whole process is repeated until a stopping criterion (i.e. maximum iteration) is met. The process of evolution in a population of chromosomes over multiple generations represents a search for a good and feasible solution.
Meanwhile, TS constructs several moves (candidate solution) in the neighborhood. The representation of the neighborhood structure in S is like the one provided in Table 5. A move in TS is using Swap techniques between two points (randomly chosen) in the neighborhood.
Next, TS picks a move and checks whether the move is tabu (the move is tabu if it is listed in the tabu list). If the move is not tabu, TS accepts the move. Otherwise, TS checks the aspiration criteria of the move. If a tabu move meets the aspiration criteria, the move will be accepted, otherwise, the move will be refused. If the new solution from an accepted move is better than the current solution, the solution will be updated. The decision process will be repeated for a number of iterations (Ji and Tang, 2004).

Validation of the mathematical model
In order to validate the mathematical model, it was implemented using General Algebraic Modeling System (GAMS) software. Two small problems are tested, and consist of multiple fields and multiple machines with different capacities. Each problem involves two fields with tracks measuring 4 meters in width, and two machines available with a turning radius of 3 meters. Table 4 shows a comparison of the results of GAMS and ENDA (coded with MATLAB). Both GAMS and ENDA can obtain the same solution for the two problems as well as the same tours of machine 1 and machine 2.
The experiments show that the running time of ENDA is six times faster than that of GAMS. If the size of the problem is increased, the running time is expected to grow exponentially for the GAMS implementation. Therefore, the utilization of ENDA is required for both small and large problems because it can achieve a good solution with a faster running time.
Based on previous research by Bochtis and Sørensen (2009) and Utamima et al. (2018), ARP is similar to Vehicle Routing Problem (VRP) in the agricultural field. VRP itself is an NP-Complete problem (Toth and Vigo, 2002), hence ARP can also be considered as NP-complete. Therefore, finding exact solutions as done by GAMS is restricted to smaller instances as the running time is growing exponentially. Metaheuristics and evolutionary algorithms like ENDA are used on larger instances to achieve near-optimal solutions with short running times.

Problem sets description
The experiments conducted in this study are comprised of two steps. The first step involves the validation of the proposed algorithm by using the data in the literature. The datasets are gathered from published studies and by contacting the corresponding authors. Table 5 shows a description of the data. In Table 5, the first and second columns are the instance code and the total number of tracks in that instance, respectively. The track widths and lengths (meters), turning radii (meters), shapes, and machines used are listed in Column 3 to 6 in Table 5. The last column in Table 5 contains the references to the problem instances, and the details are listed immediately below the table.
In the second step, experiments are run using the extended ARP datasets. Table 6 gives the attributes of the datasets that are used in this study. There are seven datasets with multiple fields and different shapes. The problems are generated and vary in terms of the total number of tracks, their width (meters), and turning radius (meters). The number of fields ranges from two to five, with total tracks ranging from 30 to 112. The machines used (Table 7) have different capacities (ranging from 700 to 3000 m). As indicated earlier, this study assumes a uniform measure of harvest per traveled distance. Hence, the machine capacity is depicted with the maximum distance it can travel on one trip. Figure 7 illustrates the seven problem instances: 2A(a), 2B(b), 3A(c), 3B(d), 4A(e), 4B(f), and 5A(g). The number in the middle of each field is the field number, and the small number near the track is the track number. These numbers are used to give a better understanding of the output of the program that is shown later in Table 12.

Parameters settings
The parameter set-up is acquired by directing a two-level factorial plan with four elements. Each factor joins high and low dimensions (Montgomery, 2013). The algorithm is run ten times to make up for the non-deterministic nature of the heuristics (Guan and Lin, 2016). Table 8 lists factors and the levels of each set-up. To adjust to the extent of the problem, the settings use n, which is the number of tracks in the fields. A larger problem requires a bigger   population and more generations. The value shown in bold indicates the better setting for ENDA. The selected settings for the number of generations and the size of the population are also applied to GA and TS.

Conclusion
It is anticipated that this research will improve the planning of conventional machinery systems by producing better routing designs. The reduction of resources such as machines, fuel, and personnel can result in a competitive advantage and access to markets in a lower price sector. Subsequently, this contributes to addressing the issue of sustainability in agriculture. The scope of this research comprises ARP for a harvesting problem with the tracks inside several fields on a flat surface. This study also assumes that the machine capacity is the maximum distance covered on tracks during the trip. This study presents the extended ARP by considering multiple constraints simultaneously in order to better generalize the harvesting problem. This research also presents seven problem sets for the extended ARP. Furthermore, it recommends the application of a new hybrid algorithm (ENDA) that outperforms GA and TS. The experimental results demonstrate that during the validation process with the previous ARP dataset, ENDA can achieve the same best solution in four instances, and produces a better solution for the other two instances. Moreover, ENDA successfully achieves the best objective function by saving an average of 11.72% of non-working distance compared to other algorithms in the seven problem sets of the extended ARP. ENDA also achieves the fastest running time for all the given problems.
The validation of the mathematical model with GAMS software is also provided in this research. Future research can focus on more experiments regarding the capacities of machines with different field layouts.
article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.