A hybrid spectral clustering simulated annealing algorithm for the street patrol districting problem

Reasonable districting plays an important role in the patrolling process. In this paper, workload attributes are considered, and a mixed integer programming model is developed to solve the street patrol districting problem (SPDP). The improved spectral clustering algorithm named spectral clustering algorithm based on the road network (SCRn) and simulated annealing algorithm (SA) are combined. This results in a hybrid algorithm called SCRn-SA. The SCRn-SA algorithm is tested on small examples and real instances in Zhengzhou, China. The experimental results show that the proposed algorithm is effective for solving SPDP. It has better performance when compared to other advanced algorithms.


Introduction
The objective of districting is to divide the entire study area into a number of basic units. It can allow the relevant resources to be fully utilized [1]. Important criteria for districting are design balance, compactness and continuity of the areas [2][3][4]. The districting problem is applied to all aspects of politics, economics, the environment and public services [5,6]. The main types include patrol districting [7], political districting [8], commercial districting [9] and service districting [10].
The patrol districting problem is a common type of districting problem and can be seen as a geographic area design problem based on relevant rule criteria [11], applied to street, marine, service and other scenarios [12][13][14]. Patrols can be effectively and rationally designed according to attributes such as balancing the workload of patrol officers in each area and optimizing the response time of cases [11,15]. It can have a direct impact on the completion of patrol tasks and ensure that patrols are fair and that all officers are satisfied [16,17].
Corresponding author: Shan Zhao (zhaoshan4geo@zzu.edu.cn)
Police, security officers, city managers and others all contribute to maintaining the aesthetics, safety and order of the city by patrolling the streets in different ways to identify problems [18,19]. Street networks reflect human mobility patterns and are an important object of study in patrols [20]. Patrol studies conducted with the street as the basic unit allow precise, real-time analysis of the patrol process. Street patrol districting is a rational division of the patrol area along the urban road network according to defined criteria, which allows for the optimal arrangement of patrol personnel and increases the efficiency of patrols.
Compared to the general districting problem, the street patrol districting problem has some special characteristics, mainly consisting of these two aspects. First, the problem is based on the road network as the study area and is effectively solved by constructing a road network model. Second, the patrol area needs to be finely delineated to generate balanced, efficient and compact patrol sections. Therefore, in this paper, a hybrid algorithm is proposed based on a spectral clustering algorithm and simulated annealing algorithm. Specifically, an improved spectral clustering algorithm is used to generate the initial district of the areas, followed by a simulated annealing algorithm to generate the optimal districting results. The main contributions of the paper are summarized below.
• We define a street patrol districting problem (SPDP) and use a mixed integer programming (MIP) model to represent it. The model uses workload attributes as important criteria to produce a balanced and efficient districting design in the street network.
• We develop a hybrid spectral clustering-simulated annealing algorithm (SCRn-SA) that can generate optimal districting results. This algorithm improves on the spectral clustering algorithm and combines it with the simulated annealing algorithm to make it more suitable for solving SPDP.
• Two different road network scenarios are selected for testing, and the experimental results indicate that the proposed algorithm is more advantageous than the other advanced algorithms in solving SPDP.
The rest of this paper is organized as follows. The "Literature review" section provides a review of related work. The "Description and modelling" section presents a mixed integer programming model for SPDP. The proposed algorithm (SCRn-SA) is described in the "Proposed algorithm" section. The numerical experiments and analysis are presented in the "Numerical experiments" section. The final section draws conclusions and discusses future research directions.

Literature review
In this section, the main types of patrol districting problems are reviewed, and then solutions to the problem are presented.
The patrol districting problem arises in many settings. It mainly consists of police patrol districting, emergency vehicle patrol districting and maritime patrol districting problems [13,14,21,22]. These problems treat the urban road network or census streets as the study area and pursue patrol objectives such as maximum coverage and workload standards [23][24][25]. It makes sense to use the urban road network as the area to patrol, but there is currently little research into the problem based on the road network. Thus motivated, the research in this paper focuses on the patrol districting problem based on urban road networks.
In recent years, many researchers have developed different methods to solve the patrol area delineation problem. Two main solution methods are currently used [5]. The first approach is to use a mathematical model. Kevin et al. [26] used a maximum coverage model for patrol area districting. Chen et al. [27] proposed a patrol area classification model with security levels to guarantee public safety effectively. Nikolas et al. [14] developed a patrol coverage model that takes into account the response time of cases. Zhang et al. [28] established a discrete-event simulation model to evaluate patrol districting planning.
The second approach relies on heuristic algorithms. A number of extensive metaheuristics have been applied to the regional districting problem, namely, simulated annealing algorithms [6], search algorithms [29], genetic algorithms [30], etc. Camacho et al. [31] developed a local search algorithm to solve the police districting problem, and the case study illustrates the advantages of this algorithm. Steven et al. [32] used a simulated annealing algorithm to achieve a suitable assignment of patrol areas. A study of real data from the Buffalo Police Department shows that this algorithm has good robustness. Fernando et al. [33] demonstrated that genetic algorithms can provide better results in regional districting problems. Gonzalez-Ramirez et al. [34] used a hybrid metaheuristic algorithm combining the GRASP algorithm with a tabu search strategy, and this algorithm was experimentally shown to be effective in dealing with the problem. In [35], Thilina et al. proposed a heuristic-based clustering algorithm to achieve workload balancing and minimize the average response time.
The SA algorithm is a popular metaheuristic algorithm that can escape local optima in combinatorial optimization problems [36,37]. The method of spectral clustering originates from graph partitioning and is widely used in practical problems [38]. Compared to traditional clustering methods such as the k-means algorithm, spectral clustering algorithms are simple to implement and often yield better results [39,40]. Therefore, in this paper, the simulated annealing algorithm and the spectral clustering algorithm are combined to form a hybrid algorithm for the patrol districting problem based on urban road networks. The proposed algorithm starts from the initial solution generated by the spectral clustering algorithm, which is further optimized by the simulated annealing algorithm to obtain the optimal solution. Starting from the spectral clustering solution is intended to increase the probability that the simulated annealing algorithm reaches the optimum and to reduce the computation time.

Description and modelling
In this section, the street patrol districting problem (SPDP) is defined. Then, the problem is mathematically modelled, and the exact formulation of the model is presented. The objective function and constraints of the model are subsequently defined.

Problem description
Patrols are used to detect problems in the urban road network. This paper focuses on the question of setting patrol areas for patrol officers. The objective is to give patrol officers a reasonable districting of street areas. Based on the specific problem, the following assumptions are usually made.
• The patrol districts together cover the entire patrolled street network. A district must not overlap with another district or be empty.
• A patrol officer cannot be assigned to two or more patrol districts.
• A patrol district contains connected patrol units and is made up of at least one patrol unit.
• The goal of the districting problem is to give each patrol officer a reasonable allocation of street areas to patrol. The allocation takes into account road conditions, regional risks and distances to form continuous, compact, balanced and efficient patrol areas [41,42].

Problem modelling
The problem is defined on an undirected graph $G = (V, E)$, where $V = \{1, 2, \ldots, n\}$ is the node set of the street links and $E$ is the set of arcs between the street links. The workload attributes of patrols are described according to the basis of patrol deployment and case occurrence [31]. They include four main factors: the risk factor for the district, the total length of the road, the road congestion factor, and the furthest distance of the road.
For the convenience of description, the parameters and decision variables in this paper are defined in Table 1.
The objective function combines weighted indicators of compact and continuous areas and of balanced and efficient patrolling. Based on the previous assumptions and the parameters and variables in Table 1, the mathematical model is given by Eqs. (1)-(12).
The two terms in the objective function (1) represent the average workload and the average workload deviation, weighted by $w_1$ and $w_2$, respectively. Equation (2) ensures that each unit $i$ is assigned to only one district $k$. Equation (3) guarantees that unit $i$ must be in district $k$ when a patrol unit $j$ exists in the neighbourhood of $i$. Constraints (4) and (5) define the average workload deviation as the maximum difference between a district's workload and the average workload. Equations (6), (7), (8) and (9) describe the workload attributes. Equation (6) defines the risk factor for the district; the predicted case frequency is used as an estimate, indicating the number of cases likely to occur in the area during the specified time period. Equation (7) represents the total length of road patrolled in the patrol district. Equation (8) defines the road congestion factor as the ratio of the time of congestion to the total time. Equation (9) defines the diameter distance of the patrol road, that is, the farthest distance in the patrol district. The diameter is the maximum distance between two points, so the diameter distance represents the maximum internal travel distance in a patrol district. Equation (10) defines the workload of a district as a linear combination of the four attributes, weighted by $\alpha$, $\beta$, $\gamma$ and $\delta$; the weights are set according to the actual situation. Equations (11) and (12) define the domains of the decision variables $x_{ik}$ and $y_{ijk}$.
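As a rough illustration of how the workload of Eq. (10) feeds the two-term objective (1), the following Python sketch evaluates a districting from per-district attribute values. It is a minimal sketch under stated assumptions, not the model itself: the function names, the attribute tuple layout and the default attribute weights are hypothetical, and the deviation term follows the prose description only (maximum absolute difference between a district's workload and the average workload).

```python
from typing import Sequence, Tuple

# Hypothetical layout: (risk factor, total road length,
# congestion factor, diameter distance) of one district.
Attributes = Tuple[float, float, float, float]


def district_workload(attrs: Attributes, alpha: float, beta: float,
                      gamma: float, delta: float) -> float:
    """Workload as a linear combination of the four attributes."""
    risk, length, congestion, diameter = attrs
    return alpha * risk + beta * length + gamma * congestion + delta * diameter


def objective(districts: Sequence[Attributes],
              w1: float = 0.5, w2: float = 0.5,
              weights: Tuple[float, float, float, float] = (0.25, 0.25, 0.25, 0.25),
              ) -> float:
    """Weighted sum of average workload and workload deviation.

    Assumption: the deviation is the largest absolute gap between any
    district's workload and the average workload.
    """
    workloads = [district_workload(a, *weights) for a in districts]
    average = sum(workloads) / len(workloads)
    deviation = max(abs(w - average) for w in workloads)
    return w1 * average + w2 * deviation


if __name__ == "__main__":
    # Two illustrative districts with made-up attribute values.
    print(objective([(4.0, 4.0, 4.0, 4.0), (8.0, 8.0, 8.0, 8.0)]))
```

In practice the attributes would be recomputed from the road segments assigned to each district, so this function would sit behind whatever routine aggregates segment data per district.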

Proposed algorithm
In this section, the proposed SCRn-SA algorithm is described in detail. First, the basic framework of the SCRn-SA algorithm is presented. Then, the specifics of the improved spectral clustering algorithm and simulated annealing algorithm used in the hybrid algorithm are explained in detail.

Framework of the SCRn-SA algorithm
To solve the street patrol districting problem (SPDP), a hybrid algorithm (called SCRn-SA) combines an improved spectral clustering algorithm with a simulated annealing algorithm. An initial districting is obtained using the improved spectral clustering algorithm. Then, based on this initial districting, a simulated annealing algorithm is used to optimize the objective values and ultimately obtain the global optimal solution. The pseudocode for the SCRn-SA algorithm is shown in Algorithm 1.
Table 1 (excerpt)
Parameters
The shortest distance between unit i and unit j
$w_1$, $w_2$: Weighting of the objective function
$\alpha$, $\beta$, $\gamma$, $\delta$: Weighting of the workload in the objective function
Decision variables
$y_{ijk}$: Binary variable which is 1 if both units i and j belong to district k

Spectral clustering algorithm
The spectral clustering algorithm was first proposed by Donath et al. in 1973 and evolved from graph theory [43,44]. The important element of the algorithm is to see all the data as points that can be connected with edges. The distance and the value of the edge weights between the two points correspond to each other. Clustering is achieved by partitioning the graph composed of all data points [38,39,45].
In the model of this paper, an improved spectral clustering algorithm based on a road network is proposed to solve the street patrol districting problem (SPDP). In the algorithm, road segments are treated as nodes, and the links between road segments are treated as edges between nodes. The lengths of the roads are used as weights. Compared to the traditional spectral clustering algorithm, the algorithm is better matched to the road network scenario and can provide a better initial solution.
This algorithm starts with a preprocessing operation that constructs the connectivity matrix of the road network and its Laplacian matrix representation. Then, a decomposition operation computes the smallest eigenvalues and the corresponding eigenvectors. Finally, the road segments are represented by this small number of eigenvectors and partitioned by clustering the reduced-dimensional vectors.
The unnormalized Laplacian matrix is constructed as shown in Eq. (13):
$$L = D - C \tag{13}$$
where $L$ represents the unnormalized Laplacian matrix, $C$ represents the connectivity matrix and $D$ represents the degree matrix.
The standardized Laplacian matrix, denoted LM, is given in Eq. (14). The distance between patrol district $x_j$ and each feature vector $f_i$ in the decomposition operation is given in Eq. (15). The eigenvectors of this matrix are used as points, and a clustering algorithm is applied to them. The partition markings after clustering are given in Eq. (16), and the new eigenvector is given in Eq. (17). The pseudocode for the improved spectral clustering algorithm proposed in this paper is shown in Algorithm 2.
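The steps described in this subsection (connectivity matrix, Laplacian, eigendecomposition, clustering of the reduced vectors) can be sketched with NumPy as follows. This is an illustrative sketch rather than Algorithm 2: the function name `spectral_partition` is hypothetical, the symmetric normalization $D^{-1/2} L D^{-1/2}$ and the row normalization of the eigenvectors are assumed choices, the entries of `C` are taken to be similarity weights (how road lengths map to these weights is left open), and a small deterministic k-means stands in for whichever clustering step the paper uses.

```python
import numpy as np


def spectral_partition(C: np.ndarray, k: int, n_iter: int = 50) -> np.ndarray:
    """Split the nodes of a weighted graph into k groups.

    C is a symmetric connectivity matrix with C[i, j] > 0 when road
    segments i and j are linked. Assumes every node has a positive degree.
    """
    degrees = C.sum(axis=1)
    L = np.diag(degrees) - C                      # unnormalized Laplacian L = D - C
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    # Assumed normalization: D^{-1/2} L D^{-1/2}
    LM = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order, so the first k
    # columns belong to the k smallest eigenvalues.
    _, eigvecs = np.linalg.eigh(LM)
    F = eigvecs[:, :k]
    F = F / np.linalg.norm(F, axis=1, keepdims=True)  # one unit-length row per node

    # Deterministic farthest-point initialization of the k centres.
    centers = F[[0]]
    for _ in range(1, k):
        dist = np.linalg.norm(F[:, None, :] - centers[None, :, :], axis=2).min(axis=1)
        centers = np.vstack([centers, F[np.argmax(dist)]])

    # Plain Lloyd iterations (k-means) on the reduced vectors.
    labels = np.zeros(len(F), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(F[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            F[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

For example, on two triangles of road segments joined by a single weak link, calling `spectral_partition(C, 2)` places each triangle in its own group. Note that clustering alone does not guarantee connected districts on every network, which is one reason a connectivity check is still needed in the later refinement stage.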

Simulated annealing algorithm
In 1953, Metropolis et al. [1] proposed an algorithm to simulate the evolution of heating in solid materials. Importance sampling methods were used to accept new states with probabilities called the Metropolis criterion rather than using fully deterministic rules. The Metropolis algorithm is the basis of the simulated annealing algorithm. However, direct use of the Metropolis algorithm may result in an optimization search that is so slow that it is impractical. To ensure convergence in a finite amount of time, parameters must be set to control the convergence of the algorithm. In 1983, Kirkpatrick et al. [46] were inspired by the physical annealing process and introduced the concept of annealing in combinatorial optimization. This is the earliest simulated annealing algorithm. Later simulated annealing algorithms build on it by using temperature and energy variations to jump out of local optima and reach the global optimal solution [46].
The proposed algorithm is an improved simulated annealing algorithm. It is similar to the traditional simulated annealing algorithm, but with the following main differences:
• The initial solution is randomly generated in the traditional simulated annealing algorithm. The initial solution in this paper is determined by the result of the spectral clustering algorithm introduced in the previous section. This can reduce computing time.
• Boundary segments between different patrol districts are selected when generating a new solution. A disturbance process randomly selects one of these road segments, removes it from its original district and assigns it to a neighbouring district. During the disturbance, the connectivity of the altered districts needs to be verified. If a district is split by the move, both changed districts need to be analysed, and the perturbation is completed only when it is justified.
• In the process of generating the new solution, length data from all sections of the road network are used to construct a length matrix, which is used to find the furthest distance between two road segments and to calculate the increment of the objective function.
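The boundary-move perturbation with its connectivity check can be sketched as below. This is a simplified illustration, not Algorithm 3: the names `is_connected`, `boundary_segments` and `perturb` are hypothetical, segments are assumed to be integer identifiers, and a move that would empty or split the source district is simply rejected, whereas the paper describes analysing both changed districts.

```python
import random
from collections import deque
from typing import Dict, List, Optional, Set

Adjacency = Dict[int, Set[int]]   # segment -> linked segments
Assignment = Dict[int, int]       # segment -> district label


def is_connected(nodes: Set[int], adjacency: Adjacency) -> bool:
    """Breadth-first search restricted to `nodes`; an empty set is not connected."""
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)


def boundary_segments(assignment: Assignment, adjacency: Adjacency) -> List[int]:
    """Segments that have at least one neighbour in a different district."""
    return sorted(s for s in assignment
                  if any(assignment[t] != assignment[s] for t in adjacency[s]))


def perturb(assignment: Assignment, adjacency: Adjacency,
            rng: random.Random) -> Optional[Assignment]:
    """Move one random boundary segment into a neighbouring district.

    Returns the new assignment, or None when the move would empty or
    disconnect the district the segment is taken from.
    """
    candidates = boundary_segments(assignment, adjacency)
    if not candidates:
        return None
    segment = rng.choice(candidates)
    source = assignment[segment]
    targets = sorted({assignment[t] for t in adjacency[segment]} - {source})
    target = rng.choice(targets)
    remaining = {s for s, d in assignment.items() if d == source and s != segment}
    if not is_connected(remaining, adjacency):
        return None
    new_assignment = dict(assignment)
    new_assignment[segment] = target
    return new_assignment
```

The target district stays connected automatically because the moved segment is adjacent to at least one of its members, so only the source district needs the breadth-first check in this simplified version.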
The difference in the objective function corresponding to the new solution is calculated in Eq. (18), where $dE$ represents the energy difference and $f(s)$ represents the objective function. Since the objective function difference is only generated by the part of the solution that changes, it is calculated from the energy increment. The probability of cooling with an energy difference of $dE$ at a temperature $T$ is $P(dE)$, expressed as
$$P(dE) = \exp\left(\frac{dE}{kT}\right)$$
where $k$ is a constant and $\exp$ denotes the natural exponential. The annealing temperature $T$ starts at an initial temperature and gradually decreases after each iteration of the algorithm. $T$ affects the probability of selecting a new districting during the algorithm: the higher the temperature, the greater the probability that cooling with an energy difference of $dE$ occurs. Since $dE$ is always less than 0, $P(dE)$ takes values in the range (0,1). The pseudocode for the simulated annealing algorithm used in the paper is shown in Algorithm 3.
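A minimal Python sketch of the acceptance rule and the cooling loop is given below, assuming the acceptance probability takes the Metropolis form $P(dE) = \exp(dE/(kT))$ for $dE < 0$. Several details are assumptions for illustration only: the sign convention $dE = f(s_{old}) - f(s_{new})$ (negative when the candidate is worse), the unconditional acceptance of non-worsening moves, the geometric cooling factor, the initial temperature, and the names `acceptance_probability` and `simulated_annealing`. The stopping temperature of 0.000001 is the value reported among the experimental parameter settings.

```python
import math
import random
from typing import Callable, Optional, Tuple, TypeVar

S = TypeVar("S")


def acceptance_probability(d_e: float, temperature: float, k: float = 1.0) -> float:
    """exp(dE / (k T)) for dE < 0; non-worsening moves (dE >= 0) are always accepted."""
    if d_e >= 0:
        return 1.0
    return math.exp(d_e / (k * temperature))


def simulated_annealing(initial: S,
                        objective: Callable[[S], float],
                        neighbour: Callable[[S, random.Random], Optional[S]],
                        t_start: float = 100.0,     # hypothetical initial temperature
                        t_stop: float = 1e-6,       # stop iteration temperature
                        cooling: float = 0.95,      # hypothetical geometric cooling factor
                        rng: Optional[random.Random] = None) -> Tuple[S, float]:
    """Minimize `objective`, starting from `initial` (e.g. a clustering result)."""
    rng = rng or random.Random()
    current, f_current = initial, objective(initial)
    best, f_best = current, f_current
    t = t_start
    while t > t_stop:
        candidate = neighbour(current, rng)      # may return None for an invalid move
        if candidate is not None:
            f_candidate = objective(candidate)
            d_e = f_current - f_candidate        # assumed sign: negative when worse
            if rng.random() < acceptance_probability(d_e, t):
                current, f_current = candidate, f_candidate
                if f_current < f_best:
                    best, f_best = current, f_current
        t *= cooling                             # temperature decreases every iteration
    return best, f_best
```

Because the best solution seen so far is tracked separately from the current one, the returned objective value is never worse than that of the initial solution, which is why a good starting point from the clustering stage carries through to the final result.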

Numerical experiments
In this section, the parameters and procedure of the experiments are described. Then, the algorithm is compared with other algorithms by setting different workload attributes under small instances. Finally, the performance of this algorithm is compared with other algorithms by choosing real scenes with different numbers of districts.

Experimental description
In China, urban managers identify problems in urban construction such as roadside stalls, illegal business and indiscriminate advertising through street patrols [47,48]. To demonstrate the performance of the algorithm proposed in this paper, a street patrol area for urban managers in Zhongyuan District, Zhengzhou City, Henan Province, is selected for the study. The algorithms are coded in MATLAB 2018a, and all experiments are run on a PC with an Intel Core i7 processor running at 3.7 GHz and 8 GB of RAM.
The data of the urban road network in Zhengzhou are selected for testing, and different methods are used for comparison with the proposed algorithm. The performance of the algorithm is illustrated with different cases by choosing small instances and real scenes, and 10 different scenarios are generated for each of the two road networks. Table 2 lists the parameter settings of the experiments. The proposed algorithm (SCRn-SA) is compared with the standalone spectral clustering algorithm based on the road network (SCRn) [39], the simulated annealing algorithm (SA) [49], the greedy algorithm (G) [50], the tabu search algorithm (TS) [51], the dingo optimization algorithm (DOA) [52] and another hybrid algorithm, the greedy-tabu search algorithm (G-TS) [53].
Table 2 (excerpt): Stop iteration temperature 0.000001; $w_1$ 0.5; $w_2$ 0.5
Fig. 1 Simulation result of small instance $f_1$

Performance comparison in small instances
To test the performance of the algorithm, a road network model with 180 road segments and 4 districts is constructed. Table 3 shows the workload attributes of 10 test instances. Ten different values of the total length of road segment, the furthest distance of the road, the risk factor for the district and the road congestion factor are selected in the table. The simulated districting result of the small instance $f_1$ is shown in Fig. 1. The different colours represent different districts. Each point represents a road segment, and each line represents the connection between the segments.
Table 4 gives the statistical results for the comparison of SCRn-SA, SCRn, SA, G, TS, DOA and G-TS. The average and standard deviation values of the objective function reflect the performance of the algorithms. Smaller values indicate better performance, and the average is a better evaluation criterion than the standard deviation. As seen in Table 4, the proposed algorithm achieves the lowest workload balance value among the compared algorithms in all 10 test instances and shows significantly better performance than the standalone algorithms SCRn, SA, G, TS and DOA. Compared with the other hybrid algorithm (G-TS), SCRn-SA also performs well for different workload attributes.
Figures 2, 3, 4 and 5 show the convergence curves of the compared algorithms for the problems $f_1$, $f_4$, $f_7$ and $f_9$, respectively. Comparing the minimum objective function values of the algorithms with and without iteration, it is found that the iterative algorithms (SCRn-SA, SA, TS, DOA and G-TS) outperform the algorithms without iteration (SCRn and G). Comparing the convergence speed of the iterative algorithms, it is found that the proposed algorithm SCRn-SA and SA outperform the other algorithms. Combined with the analysis results in Table 4, this shows that SCRn-SA has better performance than SA.

Performance comparison in real scenes
Next, a street patrol road network in Zhongyuan District in Zhengzhou is selected for performance testing in real scenes, corresponding to the road network diagram shown in Fig. 6. The district includes the area between Western Third Ring Road, Longhai Expressway, Jingguang Road and Nongye Expressway. This road network can be abstracted to a total of 290 road segments, and the workload attributes (total length of road segment, furthest distance of the road, risk factor for the district and road congestion factor) for setting up patrols are shown in Table 5. The number of districts is set from 1 to 10 for multiple experiments (real scenes $F_1$ to $F_{10}$). The simulated districting result of the real scene $F_{10}$ is shown in Fig. 7.
SCRn-SA is compared with SCRn, SA, G, TS, DOA and G-TS by setting different numbers of districts in the selected road network, as shown in Table 6. Smaller average and standard deviation values indicate better performance, and the average value is the main measure of performance. In example $F_1$ of Table 6, where the whole network is treated as a single district, all 5 algorithms result in 0, in line with the actual target value. This supports the accuracy of the algorithm. According to the other 9 examples, it is found that the proposed algorithm has lower mean and variance values of the objective function than the compared algorithms.
The convergence curves of these algorithms for the real cases $F_2$, $F_5$, $F_8$ and $F_{10}$ are chosen for analysis, as shown in Figs. 8, 9, 10 and 11. Comparing the objective function values and the convergence speed, it is found that the SCRn-SA algorithm in this paper performs best.

Performance comparison in public datasets
A public dataset of street patrol in Boston is selected for testing. The selected road network in Boston is shown in Fig. 12. This road network can be abstracted to a total of 200 road segments.
The dataset provides all reported patrol cases in 2018, including the type of case, the location, and the date and time. The workload attributes (total length of road segment, furthest distance of the road, risk factor for the district and road congestion factor) can be calculated using the formulas in this paper. The number of districts is set from 1 to 10 for multiple experiments (Boston scenes $D_1$ to $D_{10}$). The workload attributes for setting up patrols are shown in Table 7.
SCRn-SA is compared with SCRn, SA, G, TS, DOA and G-TS by setting different numbers of districts in the selected road network, as shown in Table 8. The average and standard deviation values are used to judge performance, and the average is the main criterion. According to the examples, it is found that the SCRn-SA algorithm outperforms the compared algorithms.
The convergence curves of these algorithms for the public dataset scenes $D_2$, $D_3$, $D_4$ and $D_5$ are chosen for analysis, as shown in Figs. 13, 14, 15 and 16. Comparing the objective function values and the convergence speed, it is found that the SCRn-SA algorithm performs best.

Computational complexity of the proposed SCRn-SA algorithm
The computational complexity of the proposed SCRn-SA algorithm is analysed as follows. The computational complexity of the spectral clustering algorithm is approximated as O(n) [54]. In the simulated annealing algorithm, the temperature term is O(log(m)) [55]. The computational complexity of the SCRn-SA algorithm is therefore O((l × n) + (k × log(m))), where k is the number of variables and l is the number of iterations.

Conclusion
The street patrol districting problem (SPDP) is presented based on a road network, and a mathematical model is constructed with the aim of obtaining balanced, efficient and compact districts. A hybrid algorithm based on an improved spectral clustering algorithm and simulated annealing algorithm (SCRn-SA) is proposed to solve SPDP. The improved spectral clustering algorithm (SCRn) in this paper is applied to the road network to perform the initial districting. Then, a simulated annealing algorithm (SA) is used to obtain the optimal solution for districting. The proposed algorithm is verified by setting different parameters on small instances and real scenes. The experimental results show that SCRn-SA has advantages over other state-of-the-art algorithms in solving SPDP. However, there are still some shortcomings in this study, and future research can address them. First, the deployment of street patrol forces can be made more rational. Specifically, patrols can be increased in key areas where there is a high frequency of cases. In addition, the actual patrol situation varies over time, and a dynamic patrol districting scheme can be set based on the changes in patrols during each time cycle. Finally, once rational districting is

Conflict of interest
The authors declare that they have no known conflicts of competing interests that could have appeared to influence the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.