1 Introduction and Related Works

Real-world systems can be described as complex networks whose entities exchange information through their relationships. Such networks can model complex systems from neuronal networks to subway systems [1]; moreover, they can shape cities when their topology is linked with georeferenced data.

By analyzing the complex network of a city, it is possible to extract features that can describe urban problems, which are meaningful indicators for city planners [2]. Such features can reveal, for instance, sites where social activities are more intense, regions where facilities should be placed, and neighborhoods that lack street access. Particularly, these networks can expose distance-based inconsistencies, which is how we refer to nodes that lack efficient street access from/to others in the same network, possibly resulting in structural bottlenecks.

Along these lines, we identified a lack of methods to analyze and improve the structure, mobility, and street access of cities. Consequently, in this paper, we contribute a mathematical tool-set and algorithms that track distance-based inconsistencies by analyzing the complex-network topology of a city. Our results have implications for street access, supporting finer street planning by enhancing mobility indicators and providing a better structural assessment of the city.

The core assumption of this study is that the network is supposed to provide streets that render the shortest distance between places. In this regard, our tool-set uses two distance functions to track nodes that do not provide shortest-distance routes between themselves and other nodes of interest. Nodes that fail to provide minimum-length routes are considered inconsistent nodes, which are evidence of problems in the city structure. Accordingly, in the face of an inconsistency, we raise two hypotheses: (A) the network lacks a more appropriate mesh; or, instead, (B) the city lacks points of interest placed in better locations. The first one indicates the need for new points of interest to distribute their load; the second one indicates the need to relocate points of interest because the topology of the terrain cannot afford new streets.

A vast number of studies have analyzed inherent properties and behaviors of cities. For instance, multiple metrics have been adopted to explain their structural conditions [3], their intense vehicle traffic [4], and the emergence of collective behavior [5]. Other studies centered on the geometrical perspective of the network [6] and on the positioning of its elements [7, 8]. Furthermore, there are works that reviewed the role of city elements [9, 10], addressed support for urban planning and design [11, 12], and improved facility-location analysis and the planning of street meshes [13]. Still others inspected the effectiveness of underground systems [10] and the improvement of long-range connections [14], defined the concept of accessibility through complex networks and cities [15], and tested the centrality of cities considering their space syntax [16,17,18].

In this paper, we contribute a tool-set that improves the analysis of cities by tracking inconsistent urban structures through complex networks. We show that the distance between two nodes can reveal ill-located points of interest and that such information can be used to make a city more distance-efficient for its citizens. To present our contributions, this paper is organized as follows: Sect. 2 discusses our mathematical formulation and related algorithms; Sect. 3 discusses results on the applicability of the proposed tool-set; and Sect. 4 presents the conclusions and final remarks.

2 Mathematical Formulation and Algorithms

2.1 Preliminaries

Throughout the text, we refer to a complex network as a distance-weighted directed graph $G = \langle V, E \rangle$, composed of a set $V$ of nodes and a set $E$ of edges. To model a city as a complex network, we considered streets as the edges and their crossings as the nodes, aiming to preserve the elements' geometry. An edge is an ordered pair $\langle u, v \rangle \in E$, in which $u$ is named source and $v$ is named target, $u, v \in V$. Each node has two properties that correspond to its coordinates: the latitude and the longitude. Based on such coordinates, we conferred to the edges in $E$ a floating-point weight that refers to the great-circle (or inline) distance between their source and target.

Our mathematical tool-set tracks inconsistencies identified through distance functions to detect which elements do not follow a pattern. The pattern that we consider refers to the real-world distance between the nodes of the network, which in turn can provide insights about locomotion through the city streets. In this regard, we begin by tracking a set $P$ of points of interest; the idea, then, is to determine two sets of nodes that surround a point of interest $p \in P$, which can reveal the city's inconsistencies through algebraic operations. We introduce these two sets as the perimeter set of $p$ and the network set of $p$.

2.2 Grouping Nodes in the Surroundings of Points of Interest

The first set matches the closest nodes to a point of interest $p$ according to the great-circle (or inline) distance, which is referred to as the perimeter set of $p$:

$$ S_p = \{\, v \in V \mid d_g(v, p) \le d_g(v, q),\ \forall q \in P \,\} \tag{1} $$
where $d_g(i, j)$ is the great-circle distance between nodes $i$ and $j$ on the surface of Earth:

$$ d_g(i, j) = 2\,r \arcsin\!\sqrt{\sin^2\!\Big(\tfrac{\varphi_j - \varphi_i}{2}\Big) + \cos\varphi_i \cos\varphi_j \sin^2\!\Big(\tfrac{\Delta\lambda}{2}\Big)} \tag{2} $$

where $\varphi_i$ and $\varphi_j$ are the latitudes, and $\Delta\lambda$ is the difference between the longitudes $\lambda_i$ and $\lambda_j$, both of nodes $i$ and $j$. Also, $r$ is the radius of Earth (6,378 km), and all values are represented in radians. Given a graph $G$ and a set $P$ of points of interest, a node $v \in V$ pertains to the perimeter set of one and only one $p \in P$.

Hence, for any $p, q \in P$ with $p \neq q$, the perimeter sets $S_p$ and $S_q$ are mutually disjoint.
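As an illustration, the perimeter-set computation can be sketched in Python; the function and variable names below are ours, and the haversine form is one standard way to write the great-circle distance of Eq. 2:

```python
import math
from collections import defaultdict

R_EARTH_KM = 6378.0  # Earth radius used in the paper

def great_circle_km(a, b):
    """Great-circle ("inline") distance between two (lat, lon) pairs in km.

    Haversine formulation of Eq. 2; coordinates are given in degrees.
    """
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 \
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * R_EARTH_KM * math.asin(math.sqrt(h))

def perimeter_sets(coords, pois):
    """Partition nodes into perimeter sets S_p (Eq. 1).

    Each node joins its inline-closest point of interest, which makes the
    resulting sets mutually disjoint by construction.
    """
    sets = defaultdict(set)
    for v, latlon in coords.items():
        closest = min(pois, key=lambda p: great_circle_km(latlon, coords[p]))
        sets[closest].add(v)
    return dict(sets)
```

For example, with `coords` mapping node ids to `(lat, lon)` pairs and `pois` a list of node ids, `perimeter_sets(coords, pois)` returns one disjoint set of nodes per point of interest.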

The second set corresponds to the network set of $p$, which is made of the nodes closest to a point of interest $p$ according to the length of their shortest paths:

$$ N_p = \{\, v \in V \mid d_n(v, p) \le d_n(v, q),\ \forall q \in P \,\} \tag{3} $$
where $d_n(u, v)$ is the length of the shortest directed path between $u$ and $v$, i.e. the sum of the weights of all the edges in a minimum-length path, as follows:

$$ d_n(u, v) = \min_{\rho \,\in\, \mathcal{P}(u, v)} \sum_{\langle i, j \rangle \in \rho} w_{ij} \tag{4} $$

where $\mathcal{P}(u, v)$ denotes the set of directed paths from $u$ to $v$ and $w_{ij}$ is the weight of edge $\langle i, j \rangle$.
Recall that the edge weight is given by the straight-line distance between the edge's nodes using the great-circle distance (see Eq. 2). We refer to the shortest-path length as network distance, in the sense that one must necessarily (in the best case) move across this path to go from the source node to the target node. Notice that any node $v \in V$ is network-closest to one and only one $p \in P$.

Hence, for any $p, q \in P$ with $p \neq q$, the network sets $N_p$ and $N_q$ are mutually disjoint.

In cases where the complex network is directed, the network distance TO a point of interest is not necessarily the same as the network distance FROM a point of interest, which may result in different network sets for the same $p$. This detail is addressed in the following section, where we define the network set from a point of interest $p$ to the nodes in $V$ by means of the reversed network set of $p$:

$$ N_p^{-1} = \{\, v \in V \mid d_n(p, v) \le d_n(q, v),\ \forall q \in P \,\} \tag{5} $$
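As a sketch (with names of our own choosing), the network sets and their reversed counterparts can be computed by running Dijkstra's algorithm from each point of interest: over the reversed graph for Eq. 3 (distances of nodes to $p$) and over the original graph for Eq. 5 (distances from $p$ to nodes):

```python
import heapq
from collections import defaultdict

def network_distances(adj, source):
    """Dijkstra over a distance-weighted digraph (Eq. 4).

    adj maps a node to a list of (neighbor, weight) pairs; the result maps
    each reachable node to its shortest-path length from source.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def reversed_graph(adj):
    """Flip every directed edge, keeping its weight."""
    radj = defaultdict(list)
    for u, edges in adj.items():
        for v, w in edges:
            radj[v].append((u, w))
    return radj

def network_sets(adj, nodes, pois, direction='to'):
    """Assign each node to its network-closest point of interest.

    direction='to'   uses d_n(v, p), yielding the network sets N_p (Eq. 3);
    direction='from' uses d_n(p, v), yielding the reversed sets (Eq. 5).
    """
    graph = reversed_graph(adj) if direction == 'to' else adj
    dists = {p: network_distances(graph, p) for p in pois}
    sets = {p: set() for p in pois}
    for v in nodes:
        closest = min(pois, key=lambda p: dists[p].get(v, float('inf')))
        sets[closest].add(v)
    return sets
```

On a network with one-way streets, the two directions can assign the same node to different points of interest, which is exactly the divergence exploited in the next section.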
2.3 Compartmentalizing Inconsistencies for Directed Networks

Consider different public services of a city as points of interest; such services may assist the population in different ways, but all of them require locomotion as a condition for assistance. For example, in the case of doctors' clinics, it is desired that patients get there efficiently. In turn, police stations require that their police officers efficiently reach the houses of the citizens. In the case of schools, the daily routine demands efficient back-and-forth transit from students. The same assumption fits many other services. Notice that we are referring to efficient paths as the ones with minimum length.

In the first example, there is an implicit displacement from a node $v$ to a node $p$; in the second one, the displacement is from the node $p$ to the node $v$; and, in the third case, there is a bi-directional displacement between $v$ and $p$, in which $v$ is an ordinary node and $p$ is a specific point of interest. Based on the network direction, those three cases lead to the following definitions:

1. Inward Inconsistency: nodes that are inline-closest to a point of interest $p$, but network-closest (from $v$ to $p$, as given by Eq. 3) to a different one:

   $$ I_p^{in} = S_p \setminus N_p \tag{6} $$

2. Outward Inconsistency: the same as the previous category, but in the opposite direction (from $p$ to $v$, as given by Eq. 5), resulting in the set:

   $$ I_p^{out} = S_p \setminus N_p^{-1} \tag{7} $$

3. Absolute Inconsistency: nodes that are, simultaneously, considered to be inward and outward inconsistencies, i.e. nodes in the sets' intersection:

   $$ I_p^{abs} = I_p^{in} \cap I_p^{out} \tag{8} $$
As mentioned, these categories rely on the direction of the network. In cases where there is no direction, there will be no minimum-length divergence between the paths of a round trip, yet inconsistencies can still be tracked by calculating the difference between the perimeter set and the network set of $p$. To provide further discussion, hereinafter we consider only directed networks.
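Once the three surrounding sets of a point of interest are available, the categories above reduce to plain set algebra. A minimal sketch (the names are ours):

```python
def inconsistencies(S_p, N_p, N_p_rev):
    """The three inconsistency types of Sect. 2.3 as set operations.

    S_p: perimeter set of p (Eq. 1); N_p: network set of p (Eq. 3);
    N_p_rev: reversed network set of p (Eq. 5). All are Python sets.
    """
    inward = S_p - N_p           # inline-closest to p, network-closest to another (Eq. 6)
    outward = S_p - N_p_rev      # same, in the opposite direction (Eq. 7)
    absolute = inward & outward  # inward and outward at once (Eq. 8)
    return inward, outward, absolute
```

In the undirected case $N_p$ and $N_p^{-1}$ coincide, so all three expressions collapse to the single difference $S_p \setminus N_p$.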

2.4 Tracking Distance-Based Inconsistencies

In this section, we discuss Algorithm 1, which joins the concepts that we previously introduced. The aim of the algorithm is to track distance-based inconsistencies in distance-weighted directed networks by using a set $P$ of points of interest. Notice that, although the definition of inconsistency is segmented into three types (see Sect. 2.3), the algorithm considers a single inconsistency type at a time.

The algorithm starts by filling a set of empty sets, each one reserved to store the inconsistencies of a single point of interest (see lines 1 to 2). Subsequently, we use two auxiliary variables to store, respectively, the inline-closest and network-closest points of interest to a node $v$ (see lines 5 and 6). We used the external functions inline_closest and network_closest (see lines 8 and 9) to extract the closest point of interest to the node $v$; they implement, respectively, Eqs. 2 and 4. Next, we test whether a node is an inconsistency or not: if the inline-closest point and the network-closest point are not the same (see line 11), then $v$ is an inconsistency of its inline-closest point of interest (see line 12). Finally, the set of the inconsistencies of all points of interest is returned as the result (see line 13).

[Algorithm 1: tracking distance-based inconsistencies]
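A minimal Python rendering of the procedure just described, assuming the two nearest-point-of-interest lookups are provided as callables (e.g. built from great-circle and shortest-path distances); the line numbers in the comments refer to Algorithm 1 as described in the text:

```python
def track_inconsistencies(nodes, inline_closest, network_closest, pois):
    """Sketch of Algorithm 1, handling one inconsistency type at a time.

    inline_closest(v) and network_closest(v) return the point of interest
    closest to v by Eq. 2 and Eq. 4, respectively; the chosen network
    direction selects inward, outward, or absolute tracking.
    """
    result = {p: set() for p in pois}  # lines 1-2: one empty set per POI
    for v in nodes:
        a = inline_closest(v)          # line 5 (via line 8)
        b = network_closest(v)         # line 6 (via line 9)
        if a != b:                     # line 11: the two closest POIs disagree
            result[a].add(v)           # line 12: v is an inconsistency of a
    return result                      # line 13
```

The two callables are where the directionality lives: passing a network lookup computed toward or away from the points of interest yields inward or outward inconsistencies, respectively.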

Given a graph $G$, a set $P$ of points of interest, and an inconsistent node $v \in V$, such node is known to be an inconsistency of one and only one $p \in P$.

Hence, for any $p, q \in P$ with $p \neq q$, the inconsistency sets of $p$ and $q$ are mutually disjoint.

Consequently, it is possible to derive two other sets from a point of interest $p$: (i) the inconsistency set $I_p$; and (ii) the set of consistent nodes $C_p$, such that $C_p = S_p \setminus I_p$. The consistent nodes are fundamental to the process of suggesting locations for points of interest because, unlike inconsistent nodes, they provide a smaller average distance to the nodes in their perimeter.

2.5 Reducing Distance-Based Inconsistencies

In this section, we introduce Algorithm 2, which was designed to suggest changes in the location of points of interest to improve their access through the streets of a city. The task of finding a perfect location for a point of interest might demand testing all possibilities through an exhaustive search. Consequently, our algorithm takes a greedy approach that uses centrality metrics to guide the placement of a point of interest. Centrality is not only an adequate technique to quantify the importance of a node, but it is also capable of indicating central locations that are equally accessible to all nodes of a network.

Along these lines, we adopted Straightness Centrality [9] as the centrality metric of Algorithm 2 because it analyzes the nodes of a network by joining both inline and network distances. It is noteworthy that any distance-based centrality metric could be employed, as well as multiple metrics together; however, different metrics may provide dubious or bad choices for a relocation.

[Algorithm 2: reducing distance-based inconsistencies]

Our algorithm starts by initializing auxiliary variables (see line 1) and by tracking the inconsistent nodes in the original version of the network (see line 2). In line 4, it starts looping until all points of interest have been relocated or until there are no more inconsistencies to be reduced from the original network. After that, it tries to change one point of interest at a time (see line 7). The candidates to host a point of interest pertain to the induced subgraph of consistent nodes (see line 8). By using the induced subgraph, the algorithm searches for the node that has the highest centrality value among all the others (see line 9).

The algorithm continues by testing the most central node as the new location for the point of interest; it temporarily replaces the node (see line 10) and then collects information about the inconsistencies of this network configuration (see line 11). Next, it tests whether the new configuration causes fewer inconsistencies than the previous one (see line 12) before marking the node for relocation (see lines 13 to 15). In a greedy fashion, it first selects the point of interest whose replacement leads to the largest elimination of inconsistencies. After choosing the one to be replaced, we perform integrity tests, mark the node as relocated, and then remove it (see lines 16 to 19).

The algorithm ends when there are no more profitable changes (see line 21). It is noteworthy that each point of interest can be moved only once; this is due to the greedy nature of the algorithm. Otherwise, it would run until there were no more inconsistencies in the network, at a prohibitive computational cost. The output of the algorithm is a set of new locations (see line 22); each element is an ordered pair that denotes the current node where a point of interest is and a better node for placing it.
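The greedy loop can be sketched as follows; this is our own rendering of the description above, with `count_inconsistencies`, `candidates`, and `centrality` as assumed callables (corresponding to Algorithm 1, to the induced subgraph of consistent nodes, and to a distance-based metric such as straightness centrality, respectively):

```python
def reduce_inconsistencies(pois, count_inconsistencies, candidates, centrality):
    """Greedy sketch of Algorithm 2.

    count_inconsistencies(pois): total inconsistent nodes for a configuration;
    candidates(p, pois): consistent nodes that could host p;
    centrality(v): centrality score of node v.
    Each point of interest is moved at most once, and a move is kept only if
    it strictly reduces the count, so the total never increases.
    """
    pois = list(pois)
    moved, relocations = set(), []
    best = count_inconsistencies(pois)
    while True:
        best_move = None
        for i, p in enumerate(pois):
            if p in moved:
                continue                           # each POI moves only once
            cands = candidates(p, pois)
            if not cands:
                continue
            c = max(cands, key=centrality)         # most central consistent node
            trial = pois[:i] + [c] + pois[i + 1:]  # temporarily replace p by c
            n = count_inconsistencies(trial)
            if n < best:                           # keep only profitable moves
                best, best_move = n, (i, p, c)
        if best_move is None:
            break                                  # no more profitable changes
        i, p, c = best_move                        # apply the best move found
        pois[i] = c
        moved.add(c)
        relocations.append((p, c))                 # (current node, better node)
    return relocations
```

Because a relocation is committed only when it strictly lowers the inconsistency count, and each point of interest is relocated at most once, termination and the non-increasing property of Sect. 2.6 both follow directly from this structure.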

In the average case, the running time of Algorithm 2 grows with the number $k$ of points of interest and the number $n$ of nodes, $k \ll n$. Besides that, the algorithm was designed to be straightforwardly parallelized; moreover, in our tests, it took less than a minute to compute a whole city with 200,000 inhabitants.

2.6 Correctness of the Algorithm Formulation

In this section, we demonstrate that Algorithm 2 is finite and never increases the number of inconsistencies of a city, as required by the problem formulation.

Theorem 1

We hypothesize that Algorithm 2 provides a set of central and consistent nodes that can replace specific points of interest in a city because replacing them will never increase the total number of inconsistencies.


Hereinafter, aiming to prove Theorem 1 by contradiction, we suppose that the use of Algorithm 2 can increase the number of inconsistencies. Bearing in mind that the type of the inconsistency has no effect on this proof, we proceed using the Inward Inconsistency (see Sect. 2.3).

Consider the existence of a city mapped as a complex network $G$ and a set $P$ of points of interest located in this same city. We start by finding the perimeter set $S_p$ for each $p \in P$, which is given by Eq. 1. Subsequently, we proceed with gathering the network set $N_p$, which is defined by Eq. 3.

Next, we detect a consistent node $c \in C_p$ that is the most central by an arbitrary centrality metric. The most central node is the one that has the highest centrality when compared to the other nodes, potentially being a better place for positioning a point of interest in a city. We follow by replacing $p$ by the most central node $c$ in its perimeter. Then, we calculate the updated perimeter ($S_c$) and network ($N_c$) sets, both of $c$. Notice that $c \in C_p \subseteq S_p$, thus $S_c$ and $N_c$ may differ from $S_p$ and $N_p$.

At this point, there are two pairs of sets, one pair for $p$ and the other one for $c$: $\langle S_p, N_p \rangle$ and $\langle S_c, N_c \rangle$. The algorithm we proposed will replace $p$ by $c$ following Eq. 9, which corresponds to a clause saying that the sets computed from $c$ will be used just if they provide fewer inconsistencies than the original ones; otherwise, it will keep the original configuration without making any changes.

$$ \langle S, N \rangle = \begin{cases} \langle S_c, N_c \rangle & \text{if } |S_c \setminus N_c| < |S_p \setminus N_p| \\ \langle S_p, N_p \rangle & \text{otherwise} \end{cases} \tag{9} $$
The algorithm ceases when all the points of interest have been changed at least once or when no change will result in inconsistency elimination (see Sect. 2.5); as such, the algorithm is guaranteed to be finite. Therefore, it is absurd to suppose that the number of inconsistencies increases due to the use of Algorithm 2, because the algorithm always yields a set with no more inconsistencies than the original one, as defined by Eq. 10:

$$ |S \setminus N| \le |S_p \setminus N_p| \tag{10} $$
3 Results and Discussions

The tool-set we proposed was validated over the Brazilian city of Sao Carlos. The city was instantiated as a complex network through a digital map from OpenStreetMap¹. We considered streets as edges and their crossings as nodes; this way, we preserved the georeferenced attributes of the city that are necessary for the distance computation of our tool-set. The resulting network is planar and can be represented in two dimensions, in which edges intersect only at nodes.

3.1 Assessing Inconsistency Recovery

In this section, we analyze the inconsistent nodes found in the city of Sao Carlos regarding the location of hospitals, police stations, and public schools, which are our points of interest; such public services are known to be affected, respectively, by inward, outward, and absolute inconsistencies, as described in Sect. 2.3. It is noteworthy that each set of points of interest is independent; as such, the inconsistencies of one set of points have no relationship with those of the others.

The inconsistent nodes we tracked are in Table 1, which suggests that their occurrence is connected to the number of points of interest. In fact, they appear wherever different perimeters meet; as a consequence, there is no way to eradicate them without altering the network topology by changing the streets' direction or creating new streets. In addition, more points of interest mean more boundaries, which tends to increase their number. Hence, the challenge is to find locations for points of interest that reduce, rather than eradicate, inconsistencies.

We used Algorithm 2 to reduce the inconsistencies of Sao Carlos (see Table 1). The algorithm suggested relocating 6 hospitals, 2 police stations, and 9 public schools; such a configuration was able to remove 160 inconsistencies from the hospitals (from 559 to 399), 123 inconsistencies from the police stations (from 342 to 219), and 179 inconsistencies from the public schools (from 663 to 484). Notice that the inconsistencies of some points of interest rose in number from the original to the enhanced city, which is a setback of our approach. However, as we proved, the total number of inconsistencies never increases.

Table 1. Analysis of the inconsistencies of the city of Sao Carlos, in which we considered police stations, hospitals, and public schools as points of interest; we use # to refer to the total number of inconsistencies and % to their percentage.

3.2 Supporting the Designing of Urban Structures

Our tool-set is not only to be used in the automatic recovery of inconsistencies, but also to assist human-made urban-planning decisions. This is the case, for instance, when a specialist designs a city with knowledge of the citizens' needs. In such a case, Algorithms 1 and 2 can aid the process by analyzing and recommending distance-efficient locations that are feasible for points of interest.

This section introduces two hypothetical case studies that depict our tool-set in practice. Both of them were conducted considering a subset of the hospitals and public schools of the city of Sao Carlos (see Sect. 3.1). Nonetheless, our tool-set extends to any kind of point of interest, since it treats all of them equivalently.

Both case studies follow the process of Fig. 1, in which we start by finding a point of interest, next we try to solve the problem by ourselves, and then we use the algorithms to improve our results; all steps are guided in light of the nodes' straightness centrality. Furthermore, each case study is represented by the induced subgraph of the point of interest being analyzed; although we have illustrated the inconsistencies in Fig. 1, they are not visible in the case studies because they would add no visual information to the other images.

Fig. 1.

Illustration of the process of designing urban structures in light of centrality metrics. This process starts by identifying nodes that are of interest, then it follows by tracking their inconsistencies, and it ends by suggesting new locations, which reduce the number of inconsistencies, to place these nodes.

3.3 Case Study 1: Creating a New Hospital to Reduce Demand

From the set of hospitals of the city of Sao Carlos, we identified one that, when compared to another hospital in the city, has an excessive number of nodes in its perimeter (see Fig. 2a). There is no specific explanation for the hospital's location; for instance, the city may have grown after the hospital was built, or the planners may not have taken the surroundings of the hospital into account. What is certain is that an extensive area with an ill-positioned point of interest will deprive its nodes of street access; in this case, when points of interest are healthcare facilities, time-critical activities, such as the transportation of patients in a critical state, can be jeopardized by the lack of street access. Hence, the problem becomes where to build a hospital and how to avoid inconsistencies.

Fig. 2.

Illustration of the assisted urban planning task from the first case study, in which the point of interest is a hospital and the color of the nodes denotes their centrality (the darker, the higher). Figure 2a shows a hospital's perimeter that is too large, causing lack of access. We placed a new hospital in an eye-based central location in that same area to solve this issue. Afterwards, we used the algorithm to reduce inconsistencies, which suggested relocating the new hospital to a more central location that reduces the hospital's inconsistencies, as in Fig. 2b.

First, we tried to solve the problem manually through an eye-based analysis, looking for a location that could provide an equal number of nodes to the perimeters of both hospitals. Figure 2a shows a possible place for the new hospital as well as the resulting perimeter of both of them, which are defined by a line that cuts the image in half. After that, we inserted the proposed location into the set of hospitals and used Algorithm 1 to track the inconsistencies of the resulting configuration. Such a configuration led us to 615 inconsistencies, a larger value than in the original city. Thus, we succeeded in building a hospital that splits the perimeter in two, but we failed in providing efficient access to both the old and the new hospital.

In a second approach, we analyzed the nodes' centrality together with a supporting visualization. We colored the nodes by their centrality, which allowed us to notice that the selected location for the new hospital was a node with low centrality. Then, we used Algorithm 2 to suggest a better place for the new hospital while keeping the location of the old one. By doing so, the city's inconsistencies were reduced from 615 to 352 (see Fig. 2b), which positively reflected on the mobility of this area by distributing the demand between both hospitals. Thus, creating a new hospital at a specific location was able to reduce almost half of the inconsistencies of the city without relocating the existing hospitals.

3.4 Case Study 2: Merging Schools to Centralize Public Resources

In a similar fashion, we identified two public schools that are adjacent and serve a small set of nodes. In this case, the proximity of the schools (see Fig. 3a) is a problem, since neither of them is used up to its capacity, implying a waste of public resources. In a first approach, by using Algorithm 2 to relocate them, the number of inconsistencies was reduced from 663 to 635.

Fig. 3.

Illustration of the assisted urban planning task from the second case study, in which the points of interest are public schools and the color of the nodes denotes their centrality (the darker, the higher). In this case study, we treated a problem related to the waste of resources caused by having two schools near each other; Fig. 3a shows the problematic area, which is small, increasing the drawbacks related to access. By replacing both schools with a single one, we achieved a better coverage of nodes, as depicted in Fig. 3b.

Considering the size of the perimeters of both schools, we decided to remove one school to improve the utilization of the one that remained. By centralizing the schools in a single node, we can reduce inconsistencies because there will be fewer perimeters bordering each other; hence, the inconsistencies, which arise wherever two perimeters meet, naturally decrease. To further enhance this process, we used the color-coded centrality metric to choose a candidate node to be the new sole school. Afterward, we used Algorithm 2 to provide a better location (see Fig. 3b), which reduced the total number of inconsistencies from 635 to 445.

3.5 Discussions on Results Generalization

For a concise presentation of results, we have assumed: (i) that any displacement occurs through the city's streets; and (ii) a city with a uniform population distribution. However, our tool-set holds for scenarios where these assumptions are not true.

We can use weights in accordance with the type of displacement rather than using street distance. This is because our tool-set uses a general concept of weight; when additional information is provided, such weight can assume any quantitative value, e.g. travel time, edge capacity, or route cost.

Regarding the population distribution, it is possible, for instance, to use a normal distribution peaked at the center of the city, multimodal distributions, or census data. This information can aid in the analysis of urban agglomerations if it is used to assign values to sets of nodes corresponding to the population density of the area they belong to. Nevertheless, the set of inconsistencies would then depend on the analysis of a specialist rather than being a self-explanatory result.

Also, despite being central to our problem formulation, redesigning a city is not viable in most cases. Furthermore, changing the topology of the network will alter the centrality of its elements, which will modify the regions that attract vehicles and people. Hence, our tool-set is meant not only for redesigning a city but also for the initial design, when all possibilities are open.

Finally, our proposal has open problems that support further studies: (1) the tool-set to track inconsistencies is categorical, so further algebra could aid in identifying the severity of a network inconsistency in a continuous, rather than binary, manner; (2) for simplicity's sake, we assumed the origin and destination of all paths to be nodes of the network; such nodes are street intersections, which might not be real-world points of interest, requiring the addition of new nodes.

4 Conclusion

This paper introduced a set of mathematical formalisms and algorithms to track and reduce distance-based inconsistencies, improving access to/from points of interest in a city. Beyond the mathematical formulation, we provided a proof of concept and case studies, all of which indicate that our tool-set is able to suggest better placements for points of interest while improving the access to the majority of the nodes of a city by reducing its inconsistencies.

More specifically, our contributions are: the definition of a concept capturing intrinsic problems of urban structures caused by the misallocation of points of interest in cities; two algorithms devised to track and reduce inconsistent nodes in complex networks; and case studies in which we show how our tool-set and algorithms can aid planners and designers.

In summary, our methods were validated empirically and proved formally, granting potential for prompt contribution and for opening new research questions. As future work, we shall embrace link-prediction methods for suggesting changes in the network topology, i.e. proposing variations in the flow's direction, in the task of looking for a better topological setting for a city.