1 Introduction

A shortest path query is one of the fundamental problems with a wide variety of applications in many domains. It requires finding the minimal cost path between two given points, a source s and a target t. This type of problem is commonly encountered in various contexts, including but not limited to road networks [2], social networks [39], geographical systems [62], indoor spaces [9], and game maps [15], to name a few. In many cases, it is helpful to provide several alternative paths to the users, in addition to the shortest one, as this allows them to select the route that best suits their needs or preferences. For example, popular navigation systems like Google Maps offer several options for traveling from the source to the target, allowing the users to choose the path that they prefer. To be useful for the end users, these alternative paths should be reasonably short and distinct from one another to provide meaningful choices to the user.

Fig. 1 Four alternative paths on two different game maps

There has been a large body of work on finding alternative paths in road networks (e.g., see [26, 37, 44]). Finding alternative paths in game maps has many useful applications but has not been studied in this context. In real-time strategy (RTS) games, characters typically take the shortest path to reach their destination, but this can make their movements predictable to opponents. To avoid this, it may be beneficial to compute alternative paths and randomly assign one of them to the character. Many RTS games allow players to choose their own waypoints, which they can use to make their characters follow a path different from the shortest path. In such games, a player may be shown several alternative paths so that they can choose a path for their character to take. Figure 1 illustrates two examples where four alternative paths are provided on two different game maps. Alternative paths can also be useful in indoor venues, which are often represented as a Euclidean plane with obstacles. Indoor navigation systems may provide multiple path options for users to choose from.

Some open-source game development projects (Footnote 1) have attempted to include support for alternative paths in game maps. Existing techniques designed for road networks can also be adapted for game maps. However, it is not clear how efficient or effective these algorithms are when applied to game maps. In this paper, we aim to fill this gap by formally studying the problem of finding alternative paths in game maps. We adapt the existing road network algorithms for game maps, and our experimental study demonstrates that these algorithms are not efficient when applied to game maps. This is mainly because they fail to exploit the properties unique to game maps, such as polygonal obstacles and the fact that the source and target can be anywhere in the non-obstacle area, in contrast to road networks where these must lie on road edges/nodes.

Intuitively, the alternative paths must be sufficiently dissimilar to each other and must be meaningful/natural, e.g., they should be taut and should not contain unnecessarily long or short detours. Note that computing the k shortest paths is not a good solution for alternative pathfinding as these paths are expected to be very similar to each other. In this paper, we make the following contributions.

  • To the best of our knowledge, we are the first to study the problem of finding alternative paths in game maps. We adapted three techniques that are commonly used for finding alternative paths in road networks, called Penalty [4, 11, 32], Plateaus [29] and Dissimilarity [14, 40], for use in game maps. However, it was unclear if these techniques would be able to generate high-quality alternative paths in game maps. Therefore, we created a web-based demonstration system and conducted a user study on nine diverse game maps selected from a widely used benchmark. We received a total of 472 responses and found that the adapted techniques generated high-quality alternative paths according to the users. We have made the source code (Footnote 2) for the web-based demonstration system publicly available for reuse or further development.

  • We observe that the existing approaches when extended for game maps are computationally expensive especially when compared with the recent shortest pathfinding algorithms for game maps. To address this, we propose an efficient algorithm to compute alternative paths in game maps exploiting a novel compressed via-path database (CVPD) which has certain advantages compared to the traditional compressed path databases (CPDs) [56].

  • We conduct an extensive experimental study on a widely used game maps benchmark which shows that our proposed algorithm computes the alternative paths in a time comparable to a state-of-the-art shortest path algorithm, Polyanya [15], (slower for shorter paths but faster for longer paths) and outperforms the existing alternative path approaches by more than an order of magnitude. We also evaluate the quality of alternative paths returned by our algorithm using some well-known quantitative measures such as path similarity, bounded stretch and local optimality [31]. The results show that the paths returned by our algorithm are comparable to those returned by the existing approaches in terms of quality.

This paper is an extended version of our previous work [35]. The main contribution of our previous work was the user study and experiments to evaluate the existing alternative pathfinding approaches. In this extended version, our key contribution is a novel data structure (CVPD) and a new algorithm which computes the alternative paths much more efficiently.

The rest of the paper is organised as follows. In Section 2, we present problem definition, evaluation measures, and related work. In Section 3, we first discuss how existing algorithms for road networks can be extended for alternative pathfinding in game maps and then we present the details of our user study and results. Section 4 presents the details of our efficient algorithm for computing alternative paths. Experimental study is provided in Section 5 followed by conclusions in Section 6.

2 Preliminaries

2.1 Problem formulation

In this paper, we assume that the input map is a 2D Euclidean plane which contains a set of obstacles. Each obstacle is a polygon and its convex corners are called convex vertices. The set of all convex vertices in the map is denoted as V. We say that two points are co-visible (or are visible to each other) if a straight line connecting them does not pass through any obstacle in the map. A path P between a source s and a target t is a sequence of points \(\langle p_1, p_2, \cdots , p_n\rangle \) such that \(p_1=s\), \(p_n=t\) and, for each \(p_i\) (\(i<n\)), \(p_i\) and \(p_{i+1}\) are co-visible. The length of a path P, denoted as |P|, is the cumulative Euclidean distance between every successive pair of points, i.e., \(|{P}| = \sum _{i=1}^{n-1}EDist(p_i,p_{i+1})\) where \(EDist(x,y)\) is the Euclidean distance between x and y. The shortest path \(sp(s,t)\) is a path between s and t with the minimum length. The shortest distance between s and t is denoted as \(d(s,t)\), i.e., \(d(s,t)=|sp(s,t)|\).

Problem Definition. Given a source s, a target t, and a positive integer k, we aim at finding k alternative paths (including the shortest path \(sp(s,t)\)) between s and t such that each alternative path is no longer than \(d(s,t)\times \epsilon \) where \(\epsilon \ge 1\) is a user-defined parameter.

Intuitively, the k alternative paths must be “significantly different” from each other (e.g., should have small overlap with each other) and each path must be a “reasonable” path, e.g., should not contain unnecessary detours or loops. The existing works on finding alternative paths in road networks (e.g., see [31]) have defined several measures to quantify whether a set of alternative paths is “reasonable” or not. “Significantly different” can be quantified by defining a dissimilarity function based on the overlap between paths. We formally define these measures in Section 2.2; they are also used in the experimental study. Like most existing works on alternative paths in road networks, we focus on retrieving k alternative paths with the smallest lengths while guaranteeing that the pairwise dissimilarity between these paths is no less than a user-defined threshold \(\theta \). Nevertheless, we remark that our algorithm can be easily generalised to handle other objective functions (e.g., retrieve k paths that minimise a weighted sum defined over several quantitative measures). Please see details in Section 4.2.2.

2.2 Evaluation measures

Let \({P}=\langle p_1,p_2,\cdots ,p_n\rangle \) be an alternative path between s and t such that \(p_1=s\), \(p_n=t\), each \(p_i\) (\(1<i<n\)) is a vertex of an obstacle and for each \(p_i\) (\(i<n)\), \(p_i\) and \(p_{i+1}\) are visible from each other. We use \(P_{x,y}\) where \(x<y\) to denote the subpath \(\langle p_x,\cdots ,p_y\rangle \) of P and denote its length as \(d^P(p_x,p_y)\), i.e., \(d^P(p_x,p_y) = \sum _{i=x}^{y-1}EDist(p_i,p_{i+1})\). Hereafter, whenever we use x and y, assume \(x<y\).

Bounded Stretch [31]. The stretch of a path defines how long the path is compared to the shortest path. Formally, the stretch of a subpath \(P_{x,y}\) is defined as \(S(P_{x,y})=d^P(p_x,p_y)/d(p_x,p_y)\). For an alternative path P, its bounded stretch is the maximum stretch of any of its subpaths.

$$\begin{aligned} BS(P) = \mathop {\mathrm {\,max}}\limits _{\forall (x,y)} \frac{d^P(p_x,p_y)}{d(p_x,p_y)} \end{aligned}$$
(1)

For example, assume a path P which has a bounded stretch of 1.20, i.e., the maximum stretch of any of its subpaths is 1.20. This implies that there is no subpath of P which is more than 20% longer than the shortest distance between its end points.

Note that an alternative path P with smaller bounded stretch is better. Also, if P is a shortest path, its bounded stretch is 1. Let \(\mathcal {P}\) be a set of alternative paths returned by an algorithm. The bounded stretch of \(\mathcal {P}\) is the maximum bounded stretch of any of the paths in \(\mathcal {P}\), i.e., \(BS(\mathcal {P})=max_{\forall P\in \mathcal {P}} BS(P)\).
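As a small illustration of Eq. (1), the following sketch computes the bounded stretch of a single path by enumerating all of its subpaths. The function names and the `shortest_dist` callback are hypothetical; the callback stands in for any oracle that returns \(d(p_x,p_y)\). In the example, we assume an obstacle-free plane so that the shortest distance is simply the Euclidean distance, which is a simplification made only to keep the sketch self-contained.

```python
import math

def subpath_length(path, x, y):
    """d^P(p_x, p_y): length of the subpath between indices x and y (0-based)."""
    return sum(math.dist(path[i], path[i + 1]) for i in range(x, y))

def bounded_stretch(path, shortest_dist):
    """BS(P): maximum stretch over all subpaths P_{x,y} with x < y.

    Assumes the points of the path are pairwise distinct, so that
    shortest_dist never returns zero.
    """
    n = len(path)
    return max(
        subpath_length(path, x, y) / shortest_dist(path[x], path[y])
        for x in range(n)
        for y in range(x + 1, n)
    )

# Assumption: no obstacles, hence d(p_x, p_y) is the Euclidean distance.
path = [(0, 0), (3, 0), (3, 4), (6, 4)]
bs = bounded_stretch(path, math.dist)
print(bs)
```

Here the worst subpaths are the two L-shaped pieces of length 7 whose end points are 5 apart, so the bounded stretch is 7/5. This brute-force enumeration is quadratic in the number of points on the path and is meant for exposition only.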

Local Optimality [31]. We say that a subpath \(P_{x,y}\) is suboptimal if it is longer than the shortest distance between \(p_x\) and \(p_y\), i.e., \(d^P(p_x,p_y)>d(p_x,p_y)\). Given an alternative path P between s and t, we use minL(P) to denote the length of the shortest suboptimal subpath of P (if all subpaths are optimal, minL(P) is assumed to be infinity). Note that any subpath of P which is shorter than minL(P) must be optimal. Thus, minL(P) is a measure of optimality. The local optimality LO(P) normalises this measure w.r.t. the shortest distance \(d(s,t)\) between s and t.

$$\begin{aligned} LO(P) = \frac{minL(P)}{d(s,t)}=\mathop {\mathrm {\,min}}\limits _{\forall (x,y): d^P(p_x,p_y)>d(p_x,p_y)} \frac{d^P(p_x,p_y)}{d(s,t)} \end{aligned}$$
(2)

Consider an alternative path P between s and t and assume that its shortest suboptimal subpath has length 20 and \(d(s,t)=100\). Its local optimality is \(20/100=0.2\). This implies that every subpath of P which is shorter than 20% of the shortest path between s and t is guaranteed to be an optimal path. A path P with higher local optimality is better. Also, if P is a shortest path, its local optimality is 1. Let \(\mathcal {P}\) be a set of alternative paths returned by an algorithm. The local optimality of \(\mathcal {P}\) is \(LO(\mathcal {P})=min_{\forall P\in \mathcal {P}} LO(P)\).
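The sketch below evaluates Eq. (2) directly for one path. As before, `shortest_dist` is a hypothetical callback returning \(d(p_x,p_y)\), and the example assumes an obstacle-free plane so that this distance is Euclidean. The tolerance `tol` is our own addition to guard against floating-point noise when testing whether a subpath is suboptimal. The function follows the formula literally, so it returns infinity when no subpath is suboptimal.

```python
import math

def local_optimality(path, shortest_dist, tol=1e-9):
    """LO(P) = minL(P) / d(s, t), computed from the definition in Eq. (2)."""
    n = len(path)
    # prefix[i] is the length of the path from its first point to point i.
    prefix = [0.0]
    for a, b in zip(path, path[1:]):
        prefix.append(prefix[-1] + math.dist(a, b))

    min_l = math.inf  # stays infinite if every subpath is optimal
    for x in range(n):
        for y in range(x + 1, n):
            d_p = prefix[y] - prefix[x]
            if d_p > shortest_dist(path[x], path[y]) + tol:
                min_l = min(min_l, d_p)
    return min_l / shortest_dist(path[0], path[-1])

# Assumption: no obstacles, hence d(p_x, p_y) is the Euclidean distance.
path = [(0, 0), (3, 0), (3, 4), (6, 4)]
lo = local_optimality(path, math.dist)
print(lo)
```

In this example, the shortest suboptimal subpaths have length 7 while \(d(s,t)=\sqrt{52}\approx 7.21\), giving a local optimality of roughly 0.97.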

Similarity [37]. Similarity \(Sim(\mathcal {P})\) of a set of paths \(\mathcal {P}\) is

$$\begin{aligned} Sim(\mathcal {P}) = \mathop {\mathrm {\,max}}\limits _{\forall (P_i,P_j)\in \mathcal {P}\times \mathcal {P}: i\ne j} \frac{|P_i\cap P_j|}{|P_i\cup P_j|} \end{aligned}$$
(3)

where \(|P_i \cap P_j|\) (resp. \(|P_i\cup P_j|\)) denotes the total length of the overlap (resp. union) of two paths \(P_i\) and \(P_j\). Dissimilarity is \(1-Sim(\mathcal {P})\).
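A minimal sketch of Eq. (3) is given below. It assumes that two paths overlap only on segments that they share exactly (same pair of end points), which is a simplification: partially overlapping collinear segments are not detected. All names are hypothetical.

```python
import math
from itertools import combinations

def segments(path):
    """Map each undirected segment of a path to its Euclidean length."""
    return {frozenset((a, b)): math.dist(a, b) for a, b in zip(path, path[1:])}

def pair_similarity(p, q):
    """|P_i ∩ P_j| / |P_i ∪ P_j|, counting only identical segments as overlap."""
    seg_p, seg_q = segments(p), segments(q)
    shared = sum(length for seg, length in seg_p.items() if seg in seg_q)
    union = sum(seg_p.values()) + sum(seg_q.values()) - shared
    return shared / union

def similarity(paths):
    """Sim of a set of paths: the maximum similarity over all pairs."""
    return max(pair_similarity(p, q) for p, q in combinations(paths, 2))

p1 = [(0, 0), (4, 0), (8, 0)]
p2 = [(0, 0), (4, 0), (8, 3)]
p3 = [(0, 0), (0, 3), (8, 3)]
sim = similarity([p1, p2, p3])
print(sim, 1 - sim)  # similarity and the corresponding dissimilarity
```

Only `p1` and `p2` share a segment (of length 4, against a union of length 13), so this pair determines the similarity of the set.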

2.3 Related work

Graphs are commonly used to model many real-world problems in a wide variety of application domains such as social networks [28, 45], recommendation systems [49, 52], health informatics [19, 22], transportation networks [6, 59], the Internet of Things [27, 38], and information security [25, 61]. Shortest path queries [2, 17, 47] are one of the most fundamental and frequently used operations conducted on graphs. In this work, our focus is on path queries on graphs representing physical spaces such as game maps or road networks. In Section 2.3.1, we present existing work related to computing shortest paths in such graphs whereas Section 2.3.2 covers the related works on computing alternative paths in such graphs.

2.3.1 Computing shortest paths

Shortest path queries [2, 5, 17, 24, 47] and related queries [1, 10, 23, 33, 58] have been very well-studied in road networks. Below, we discuss two of the most popular shortest path algorithms for road networks namely contraction hierarchies and hub labeling.

Contraction Hierarchies (CH) [24] is a graph indexing technique that can answer shortest path queries orders of magnitude faster than Dijkstra’s algorithm. As a successor of Highway Hierarchies [50] and Highway Node Routing [51], CH implements the idea of shortcuts to exploit the road network hierarchy. These shortcuts allow efficient shortest path retrieval during the search process, which employs a bidirectional search from source and target on the constructed hierarchy. Hub labeling [2] is another indexing technique which, during the preprocessing phase, assigns a set of labels to each node in the graph. These labels are assigned such that, for any source s and target t, the labels of s and t are guaranteed to contain a hub node on the shortest path between s and t. During query processing, the labels of s and t are joined to find the common hub nodes, and the hub node that gives the smallest combined distance from s to the hub node and from the hub node to t is used to recover the shortest path.
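The query step of hub labeling described above can be sketched as follows. This is an illustrative toy, not the implementation of [2]: the label layout (a dictionary from hub node to distance) and the label values are made up, and the preprocessing that constructs the labels is omitted.

```python
def hub_query(label_s, label_t):
    """Join two labels and return (best hub, shortest distance).

    Each label maps a hub node to the distance between the labelled
    node and that hub.
    """
    common = label_s.keys() & label_t.keys()
    if not common:
        return None, float("inf")
    best = min(common, key=lambda h: label_s[h] + label_t[h])
    return best, label_s[best] + label_t[best]

# Hypothetical labels for a source "s" and a target "t".
labels = {
    "s": {"s": 0.0, "h1": 2.0, "h2": 5.0},
    "t": {"t": 0.0, "h1": 4.0, "h2": 0.5},
}
hub, dist = hub_query(labels["s"], labels["t"])
print(hub, dist)
```

Practical implementations typically keep each label sorted by hub identifier so that the join is a linear merge of two arrays; the dictionary-based join here is chosen only for brevity.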

In game maps, shortest pathfinding has also been extensively studied [15, 20, 54]. Typically, these approaches construct a visibility graph by connecting the obstacle vertices that are co-visible. At query time, source s and target t are connected to the visibility graph and it can be guaranteed that the shortest path in this graph is the shortest path between the source and target. Polyanya [15] is the state-of-the-art online shortest path algorithm. It employs a navigation mesh [30] which divides the traversable space into a disjoint set of convex polygons and uses an algorithm similar to the A* algorithm with subtle differences (see details in [15]). End Point Search [53] (EPS) employs a Compressed Path Database (CPD) [7] and Polyanya. Specifically, a CPD is built from a \(V\times V\) table in which each row R(u) of a convex vertex \(u \in V\) stores, for every \(v\in V\), the first vertex f on the shortest path from u to v. Each row R(u) is then compressed using run-length encoding (RLE) [55]. Given this CPD, for any pair of vertices u and v, the first move (i.e., the first vertex) on the shortest path from u to v can be accessed from the CPD using a binary search on the compressed row R(u). The shortest path from u to v can be obtained by recursively extracting the first moves towards v until v is reached. EPS employs Polyanya to incrementally find the vertices visible from s and t, respectively. Then, the CPD is used to obtain the pair-wise paths between these visible vertices efficiently. Several pruning techniques and optimisations are proposed to speed up the computation.

2.3.2 Computing alternative paths

Existing studies have proposed a variety of approaches to answer alternative pathfinding queries in road networks [26, 36, 37, 42, 44]. Some existing works focus on finding k-shortest paths [21, 60]. However, these do not reflect good alternative paths as most of the k shortest paths have a very high level of overlap with each other. To address this, several existing works use user-defined parameters to obtain k paths that are significantly different from each other and are not too long compared to the shortest path [12, 13, 40]. These do not guarantee the quality of alternative paths (e.g., paths may have local detours). Furthermore, the problem as defined in these papers is NP-hard, making them computationally challenging without providing any path quality guarantee. The penalty-based approaches [4, 11] compute the alternative paths by increasing the edge weights on the paths already found (to avoid selecting the same edges while finding additional paths). The plateau-based approach [3, 16, 18, 41, 43, 48] is arguably the most popular approach to return promising alternative paths as it naturally provides several guarantees such as local optimality and smaller overlaps with the other paths. In some works (e.g., [3]), the authors proposed several constraints to formally define alternative paths: they should be locally optimal, have limited sharing, and have uniformly bounded stretch.

Indeed, the shortest pathfinding in game maps has been well understood in existing studies, e.g., see [15, 53] and references therein. However, alternative pathfinding queries have only been solved in road networks. Next, we briefly explain the alternative pathfinding approaches in road networks, as in [37]. The majority of alternative pathfinding algorithms fall into three broad categories, which are Penalty [4], Plateaus [29], and Dissimilarity [13]. For ease of presentation, we assume that road networks are undirected graphs.

Penalty: Algorithms [4, 11, 32] in this category iteratively calculate the shortest paths between source and target. Specifically, once the current iteration is finished and the shortest path P = \(\langle p_1\),\(p_2\), \(\cdots \), \(p_n\rangle \) is found, the algorithm increases the weight of each edge on the current path P by a certain penalty factor (e.g., multiplying the edge weight by 1.5). Since the edge weights on the shortest path have been increased, in the next iteration, the algorithm is likely to find a different shortest path [37]. The algorithm terminates once k different paths are returned or when the length of the path found in the current iteration is longer than \(d(s,t)\times \epsilon \). One major issue with this approach is its slow query processing time because it needs multiple traversals over the graph to find the k alternative paths. Furthermore, there is no guarantee that the k alternative paths are significantly different because despite the increase in edge weights, the paths may still have significant overlap with each other.
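The iteration described above can be sketched as follows. This is a minimal illustration on a small undirected graph, not the implementation evaluated in this paper. The function names, the iteration cap `max_iter`, and the choice to compare path lengths against \(d(s,t)\times \epsilon \) using the original (unpenalised) weights are our assumptions.

```python
import heapq

def dijkstra_path(adj, weight, s, t):
    """Shortest s-t path under `weight`; returns a vertex list or None."""
    dist, parent, done = {s: 0.0}, {}, set()
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v in adj[u]:
            nd = d + weight[frozenset((u, v))]
            if nd < dist.get(v, float("inf")):
                dist[v], parent[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if t not in dist:
        return None
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return path[::-1]

def penalty_paths(adj, weight, s, t, k, eps, factor=1.5, max_iter=50):
    """Collect up to k distinct paths by repeatedly penalising used edges."""
    current = dict(weight)  # penalised weights, updated in every iteration
    results, limit = [], None
    for _ in range(max_iter):  # assumption: cap on the number of iterations
        path = dijkstra_path(adj, current, s, t)
        if path is None:
            break
        # Path length measured with the original, unpenalised weights.
        length = sum(weight[frozenset(e)] for e in zip(path, path[1:]))
        if limit is None:
            limit = length * eps  # the first path found is sp(s, t)
        if length > limit:
            break
        if path not in results:
            results.append(path)
            if len(results) == k:
                break
        for e in zip(path, path[1:]):
            current[frozenset(e)] *= factor
    return results

# Hypothetical graph with two routes between s and t.
adj = {"s": ["a", "b"], "a": ["s", "t"], "b": ["s", "t"], "t": ["a", "b"]}
weight = {frozenset(("s", "a")): 1.0, frozenset(("a", "t")): 1.0,
          frozenset(("s", "b")): 1.2, frozenset(("b", "t")): 1.2}
paths = penalty_paths(adj, weight, "s", "t", k=2, eps=1.5)
print(paths)
```

After the first iteration, the two edges of the route through `a` are penalised to a total weight of 3.0, so the second iteration returns the route through `b`. Each iteration requires a full shortest path search, which reflects the query cost issue noted above.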

Plateaus: Cotares Limited designed this algorithm [29] for their routing engine called Choice Routing. First, it creates a shortest path tree \(T_s\) rooted at the source s and another shortest path tree \(T_t\) rooted at the target t. Next, \(T_s\) and \(T_t\) are joined and common branches in both trees are found. Each of the common branches is called a plateau. Consider one branch \(\langle s, \cdots , u_1, u_2,\cdots , u_n, \cdots , y\rangle \) in \(T_s\) and another branch \(\langle t, \cdots , u_n, u_{n-1},\cdots , u_1, \cdots ,x\rangle \) in \(T_t\). When these branches are joined, the common part \(\langle u_1,\cdots , u_n\rangle \) is found, i.e., \(\langle u_1,\cdots , u_n\rangle \) is a plateau, which we denote as \(pl(u_1,u_n)\) using the end points of the branch. We remark that the shortest path between s and t is always a plateau and its length is \(d(s,t)\). Let \(pl(u,v)\) be a plateau such that u is the end closer to s and v is the end closer to t. This plateau can be used to retrieve an alternative path \(sp(s,u)\oplus pl(u,v) \oplus sp(v,t)\) where \(\oplus \) is the concatenation operation. It was observed in [29] that longer plateaus typically generate better alternative paths. Therefore, the algorithm picks the k longest plateaus and generates alternative paths based on each of the plateaus. We give a detailed example of how Plateaus works in Section 3.

The alternative paths generated using plateaus have some natural/useful characteristics which make them attractive for both research and commercial solutions. Firstly, an alternative path generated using a longer plateau avoids unnecessary/unnatural detours (e.g., leaving a motorway and entering it again shortly afterward without reducing the traveling cost). This is due to the local optimality provided by plateaus. Specifically, as noted in [3], a path that is locally optimal does not have undesirable local detours. Secondly, plateaus do not overlap with each other; therefore, the alternative paths generated using longer plateaus tend to have smaller overlaps with each other. Thirdly, plateaus are generated using the shortest path trees and seamlessly capture the intrinsic properties of the underlying road networks.
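The plateau construction can be sketched as follows, under the assumption of an undirected graph stored as nested dictionaries. An edge from u to v is treated as a plateau edge when u is the parent of v in \(T_s\) and v is the parent of u in \(T_t\); maximal chains of such edges form the plateaus. Names are hypothetical, ties between equally long plateaus are broken arbitrarily, and the sketch is an illustration rather than the Choice Routing implementation.

```python
import heapq

def shortest_path_tree(adj, root):
    """Dijkstra from `root`; returns (distance, parent) dictionaries."""
    dist, parent = {root: 0.0}, {root: None}
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, parent

def ancestors(parent, v):
    """Vertices strictly after v on the tree path from v to the root."""
    out, node = [], parent[v]
    while node is not None:
        out.append(node)
        node = parent[node]
    return out

def plateau_paths(adj, s, t, k):
    _, par_s = shortest_path_tree(adj, s)
    _, par_t = shortest_path_tree(adj, t)
    # u -> v is a plateau edge if both trees use it (in opposite directions).
    nxt = {u: v for u, v in par_t.items()
           if v is not None and par_s.get(v) == u}
    plateaus = []
    for head in set(nxt) - set(nxt.values()):  # starts of maximal chains
        chain, length = [head], 0.0
        while chain[-1] in nxt:
            v = nxt[chain[-1]]
            length += adj[chain[-1]][v]
            chain.append(v)
        plateaus.append((length, chain))
    plateaus.sort(key=lambda item: -item[0])  # longest plateaus first
    # sp(s, u) + pl(u, v) + sp(v, t) for each of the k longest plateaus.
    return [ancestors(par_s, chain[0])[::-1] + chain + ancestors(par_t, chain[-1])
            for _, chain in plateaus[:k]]

# Hypothetical graph: a short route via a and a longer one via b and c.
adj = {"s": {"a": 1.0, "b": 1.2}, "a": {"s": 1.0, "t": 1.0},
       "b": {"s": 1.2, "c": 1.0}, "c": {"b": 1.0, "t": 1.2},
       "t": {"a": 1.0, "c": 1.2}}
paths = plateau_paths(adj, "s", "t", k=2)
print(paths)
```

In this toy graph the two plateaus are the shortest path itself (length 2) and the single edge between `b` and `c` (length 1); the latter yields the second alternative path.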

Dissimilarity: This group of techniques relies on a function that computes the dissimilarity between two paths. Specifically, given a list of shortest paths, the goal is to return the shortest alternative paths such that the dissimilarity between any two of them is at least the given threshold \(\theta \). This problem is NP-hard [14], for which several approximate algorithms [14, 40] have been proposed. Now, we explain an approach [13] shown in [37] to generate high-quality alternative paths in road networks.

Fig. 2 Three alternative paths generated by Plateaus are \(\langle s, I, G, H, t\rangle \), \(\langle s, K, C, B, E, H, t\rangle \) and \(\langle s, K, C, A, D, J, t\rangle \) with lengths 88, 92 and 100, respectively

Given a vertex v in the road network, a via-path \(sp(s,v,t)\) is the concatenation of the paths \(sp(s,v)\) and \(sp(v,t)\), namely, \(sp(s,v,t) = sp(s,v)\oplus sp(v,t)\). To efficiently retrieve the paths from any vertex v to either source s or target t, two shortest path trees \(T_s\) and \(T_t\) are computed and stored. With these shortest path trees, the algorithm iteratively evaluates the vertices in the road network in ascending order of their via-path lengths. If the dissimilarity between the current via-path and every alternative path already in the result set is at least \(\theta \), the via-path is added to the result set. Once the result set has k alternative paths or the current via-path is longer than \(d(s,t)\times \epsilon \), the algorithm stops.
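A compact sketch of this via-path procedure is shown below. It assumes a connected undirected graph stored as nested dictionaries and measures dissimilarity as one minus the ratio of shared edge length to the length of the edge union, which is one possible choice of dissimilarity function. All names are hypothetical.

```python
import heapq

def shortest_path_tree(adj, root):
    """Dijkstra from `root`; returns (distance, parent) dictionaries."""
    dist, parent = {root: 0.0}, {root: None}
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, parent

def walk_to_root(parent, v):
    """Tree path from v up to the root, starting with v itself."""
    out = [v]
    while parent[out[-1]] is not None:
        out.append(parent[out[-1]])
    return out

def edges_length(adj, edges):
    return sum(adj[u][v] for u, v in (tuple(e) for e in edges))

def dissimilarity(adj, p, q):
    """1 - |p ∩ q| / |p ∪ q| over undirected edges (an assumed definition)."""
    ep = {frozenset(e) for e in zip(p, p[1:])}
    eq = {frozenset(e) for e in zip(q, q[1:])}
    return 1.0 - edges_length(adj, ep & eq) / edges_length(adj, ep | eq)

def dissimilar_via_paths(adj, s, t, k, eps, theta):
    ds, par_s = shortest_path_tree(adj, s)
    dt, par_t = shortest_path_tree(adj, t)
    limit = ds[t] * eps
    results = []
    # Evaluate vertices in ascending order of via-path length d(s,v) + d(v,t).
    for v in sorted(ds, key=lambda v: (ds[v] + dt[v], v)):
        if len(results) == k or ds[v] + dt[v] > limit:
            break
        via_path = walk_to_root(par_s, v)[::-1] + walk_to_root(par_t, v)[1:]
        if all(dissimilarity(adj, via_path, r) >= theta for r in results):
            results.append(via_path)
    return results

# Hypothetical graph: a short route via a and a longer one via b and c.
adj = {"s": {"a": 1.0, "b": 1.2}, "a": {"s": 1.0, "t": 1.0},
       "b": {"s": 1.2, "c": 1.0}, "c": {"b": 1.0, "t": 1.2},
       "t": {"a": 1.0, "c": 1.2}}
paths = dissimilar_via_paths(adj, "s", "t", k=2, eps=2.0, theta=0.5)
print(paths)
```

The via-paths through `s`, `a` and `t` all coincide with the shortest path, so only the first of them is kept; the via-path through `b` (or `c`) shares no edge with it and is accepted as the second alternative.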

3 User study

A recent research study [37] was conducted on road networks in three different cities, Melbourne, Dhaka, and Copenhagen. The study found that using certain techniques, called Penalty, Dissimilarity, and Plateaus, can create alternative routes that are of similar quality to those generated by Google Maps. In this work, we adapt these techniques for generating alternative paths in video game maps (Section 3.1). However, first, we want to determine if these techniques would be effective in generating high-quality alternative paths in game maps. To answer this question, we conducted a user study and present the results later in Section 3.2.

3.1 Adapting existing techniques for game maps

In this section, we describe our adaptation of the existing techniques, Penalty, Dissimilarity and Plateaus, for game maps. First, a visibility graph \(G=\{V,E\}\) is created where V is the set containing convex corners/vertices in the game map and E is the set of edges connecting every pair of vertices \((u,v)\) that are visible to each other. The weight of each edge corresponds to the Euclidean distance \(EDist(u,v)\) between the pair of vertices. To compute the shortest path between a source s and a target t, the source and target are added to the graph G by inserting new edges between s (resp. t) and the vertices which are visible from s (resp. t). Figure 2 provides an example. There are three obstacles (grey polygons). The edges shown in the figure correspond to the visibility graph G. Note that s and t have been added to G by connecting them to the vertices visible from them, respectively. After the visibility graph is constructed, the existing approaches can be directly applied to this graph. The paths that are non-taut are filtered out. A taut path is a path which, when treated as a string, cannot be made “tighter” by pulling on its ends [46]. For example, the path \(\langle s, I, F, G\rangle \) is non-taut because string-pulling results in a shorter path \(\langle s, I, G \rangle \).

Example 1

Consider the example in Figure 2. We briefly describe how Plateaus generates three alternative paths. In this example, A to K are the convex vertices. The visibility graph includes the edges that connect the source and target to their respective visible convex vertices (I and K for s, and H and J for t). Then, Plateaus computes the forward shortest path tree \(T_s\) rooted at s (see the tree shown in blue edges) and the backward shortest path tree \(T_t\) rooted at t (see the pink edges shown in broken lines). These two trees are joined and the branches which are common to the two trees are the plateaus. The algorithm then chooses the three longest plateaus: \(\langle s, I, G, H, t\rangle \), \(\langle K, C, B, E\rangle \) and \(\langle D, J \rangle \) with lengths 88, 47 and 16, respectively. Using these three plateaus, three alternative paths are generated by connecting s and t to the end of each plateau closer to them. The three alternative paths are \(\langle s, I, G, H, t\rangle \), \(\langle s, K, C, B, E, H, t\rangle \) and \(\langle s, K, C, A, D, J, t\rangle \) with lengths 88, 92 and 100, respectively.

3.2 User study and results

To conduct the user study, we extend our previous work [34] to develop a web-based system which helps visualise the alternative paths generated by different approaches. We select 9 diverse maps from a well-known game maps benchmark (Footnote 3). Participants in the user study are sent a webpage (Footnote 4) which contains the instructions as well as links to the web-based system. We sent two different types of surveys to the participants:

  • Pre-defined: In this type of survey, for each of the nine game maps, the participants were shown three queries (source-target pairs) which were pre-selected by us. This was to ensure that different participants provide ratings for the same set of queries.

  • User-selected: In this type of survey, the participants were given the freedom to choose any source-target pair by clicking on the map. The participants were required to choose at least one source-target pair for each of the nine maps. This was to ensure that we get user-selected queries from different participants for each of the maps.

Based on the source and target of each query, our web-based system produces up to 4 alternative paths for each approach. We anonymise these approaches and display Plateaus, Dissimilarity and Penalty as A, B and C, respectively. This is to guard against any potential preconceived biases. The participants can view the paths generated by these approaches by clicking on the radio buttons (see Figure 3). For each of the approaches, the system asks the participants to give a rating on a scale of 1 to 5 where higher is better. All the participants were given a quick overview of the alternative paths in road networks and game maps. We asked them to rate these paths based on their impression of how good the paths generated by these approaches were, especially taking into account that the paths should be substantially different from each other but meaningful at the same time (e.g., should not have strange detours or loops). We remark that, due to the popularity of navigation services, most people have experience with alternative paths in road networks; however, most people have not necessarily seen alternative paths in video games. Hence, we carefully chose participants who had a background either in game map pathfinding or in alternative path computation in road networks.

Fig. 3 The web-based system for the user study: Pre-defined Queries

Table 1 Average user rating for Plateaus, Dissimilarity (shown as Dissim.) and Penalty. Best values for each category are shown in bold

The results of the user study are shown in Table 1. In total, we received 472 responses from 9 different participants. As can be seen, the ratings given by the participants to the different approaches are quite similar on average. Also, the ratings are typically high (e.g., around 4 on a scale of 1 to 5 where higher is better). To test the statistical significance of the results, we conducted a one-way repeated measures ANOVA test. Given a null hypothesis of no statistically significant difference in the mean ratings of the three approaches, the results suggest that, at the \(p < 0.05\) level, there is no evidence that the null hypothesis is false, i.e., there is no credible evidence that the three approaches received different ratings on average.

Later, in the experiments section, we also consider the measures discussed in Section 2.2 and compare these approaches against those measures. Our user study and the experimental study demonstrate that these approaches are able to generate good-quality alternative paths. However, one major limitation of these techniques (as shown later in the experimental study) is that their computation time is quite high especially when compared to the shortest path algorithms in game maps. To fill this gap, in the next section, we present an efficient algorithm to compute alternative paths in game maps.

4 Efficient alternative pathfinding algorithm

4.1 Offline preprocessing

Before we present our offline preprocessing, we first describe the Compressed Path Database (CPD) [7]. A CPD is built from a \(V\times V\) table in which each row R(u) of a convex vertex \(u \in V\) stores, for every \(v\in V\), the first vertex f on the shortest path from u to v. Each row R(u) is then compressed using run-length encoding (RLE) [55]. Given this CPD, for any pair of vertices u and v, the first move (i.e., the first vertex) on the shortest path from u to v can be accessed from the CPD using a binary search on the compressed row R(u). The shortest path from u to v can be obtained by recursively extracting first moves towards v until v is reached.

Fig. 4
figure 4

An example showing three polygonal obstacles

Table 2 Rows I and K of a Compressed Path Database (CPD)

Example 2

For our running example in Figure 4, Table 2 shows two uncompressed rows of the CPD containing first moves from I and K, respectively, to the other vertices. Consider the row of K. The first move from K to each of A, B, C, D, E, H and J is C. Therefore, the corresponding cells contain C. A wildcard symbol “*” is stored for the cell K because the path from K to K is not needed. The wildcard symbol can be compressed with any other symbol. Run-length encoding (RLE) is used to compress the row of K as \([1C{:}6I{:}8C{:}9I{:}10C]\) (the compressed row indicates that the value in this row for indices [1, 6) is C, the value is I for indices [6, 8) and so on). To recover the shortest path from K to G, a binary search is conducted on the RLE string of K to extract the first move I on the shortest path from K to G. Next, the algorithm conducts a binary search on the RLE string of I to extract the first move G from I to G. The algorithm stops since G is reached.
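The compression and lookup steps of Example 2 can be sketched as follows. The uncompressed row of K is taken from the example above. The row of I, in contrast, is a placeholder in which only the entry for G (the one stated in the example) is filled in and every other cell is left as a wildcard; it does not reproduce Table 2. The function names and the representation of a compressed row as two parallel lists are our assumptions.

```python
from bisect import bisect_right

WILDCARD = "*"
COLUMNS = "ABCDEFGHIJK"  # column order used in the running example

def rle_compress(row):
    """Run-length encode a row; wildcards are absorbed into adjacent runs.

    Returns (starts, symbols) where run i covers the 1-based indices
    [starts[i], starts[i + 1]).
    """
    starts, symbols = [], []
    for idx, sym in enumerate(row, start=1):
        if sym == WILDCARD or (symbols and sym == symbols[-1]):
            continue
        starts.append(idx if starts else 1)  # leading wildcards join run 1
        symbols.append(sym)
    return starts, symbols

def first_move(compressed_row, target):
    """Binary search for the first move towards `target`."""
    starts, symbols = compressed_row
    idx = COLUMNS.index(target) + 1
    return symbols[bisect_right(starts, idx) - 1]

def extract_path(cpd, u, v):
    """Follow first moves from u until v is reached."""
    path = [u]
    while path[-1] != v:
        path.append(first_move(cpd[path[-1]], v))
    return path

row_K = list("CCCCCIICIC*")            # first moves from K (Example 2)
row_I = ["*"] * 6 + ["G"] + ["*"] * 4  # placeholder: only I -> G is known
cpd = {"K": rle_compress(row_K), "I": rle_compress(row_I)}

print(cpd["K"])                     # compressed row of K
print(extract_path(cpd, "K", "G"))  # shortest path from K to G
```

The compressed row of K consists of the run starts 1, 6, 8, 9 and 10 with symbols C, I, C, I and C, matching \([1C{:}6I{:}8C{:}9I{:}10C]\), and the extracted path from K to G goes through I as described in the example.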

Using the CPD, the shortest path can be obtained in O(e) first move extractions where e is the number of edges on the shortest path. Each first move extraction takes \(O(\log {r})\) where r is the size of the compressed row. Thus, the total cost to obtain the shortest path/distance using a CPD is \(O(e\log {r})\). Next, we present the details of our proposed data structure called CVPD+DL which allows computing the distance between any two vertices \(u\in V\) and \(v\in V\) in logarithmic time. Once the distance \(d(u,v)\) is computed, the shortest path between u and v can be computed in time linear in the number of edges on the shortest path, i.e., O(e).

Table 3 Compressed Via-Path Database (CVPD)

4.1.1 CVPD+DL

The proposed index, CVPD+DL, consists of two new indexes namely Compressed Via-Path Database (CVPD) and Distance Labels (DL).

\(\underline{Compressed\, Via-Path\, Database\, (CVPD)}\): We impose a strict total order on all vertices in V. Although any ordering can be used, in this paper, we use betweenness centrality score [8] to order the nodes where the betweenness score of a vertex v is the number of shortest paths passing through v. The betweenness scores of vertices are computed by constructing all shortest path trees. We break ties arbitrarily but consistently, e.g., using node IDs. Ranks of nodes are their positions in this order, e.g., the highest ranked node is the node with the highest betweenness score. We use \({u}<_B{v}\) to denote that u ranks higher than v.

While a traditional CPD stores the first move f on the shortest path from \(u\in V\) to \(v\in V\), a CVPD stores the highest ranked vertex h on the shortest path between u and v. This vertex is called the highest via node on the shortest path and is denoted as \(via(u,v)\). Since \(sp(u,v)=sp(v,u)\), we have \(via(u,v)=via(v,u)\). Therefore, we need to store the highest via node \(via(u,v)\) only once in the CVPD. Specifically, we store \(via(u,v)\) in the row of a vertex u only if \({u}<_B{v}\). Otherwise, we store a wildcard symbol “*” to achieve better compression. Note that, unlike the CVPD, traditional CPDs are not symmetric and need to store the first move from u to v as well as the first move from v to u.

Example 3

Table 3 shows CVPD for the example shown in Figure 4. The alphabetical order of vertices in Figure 4 represents the betweenness ranks, i.e., A is the highest ranked node and K is the lowest ranked node. Consider the row for vertex F. It stores wildcard symbols for nodes A to F as the rank of F is not higher than the rank of each of these nodes. The shortest path between F and J is \(\langle F, B, A, D, J\rangle \) and the highest ranked via node A on this path is stored in the CVPD. For the remaining nodes in this row (i.e., G, H, I, and K), the highest via node on the shortest path from F to these nodes is F itself. Therefore, F is stored for these nodes. The RLE compression of this row R(F) gives [1F:10A:11F]. We remark that, in practice, the columns of the CVPD (and CPD) are ordered following a Depth-First Search (DFS) order to achieve better compression. However, for the sake of simplicity, in our examples, we show the columns ordered according to their betweenness ranks (A to K).

Given the CVPD, for any pair of vertices u and v, we can obtain the highest ranked node on the shortest path between u and v using the compressed row of u (if \({u}<_B{v}\)) or using the compressed row of v (if \({v}<_B{u}\)).
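A minimal sketch of this symmetric lookup is given below. It assumes the same parallel-list layout for compressed rows as a plain RLE row and contains only row F from Example 3. The names `CVPD`, `RANK` and `via` are hypothetical and not part of the evaluated implementation.

```python
from bisect import bisect_right

# In the running example the alphabetical order is also the betweenness
# order, so A has rank 1 (highest) and K has rank 11 (lowest).
RANK = {name: i + 1 for i, name in enumerate("ABCDEFGHIJK")}

# Row F from Example 3, [1F:10A:11F], stored as (run starts, run symbols).
# Wildcard cells (A to F) were merged into the first run by the compression.
CVPD = {"F": ([1, 10, 11], ["F", "A", "F"])}


def via(cvpd, u, v):
    """Highest ranked vertex on the shortest path between u and v (u != v)."""
    # The entry lives only in the row of the higher ranked endpoint.
    if RANK[v] < RANK[u]:
        u, v = v, u
    starts, symbols = cvpd[u]
    return symbols[bisect_right(starts, RANK[v]) - 1]


print(via(CVPD, "F", "J"))  # A
print(via(CVPD, "J", "F"))  # A, the same cell is read for both orders
print(via(CVPD, "F", "H"))  # F
```

Swapping the arguments before the lookup is what allows the CVPD to store each unordered pair once.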

\(\underline{Distance\, Labels\, (DL)}\): For each vertex u, we store a list of distance labels denoted as DL(u). Specifically, DL(u) contains a distance label for every \(v\in V\) for which: 1) \({u}<_B{v}\); and 2) u is the highest ranked vertex on the shortest path between u and v. Each distance label in DL(u) is a triplet \((v,d(u,v),p)\) where p is the first vertex on the shortest path from v to u. The labels in each DL(u) are sorted according to the betweenness ranks of vertices v which allows finding the label of v in DL(u) in logarithmic time. To reduce the number of distance labels, if u and v are co-visible, we do not store the distance label for v in DL(u). For such u and v, at query time, a binary search can be conducted on DL(u) and if v was expected to be in DL(u) but is not found, it implies that u and v are co-visible and the Euclidean distance between them can be computed on-the-fly (as we show later in Example 5).

Table 4 Distance labels

Example 4

Table 4 shows the distance labels for all vertices in Figure 4. As shown in Table 3, F is the highest ranked via node on the shortest path from F to each of G, H, I and K (see row F in Table 3). Since G and I are visible from F, the distance labels to them are not added. Therefore, we add in DL(F) the distance labels to H and K (see row F in Table 4). The label (H, 60, G) in DL(F) indicates that the distance between F and H is 60 and the first vertex on the shortest path from H to F is G.

Given the CVPD+DL, the shortest distance \(d(u,v)\) can be efficiently computed as follows. Without loss of generality, assume \({u}<_B{v}\). First, the highest via node h on the shortest path between u and v is found using a binary search on the compressed row of u in the CVPD. Then, the distances \(d(u,h)\) and \(d(v,h)\) are obtained from DL(h) using two binary searches (Footnote 5) to find the labels \((u,d(u,h),p)\) and \((v,d(v,h),p')\), and \(d(u,v)=d(u,h)+d(v,h)\). Note that this requires conducting at most three binary searches (one on the compressed row of u and two on DL(h)). To obtain the shortest path, the first moves can be used to recursively recover the path. For example, to obtain the path from u to h, the first vertex p is obtained from the label \((u,d(u,h),p)\). Then, the label of p is found in DL(h) and this process continues until h is reached. The label of p can be obtained using another binary search. Alternatively, with each label \((u,d(u,h),p)\) we can store an additional value \(p_i\) which corresponds to the position (i.e., index) of the label of p in DL(h) (or -1 if p is not in DL(h) because it is visible from h). This allows obtaining each successive label in O(1) time, resulting in O(e) time to recover the whole path.

Example 5

Consider our running example and assume that \(d(F,J)\) is to be computed. We conduct a binary search on row F of the CVPD and find the highest via node A on \(sp(F,J)\). Then, we conduct two binary searches on DL(A) to find the labels corresponding to F and J: (F, 33, B); and (J, 39, D). Here, \(d(F,J)=d(F,A)+d(J,A)=33+39=72\). The shortest paths from F to A and J to A are obtained to recover the path. For example, the label (F, 33, B) indicates that the first vertex on the shortest path from F to A is B. Next, the label of B is searched in DL(A) and is not found, indicating that B and A are co-visible. Note that with the label (F, 33, B) we can store the index of B in DL(A) (i.e., \(-1\) in this case indicating that the label of B is not present) which helps avoid a binary search. The shortest path from J to A is recovered similarly.
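The computation in Example 5 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the evaluated implementation: `make_dl`, `find_label`, `distance` and `path_to_via` are hypothetical names, DL(A) contains only the two labels quoted in the example, the via node is passed in as if it had already been read from the CVPD, and `euclid` stands for a caller-supplied Euclidean distance function used for co-visible pairs.

```python
from bisect import bisect_left

RANK = {name: i + 1 for i, name in enumerate("ABCDEFGHIJK")}


def make_dl(labels):
    """Sort labels (v, d(v, h), first vertex from v to h) by the rank of v."""
    labels = sorted(labels, key=lambda lab: RANK[lab[0]])
    return [RANK[v] for v, _, _ in labels], labels


# DL(A), restricted to the two labels quoted in Example 5.
DL = {"A": make_dl([("F", 33, "B"), ("J", 39, "D")])}


def find_label(dl, h, v):
    """Binary search DL(h) for v; None means v is co-visible with h."""
    ranks, labels = dl[h]
    i = bisect_left(ranks, RANK[v])
    return labels[i] if i < len(ranks) and ranks[i] == RANK[v] else None


def distance(dl, h, u, v, euclid):
    """d(u, v) = d(u, h) + d(v, h), with h the highest via node."""
    total = 0
    for x in (u, v):
        if x != h:
            label = find_label(dl, h, x)
            total += label[1] if label else euclid(x, h)
    return total


def path_to_via(dl, h, u):
    """Follow first vertices stored in DL(h) from u until h is reached."""
    path = [u]
    while path[-1] != h:
        label = find_label(dl, h, path[-1])
        path.append(label[2] if label else h)  # no label: h is visible
    return path


no_euclid = lambda a, b: 0.0  # never called for F and J in this example
print(distance(DL, "A", "F", "J", no_euclid))  # 72
left = path_to_via(DL, "A", "F")               # ['F', 'B', 'A']
right = path_to_via(DL, "A", "J")              # ['J', 'D', 'A']
print(left + right[::-1][1:])                  # ['F', 'B', 'A', 'D', 'J']
```

This sketch performs a binary search for every step of the path. Storing the index \(p_i\) with each label, as described above, would replace those searches with direct accesses.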

4.1.2 Compressed i-th via-path database (CVPD\(^{i}\))

Now, we present a generalisation of the CVPD called Compressed i-th Via-Path Database (denoted as CVPD\(^{i}\)). While a CVPD records the highest ranked node on the shortest path between u and v, a CVPD\(^{i}\) stores the highest ranked node on the i-th longest plateau between u and v. We ignore the zero length plateaus (i.e., consisting of only a single vertex), and store a special symbol “-” in CVPD\(^{i}\) for such cases. Since the i-th longest plateau between u and v is the same as the i-th longest plateau between v and u, we store the highest ranked via node in the row of u only if \({u}<_B{v}\). It can be shown that the shortest path between u and v is the longest plateau between u and v. Therefore, CVPD is sometimes denoted as CVPD\(^{1}\) hereafter. In total, we create m CVPD\(^{i}\)s denoted as CVPD\(^{1}\),\(\cdots \), CVPD\(^{m}\). In our experiments, we evaluate the effect of m on preprocessing time, storage cost and query performance.

Table 5 Compressed 2nd Via-Path Database (CVPD\(^{2}\))

Example 6

Table 5 shows CVPD\(^{2}\) for our running example. Consider the row F. Similar to CVPD, we store “*” for nodes A to F. The second longest plateau between F and H is \(\langle D, J\rangle \) and CVPD\(^{2}\) stores the highest ranked node D on this plateau. Similarly, the second longest plateau between F and J is \(\langle G, H\rangle \) and we store the highest ranked node G on this plateau. For other nodes, we store “-” as the second longest plateaus between F and these nodes contain only a single vertex each.

Given a CVPD\(^{i}\), we can obtain the highest via node – the highest ranked node n on the i-th plateau between u and v – using a binary search on CVPD\(^{i}\). The via-path between u and v passing through n (i.e., \(sp(u,n)\oplus sp(n,v)\)) and its length can then be obtained using CVPD+DL as described earlier.

4.1.3 Advantages of CVPD\(^{i}\)s and DL

A major advantage of the proposed CVPD\(^{i}\)s in the context of alternative pathfinding is that these can be used to efficiently obtain the highest ranked nodes on the i longest plateaus between each pair of vertices visible from s and t. Such vertices are likely to be on high-quality alternative paths between s and t because longer plateaus typically generate better quality alternative paths [29]. CVPD+DL can then be used to efficiently recover these paths and/or find their lengths. While traditional CPDs can be used to recover the paths, CVPD+DL has some advantages. Firstly, the traditional CPD needs to recover the whole shortest path in order to find the distance \(d(u,v)\). On the other hand, CVPD+DL can find \(d(u,v)\) in at most three binary searches. Secondly, the CPD recovers the shortest path by recursively finding first moves which requires \(O(e\log {r})\) time whereas CVPD+DL recovers the shortest path in O(e) time once the shortest distance has been computed in logarithmic time. Furthermore, to obtain each successive first move, CPDs need to do a binary search in a different row of the CPD. In contrast, CVPD+DL can recover the shortest path using a single row DL(h) resulting in fewer cache misses.

4.1.4 Advantages compared to hub labels

Traditional hub labeling approaches [39] store a set of hub labels for each vertex v such that \(d(u,v)\) can be computed by finding the common hub nodes in the labels of u and v. Finding all common hub nodes requires a linear search on the labels of both u and v with complexity \(O(|HL(u)| + |HL(v)|)\) where |HL(x)| is the number of hub labels stored for a vertex x. In contrast, CVPD+DL does not need to find the common hub labels. Instead, the highest via node is found using a binary search on the compressed row of u assuming \({u}<_B{v}\). Also, to recover the shortest paths, the hub labeling approaches need to recursively obtain successors which requires accessing hub labels for different nodes (resulting in potentially more cache misses) whereas CVPD+DL can recover the shortest path using a single row DL(h). In the context of alternative pathfinding which is our main focus, traditional hub labeling cannot be used (or trivially extended) because the number of common hub nodes between two nodes may be less than k (and, even if there are at least k common hub nodes, they may not generate high-quality alternative paths as the hub nodes are not selected with an aim to generate alternative paths).
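For comparison, the linear scan used by a generic hub labeling query can be sketched as below. This is a textbook-style merge over two sorted label lists and is not taken from [39]. The function name `hub_distance` and the label values are invented for illustration.

```python
def hub_distance(label_u, label_v):
    """Smallest d(u, h) + d(h, v) over hubs h common to both labels.

    Each label is a list of (hub id, distance) pairs sorted by hub id,
    so one simultaneous pass over both lists finds every common hub.
    """
    i = j = 0
    best = float("inf")
    while i < len(label_u) and j < len(label_v):
        hub_u, dist_u = label_u[i]
        hub_v, dist_v = label_v[j]
        if hub_u == hub_v:
            best = min(best, dist_u + dist_v)
            i += 1
            j += 1
        elif hub_u < hub_v:
            i += 1
        else:
            j += 1
    return best


# Made-up labels: hubs 3 and 7 are shared.
HL_u = [(1, 4.0), (3, 2.0), (7, 9.0)]
HL_v = [(2, 1.0), (3, 5.0), (7, 1.0)]
print(hub_distance(HL_u, HL_v))  # 7.0
```

The loop touches every entry of both labels in the worst case, which is the \(O(|HL(u)| + |HL(v)|)\) cost discussed above.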

4.2 Online query processing

4.2.1 Algorithm

Details of our query processing algorithm are shown in Algorithm 1. If s and t are co-visible (which can be checked using ray shooting or Polyanya [15]), the path \(\langle s,t\rangle \) is added to the set of alternative paths \(\mathcal {P}\). Then, Polyanya is employed (as described in [53]) to retrieve the sets of convex vertices visible from s and t, denoted as \(V_s\) and \(V_t\), respectively. The algorithm then utilises the CVPD\(^{i}\)s to obtain a set of via nodes denoted as ViaNodes (lines 6 to 11). Specifically, for each pair of visible vertices u and v, the algorithm uses the CVPD\(^{i}\)s to obtain the highest i-th via nodes and adds these to ViaNodes. In the experiments, we evaluate the effect of m, the number of CVPD\(^{i}\)s used by the algorithm. We remark that although the maximum number of nodes in ViaNodes is \(|V_s|\times |V_t|\times m\), in practice, the number of nodes in ViaNodes is very small (less than 10 in most cases in our experiments) because the highest ranked nodes usually are the same between many pairs of visible vertices.

Once ViaNodes are found, the algorithm accesses each via node n and uses CVPD+DL to obtain the via-path \(sp(s,n)\oplus sp(n,t)\). If the via-path is taut and its length is not greater than \(d(s,t)\times \epsilon \), it is added to a list of via paths VPaths (the candidate alternative paths). Finally, the algorithm accesses each via path in ascending order of their lengths, and if its dissimilarity with the existing paths in \(\mathcal {P}\) is at least \(\theta \), it is added to \(\mathcal {P}\). Although any dissimilarity function can be employed, we compute the dissimilarity of a set of paths \(\mathcal {P}\) as \(1-Sim(\mathcal {P})\) (see (3) in Section 2). The algorithm returns \(\mathcal {P}\) when it contains k paths or when the algorithm terminates.
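The filtering and selection steps described above can be sketched as follows. This is a simplified illustration under several assumptions: candidate via-paths are assumed to be taut already and are given as (length, vertex list) pairs, the names `select_alternatives` and `placeholder_sim` are hypothetical, and `placeholder_sim` (largest pairwise Jaccard overlap of vertex sets) is only a stand-in for \(Sim(\mathcal {P})\) and is not the measure defined in Section 2.

```python
from itertools import combinations


def placeholder_sim(paths):
    """Stand-in for Sim(P): largest pairwise Jaccard overlap of vertex sets."""
    if len(paths) < 2:
        return 0.0
    return max(
        len(set(a) & set(b)) / len(set(a) | set(b))
        for a, b in combinations(paths, 2)
    )


def select_alternatives(candidates, d_st, k, epsilon, theta, sim=placeholder_sim):
    """Greedy selection of up to k via-paths in ascending order of length."""
    # Keep only candidates whose length is at most d(s, t) * epsilon.
    bounded = [c for c in candidates if c[0] <= d_st * epsilon]
    selected = []
    for length, path in sorted(bounded, key=lambda c: c[0]):
        if len(selected) == k:
            break
        trial = [p for _, p in selected] + [path]
        if 1.0 - sim(trial) >= theta:  # dissimilarity of the enlarged set
            selected.append((length, path))
    return selected


candidates = [
    (10, ["s", "a", "t"]),
    (11, ["s", "a", "b", "t"]),  # overlaps heavily with the first path
    (12, ["s", "c", "t"]),
    (20, ["s", "d", "t"]),       # longer than d(s, t) * epsilon
]
result = select_alternatives(candidates, d_st=10, k=3, epsilon=1.5, theta=0.4)
print([length for length, _ in result])  # [10, 12]
```

Any other dissimilarity function can be supplied through the `sim` argument without changing the selection loop.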

Algorithm 1 Efficient alternative pathfinding

We include many optimisations to the basic algorithm described above. Specifically, we remove the dead-end and non-taut vertices from \(V_s\) and \(V_t\) as discussed in [53]. Also, the vertices in \(V_s\) (resp. \(V_t\)) are accessed in ascending order of \(d(s,u)+EDist(u,t)\) (resp. \(d(t,v)+EDist(v,s)\)) to obtain the via nodes between more promising nodes earlier and, instead of accessing all nodes in \(V_s\) and \(V_t\), we stop after accessing at most N nodes each from \(V_s\) and \(V_t\). We evaluate the effect of N in experiments.
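The ordering of visible vertices can be illustrated with a short sketch. It assumes vertices are given as named 2D points and uses the fact that, for a vertex u visible from s, \(d(s,u)\) equals the Euclidean distance between s and u. The function name `most_promising` and the coordinates are invented for illustration, and the removal of dead-end and non-taut vertices is not shown.

```python
from math import dist


def most_promising(visible, origin, other, n):
    """Return the names of the n visible vertices with the smallest
    d(origin, u) + EDist(u, other).

    visible: list of (name, (x, y)) for vertices visible from origin.
    """
    def key(item):
        _, point = item
        # u is visible from origin, so d(origin, u) is a straight line.
        return dist(origin, point) + dist(point, other)

    return [name for name, _ in sorted(visible, key=key)[:n]]


s, t = (0.0, 0.0), (10.0, 0.0)
V_s = [("a", (1.0, 0.0)), ("b", (0.0, 5.0)), ("c", (2.0, 2.0))]
print(most_promising(V_s, s, t, 2))  # ['a', 'c']
```

The same function can be applied to \(V_t\) by swapping the roles of the two endpoints.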

4.2.2 Extending to other objective functions

Similar to the existing works in road networks, Algorithm 1 primarily aims to minimise the path lengths while satisfying a certain dissimilarity criterion. However, different users may have different requirements, e.g., one user may define a multi-objective function as a weighted sum of different measures (e.g., similarity, length, bounded stretch) and may want to retrieve alternative paths with the smallest weighted sum. On the other hand, another user may want to minimise one particular measure while imposing constraints on the other measures, e.g., return alternative paths with the smallest total length such that path similarity is at most 0.5 and bounded stretch is at most 0.3. Algorithm 1 can be easily adapted to meet the demands of these different users. Specifically, at line 18, the algorithm currently accesses the candidate paths VPaths in ascending order of lengths and, at line 19, filters a path if adding it will violate the path similarity constraint. If the objective function is a weighted sum, the algorithm can process the candidate paths in ascending order of their weighted sum at line 18, and apply additional filters at line 19 if needed (e.g., if constraints on bounded stretch are required by the user, the path that violates the bounded stretch constraint can be filtered).
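As a rough illustration of this adaptation, the sketch below orders candidates by a weighted sum and applies optional upper-bound filters. It assumes, purely for illustration, that each candidate already carries per-path values for the measures involved. The function name `rank_by_weighted_sum`, the measure names and all numbers are hypothetical.

```python
def rank_by_weighted_sum(candidates, weights, limits=None):
    """Order candidates by a weighted sum of their measures.

    candidates: list of dicts with a "name" and a "measures" dict.
    weights:    measure name -> weight used in the objective.
    limits:     optional measure name -> upper bound; violators are dropped.
    """
    limits = limits or {}

    def score(measures):
        return sum(w * measures[name] for name, w in weights.items())

    kept = [
        c for c in candidates
        if all(c["measures"][name] <= bound for name, bound in limits.items())
    ]
    return sorted(kept, key=lambda c: score(c["measures"]))


candidates = [
    {"name": "p1", "measures": {"length": 10, "similarity": 0.6, "stretch": 0.1}},
    {"name": "p2", "measures": {"length": 12, "similarity": 0.2, "stretch": 0.2}},
    {"name": "p3", "measures": {"length": 11, "similarity": 0.4, "stretch": 0.5}},
]
weights = {"length": 1.0, "similarity": 10.0}

ranked = rank_by_weighted_sum(candidates, weights)
print([c["name"] for c in ranked])    # ['p2', 'p3', 'p1']

filtered = rank_by_weighted_sum(candidates, weights, limits={"stretch": 0.3})
print([c["name"] for c in filtered])  # ['p2', 'p1']
```

Only the sort key and the filter change. The rest of the query processing is unaffected.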

5 Experiments

5.1 Settings

Following the experimental settings of previous works on pathfinding in game maps, we conduct experiments on the widely used benchmarks (Footnote 6) with a total of 298 game maps, as described in [57]. The benchmarks contain 67 maps from Dragon Age II (DA), 156 maps from Dragon Age Origins (DAO), and 75 maps from Baldur’s Gate II (BG) (see Table 6 which also shows the total number of queries for each benchmark, average number of vertices in each map and average number of convex vertices in each map).

Algorithm 1 is shown as CVPD(N) in the experimental results where the value of N indicates the maximum number of vertices processed from each of \(V_s\) and \(V_t\) at lines 7 and 8. We construct three CVPD\(^{i}\)s and evaluate the effect of using different numbers of CVPD\(^{i}\)s in the experiments. We evaluate our algorithm, CVPD(N), against some of the most well-known existing techniques: Plateaus, Dissimilarity, and Penalty which are shown as Pla, Dissim, and Pen, respectively, in the results.

Table 6 Benchmark stats include total # of maps (#M) and total # of queries (#Q) in each benchmark, and average # of vertices (#V) and convex vertices (#CV) per map. For each benchmark, we also show average build time (mins) per map for constructing three CVPD\(^{i}\)s and DL as well as average memory (MB) per map for the three CVPD\(^{i}\)s and DL

We set the penalty factor for the Penalty approach to 1.4. The dissimilarity threshold \(\theta \) for Dissimilarity and CVPD(N) was set to 0.6. The penalty factor and dissimilarity threshold were selected after trying different values and choosing the best values. \(\epsilon \) was set to 1.5 for each approach. The default value of k was set to 3.

In addition to the comparison with the three alternative pathfinding algorithms mentioned above, we also included Polyanya [15] (displayed as Poly) as a competitor. Polyanya is the state-of-the-art online shortest path algorithm in game maps. Please note that Polyanya only finds the shortest path and not the alternative paths. However, we included Polyanya in the experimental study to demonstrate the overhead cost of finding the alternative paths compared to only finding the shortest path. The source code of Polyanya was provided by its authors (Footnote 7).

All algorithms were implemented in C++ and compiled with GNU GCC 4.8. We conduct experiments on a Linux (64-bit) dedicated NeCTAR server m1.xxlarge instance with Intel Core Processor 2.9GHz 16-core CPUs and 16GB DDR4-1866 memory.

We evaluate the algorithms considering the pre-processing cost (build time and memory required), query runtimes, and the quality of the alternative paths found by each algorithm. In order to evaluate the quality of the returned alternative paths \(\mathcal {P}\), we use bounded stretch \(BS(\mathcal {P})\) (lower is better), local optimality \(LO(\mathcal {P})\) (higher is better), and similarity \(Sim(\mathcal {P})\) (lower is better) as defined in Section 2.2. In our experiments, we report average values over all queries for each of these measures. In addition, we report the maximum of \(BS(\mathcal {P})\) and \(Sim(\mathcal {P})\). Note that the maximum of \(BS(\mathcal {P})\) and \(Sim(\mathcal {P})\) correspond to the worst-case bounded stretch and similarity for an algorithm across all queries. Furthermore, we report the minimum of \(LO(\mathcal {P})\) considering all queries, which represents the worst case for local optimality. In some cases, an approach may only be able to return fewer than k alternative paths. Such an approach may get better quantitative scores only because it generated fewer than k alternative paths, which is unfair. For this reason, we only take into consideration the queries for which all approaches return exactly k alternative paths.

5.2 Results

5.2.1 Preprocessing time and memory

Table 6 shows the average build time (in minutes) per map for constructing three CVPD\(^{i}\)s and the distance labels (DL). It also shows the memory required by each CVPD\(^{i}\) and DL (in MB). The results show that the build time and the memory required by the CVPDs and DL are quite small. CVPD\(^{1}\) consumes significantly more memory than CVPD\(^{2}\) and CVPD\(^{3}\). This is because for CVPD\(^{2}\) and CVPD\(^{3}\), for many pairs of vertices especially those that are close to each other, the 2nd (or 3rd) plateaus do not exist resulting in many cells having the special symbol “-” which leads to better compression.

Fig. 5 The x-axis shows the percentile ranks of queries by the number of node expansions needed by A* search to solve them

5.2.2 Query runtimes

Figure 5 shows query runtimes for all algorithms on DA, DAO, and BG maps (please note the log-scale on the y-axis). In each figure, the x-axis ranks the queries roughly in the order of their difficulty. Specifically, following the existing shortest pathfinding approaches, we sort the queries based on the number of node expansions required by the standard A* search to solve them (which serves as a proxy of how challenging a query is). The x-axis denotes the percentile ranks of queries in this order. As Figure 5 shows, Plateaus and Dissimilarity exhibit almost equal query times because both need to compute the shortest path trees rooted at the source and target, which is the dominant cost. Penalty is more expensive because it requires the iterative computation of k unique paths. We show the performance of CVPD for \(N=15\), \(N=30\) and \(N=\) All, which refers to the case when it processes all visible vertices from both ends. CVPD is more than an order of magnitude faster than Plateaus, Dissimilarity, and Penalty. Furthermore, its query time is comparable to (or even better than) Polyanya for the more challenging queries. We also show the performance of our algorithm when it uses one, two, or three CVPD\(^{i}\)s. For example, Figures 5(a), (b), and (c) show the runtimes of CVPD when it uses one, two and three CVPD\(^{i}\)s, respectively. As expected, the query processing times increase as the number of CVPD\(^{i}\)s used by the algorithm increases (please note that the y-axis is in log-scale). However, in all cases, CVPD significantly outperforms the existing alternative pathfinding algorithms.

5.2.3 Varying k

Figures 6(a), (b), and (c) display the average query processing time when k is varied from 2 to 5. The cost of Plateaus and Dissimilarity is independent of k. The reason is that the dominant cost of those two algorithms is constructing the forward and backward shortest path trees and this construction cost does not depend on the value of k. The cost of Penalty increases with k because the algorithm involves k iterations to get the result. We only run Polyanya for k = 1 as the algorithm only generates the shortest path (and therefore is expected to be faster than the other algorithms that generate k alternative paths). We show the performance of CVPD for \(N=15\), \(N=30\), and \(N=\) All when it uses two CVPD\(^{i}\)s. Again, CVPD outperforms the other alternative pathfinding algorithms by more than an order of magnitude and has performance comparable to Polyanya. Interestingly, CVPD outperforms Polyanya when \(N=15\) and \(N=30\) on the DAO benchmark.

Fig. 6 Effect of varying k

Table 7 Quality of alternative paths on DA, DAO and BG maps. We show \(BS(\mathcal {P})\) (smaller the better), \(Sim(\mathcal {P})\) (smaller the better), \(LO(\mathcal {P})\) (larger the better) and average path length. Best values for each column are shown in bold. x-CVPD(N) corresponds to the case when x CVPD\(^{i}\)s are used by our algorithm and N vertices are processed from each of \(V_s\) and \(V_t\)

5.2.4 Quality of alternative paths

Table 7 compares the quality of alternative paths that different algorithms generate. As the table shows, the quality of alternative paths generated by our approach is comparable to the existing approaches. However, as our experimental study showed, our approach is significantly faster than the existing approaches, i.e., by more than an order of magnitude. The average bounded stretch produced by our approach is better than that of Plateaus and is similar to that of Dissimilarity and Penalty. With regard to average similarity, our approach ranks second, following Penalty which performs the best. However, Penalty underperforms in terms of the worst-case (i.e., max) similarity (with maximum similarity around 0.9), whereas our approach and Dissimilarity ensure a maximum similarity of no more than 0.4. In terms of local optimality, our algorithm outperforms the other algorithms. Dissimilarity is the best approach in terms of average path length followed by our approach. We note that using more CVPD\(^{i}\)s reduces the average path length of our approach but does not have a clear benefit in terms of the other measures.

6 Conclusions

To the best of our knowledge, we present the first comprehensive study on alternative pathfinding in game maps. First, we adapt the previous works, specifically designed for finding alternative paths in road networks, to find the alternative paths in game maps. Then, based on a web-based system that allows users to visualise the paths generated by different approaches, we conduct a user study which demonstrates that these previous approaches are capable of generating high-quality alternative paths in game maps. One limitation of the existing techniques is their high query runtimes. To address this, we propose an efficient algorithm to generate alternative paths using a novel variation of CPDs called Compressed Via-Path Database (CVPD). Our experimental study shows that the proposed approach is more than an order of magnitude faster than the existing alternative pathfinding approaches and generates alternative paths of similar quality.