Finding k-shortest paths with limited overlap

In this paper, we investigate the computation of alternative paths between two locations in a road network. More specifically, we study the k-shortest paths with limited overlap (kSPwLO) problem that aims at finding a set of k paths such that all paths are sufficiently dissimilar to each other and as short as possible. To compute kSPwLO queries, we propose two exact algorithms, termed OnePass and MultiPass, and we formally prove that MultiPass is optimal in terms of complexity. We also study two classes of heuristic algorithms: (a) performance-oriented heuristic algorithms that trade shortness for performance, i.e., they reduce query processing time, but do not guarantee that the length of each subsequent result is minimum; and (b) completeness-oriented heuristic algorithms that trade dissimilarity for completeness, i.e., they relax the similarity constraint to return a result that contains exactly k paths. An extensive experimental analysis on real road networks demonstrates the efficiency of our proposed solutions in terms of runtime and quality of the result.


Introduction
Computing the shortest path between two locations in a road network is a fundamental problem that has attracted lots of attention by both the research community and industry. However, in many real-world scenarios, determining solely the shortest path is not enough. For instance, users of navigation systems are interested in alternative paths that might be longer than the shortest path but have other desirable properties. Another scenario where alternative routes are useful is transport of humanitarian aid goods through unsafe regions. The distribution of the load to a fleet of vehicles that follow non-overlapping routes increases the chances that at least some of the goods will be delivered. The need for alternative routes also arises in emergency situations, such as natural disasters and terrorist attacks. To avoid panic and potential catastrophic collisions while dealing with the aftermath of such events, evacuation plans should include alternative paths that overlap as little as possible.
A first take on providing alternative routes is to solve the K-shortest paths problem [16,23,37]. In most cases though, the returned paths share large stretches, and therefore, they are not helpful in scenarios such as the aforementioned ones. Consider Fig. 1, which shows three paths connecting two locations in the city of Oldenburg. The solid/black line indicates the shortest path, the dashed/red line indicates the next path by length, which however is very similar to the shortest path, and the dotted/blue line indicates a path that is clearly longer, but significantly different from the shortest path as it passes through a very distant part of the city's road network. In scenarios like the ones mentioned above, the dotted/blue path is a better alternative to the shortest path than the dashed/red one.
Fig. 1 Motivating example
Existing literature has approached alternative routing from different perspectives. Notable works include methods that compute alternative routes either by incrementally building a set of dissimilar paths [18] or by employing edge penalties [3]. These methods provide no formal definition of the alternative routing problem, and typically no guarantee regarding the length of the recommended paths. Other approaches [2,4,8] first generate a large set of candidates and, in a postprocessing step, apply a number of constraints to determine the final result. In these works, alternative paths are defined solely on their individual similarity to the shortest path. This yields paths that are very similar to each other and, hence, of limited interest in many applications.
In this paper, we formalize alternative routing as the k-Shortest Paths with Limited Overlap (kSPwLO) problem. A kSPwLO query aims at finding a set of k paths that are (a) sufficiently dissimilar to each other with respect to a user-defined similarity threshold θ, and (b) as short as possible. We prove that the kSPwLO problem is weakly NP-hard, and we propose two exact algorithms that traverse the road network by expanding paths from the source in length order, while employing pruning criteria to reduce the number of paths that need to be examined.
To balance performance against result quality, we present two classes of heuristic algorithms. First, to enable the processing of kSPwLO queries on large road networks, we propose three performance-oriented heuristic algorithms that trade shortness (i.e., how short the recommended alternative paths are) for performance. These algorithms drastically reduce the number of examined paths and therefore scale to large road networks, but they do not guarantee that the returned paths are as short as possible.
Strictly abiding by the similarity constraint may prevent a kSPwLO algorithm from finding exactly k sufficiently dissimilar paths. In many cases though, returning a complete result set of k paths is more important than abiding by the similarity constraint. For this purpose, we introduce a procedure to gradually relax the similarity constraint, trading dissimilarity for completeness. Based on this procedure, we present two completeness-oriented heuristic algorithms, which guarantee that the result set contains exactly k paths.
This paper extends our previous work [9,10] where we presented the following contributions:
- We formally defined the k-shortest paths with limited overlap (kSPwLO) problem for computing alternative routes on road networks (Sect. 4).
- We introduced two exact algorithms for kSPwLO queries: OnePass traverses the road network once expanding only paths that qualify the similarity constraint; MultiPass improves OnePass by employing an additional pruning criterion and traversing the network k−1 times (Sect. 5).
- We presented three performance-oriented heuristic algorithms that limit the number of examined paths but do not minimize the length of each subsequent result: OnePass+ employs the pruning power of MultiPass but traverses the network only once; SVP+ selects alternative paths from the set of single-via paths [2]; ESX removes edges from the road network incrementally and computes the shortest path on the updated network (Sect. 6).
As an extension to our previous work, in this paper we present the following theoretical and technical contributions:
- We prove that the kSPwLO problem is weakly NP-hard (Sect. 4).
- We present comprehensive complexity analyses for all proposed algorithms (Sects. 5-7).
- We prove that MultiPass is optimal for the kSPwLO problem (Sect. 5).
- We examine additional edge removal criteria for ESX in order to further improve its performance and result quality (Sect. 6.4).
- We present the Complete_kSPwLO function that gradually relaxes the similarity constraint, and we discuss two completeness-oriented heuristic algorithms, termed ESX-C and SVP-C, that employ this function to always compute a complete result set of k paths (Sect. 7).
Through an extensive experimental analysis on real road networks, we evaluate both the algorithms presented in our previous works and the new ones in terms of runtime, quality of alternative paths and completeness of the result set (Sect. 9).
Pairwise dissimilar paths To the best of our knowledge, works that aim at computing a set of pairwise dissimilar paths are the closest ones to our own.
Jeong's algorithm [18] aims at computing dissimilar alternative paths by directly extending Yen's algorithm [37]. Given a length limit x and a similarity threshold y, the goal is to incrementally compute k paths not longer than x and the similarity between any two paths does not exceed y. At each step, the algorithm modifies all previously computed paths to obtain a set of candidate paths and examines the candidate path that is most dissimilar to the already recommended paths. While dissimilarity is guaranteed, in contrast to our approach, the algorithm does not minimize the length of each subsequent path in the result.
Another strategy to compute dissimilar alternative paths is to iteratively apply a penalty on the weights of edges that lie on previously computed paths. Akgun et al. [3] proposed a method that computes alternative paths by repeatedly computing the shortest path on the road network, each time with updated weights. The main shortcoming of this approach is that there is no intuition behind the penalty applied in each iteration. A large penalty would result in dissimilar but possibly long alternative paths, whereas a small penalty would require the algorithm to execute more iterations. Similar to Jeong's algorithm, penalty-based methods make no effort to minimize the length of the alternative paths.
Another approach is the k-dissimilar paths with minimum collective length (k-DPwML) problem introduced by Liu et al. [22]. In contrast to kSPwLO, a k-DPwML query computes the set of k sufficiently dissimilar paths w.r.t. a similarity threshold θ , that exhibits the lowest collective path length among all sets of k sufficiently dissimilar paths. As shown by Chondrogiannis et al. [11], the requirement to minimize the collective length of the result renders the problem strongly NP-hard and its exact computation prohibitively expensive. Note that Liu et al. [22] did not study the exact computation of the k-DPwML problem. Instead, the authors proposed a greedy approach FindKSPD, which solves our kSPwLO problem. In fact, the kSPwLO can be seen as an approximation to the much harder but clearly less practical k-DPwML problem.
Candidate sets A different definition of alternative routing is to compute paths that are alternatives only to the shortest path. The Plateaux method [8] aims at computing paths that cross different highways of the road network. Bader et al. [4] introduced the concept of alternative graphs, which have a similar functionality as the plateaux. Abraham et al. [2] introduced the notion of single-via paths, which we adopt and extend for developing one of our heuristic algorithms, i.e., SVP+. The proposed approach evaluates each single-via path individually by comparing it to the shortest path and checks whether it meets a set of user-defined constraints, i.e., local optimality and stretch.
Segment avoidance Another definition of alternative route is to compute paths that avoid certain segments of the road network. Xie et al. [35] study the computation of paths that avoid specific edges of the road network and then introduce iSQPF, a spatial data structure that extends the shortest path quadtree [31], to enable the efficient computation of such paths. The concept of segment avoidance has also been studied in the context of traffic management. Methods in this category utilize traffic information obtained from trajectory data [39], from sensor networks [24] or from VANETs [17]. They aim to identify congested segments of the road network and to compute paths that avoid them. Xu et al. [36] proposed a first-cut approach to compute traffic-aware routes on dynamic road networks. Li et al. [21] utilized historical traffic information and study the computation of the k traffic-tolerant paths, i.e., the paths with the minimum (historic) travel time.
Multi-objective path planning The computation of multiple routes has also been approached as a multi-objective problem. Pareto-optimal paths [13,25] and the route skyline [20] can be directly seen as alternative routes, or they can be further examined in a post-processing phase to obtain the final alternative paths. Another approach involves solving a multiobjective traffic assignment problem [27,29]. Works in this direction aim at assigning paths to different users while optimizing for a set of user preferences. Such approaches are frequently employed in urban traffic management systems to achieve flow optimization [26].
Popular route extraction Finally, there are also historical data-based methods that aim at analyzing and mining trajectory data in order to extract popular routes [6,7,34,38]. Popularity is usually measured by the number of trajectories that cross a specific edge/segment. The more popular the edges/segments of a route are, the more popular the route is. This line of work aims at exploiting the wisdom of the crowd and recommending routes that are frequently used by experienced users, e.g., taxi drivers.

Preliminaries
Let road network $G = (N, E)$ be represented by a directed weighted graph with a set of nodes $N$ and a set of edges $E \subseteq N \times N$. The nodes of the graph represent road intersections, and edges represent road segments. Every edge $(n_i, n_j) \in E$ is assigned a positive weight $w(n_i, n_j)$, which captures the cost of moving from node $n_i$ to node $n_j$. This weight can represent any positive cost, e.g., distance and travel time, or even a composite cost, e.g., a linear combination of travel time with financial cost. A (simple) path $p(s \to t)$ from a source node $s$ to a target node $t$ is a connected and cycle-free sequence of edges $\langle (s, n_i), \ldots, (n_j, t) \rangle$. The length $\ell(p)$ of a path $p$ is the sum of the weights of all contained edges. The shortest path $p_{sp}(s \to t)$ is the path with the lowest length among all paths that connect nodes $s$ and $t$.
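As a minimal illustration of these definitions, the following Python sketch stores a road network as a dictionary of weighted directed edges and computes the length of a path given as a node sequence. The toy network and its weights are hypothetical (they are not the network of Fig. 2) and were chosen only so that the resulting path lengths agree with values quoted in the examples of this paper; the names `W`, `edges_of`, `is_simple_path` and `length` are likewise illustrative assumptions.

```python
# Hypothetical toy network: directed edge (u, v) -> positive weight w(u, v).
W = {
    ("s", "n3"): 3, ("n3", "n5"): 3, ("n5", "t"): 2,
    ("n3", "n4"): 5, ("n4", "t"): 2,
    ("s", "n2"): 4, ("n2", "n4"): 5, ("n5", "n4"): 1,
}

def edges_of(nodes):
    """Turn a node sequence [s, ..., t] into its list of edges."""
    return list(zip(nodes, nodes[1:]))

def is_simple_path(nodes, w):
    """A simple path uses only existing edges and visits no node twice."""
    return len(set(nodes)) == len(nodes) and all(e in w for e in edges_of(nodes))

def length(nodes, w):
    """Length of a path: the sum of the weights of its edges."""
    return sum(w[e] for e in edges_of(nodes))

p1 = ["s", "n3", "n5", "t"]
print(is_simple_path(p1, W), length(p1, W))  # True 8
```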
Let $p$ and $p'$ be two paths from $s$ to $t$. The similarity $Sim$ of $p$ and $p'$ is defined by their overlap ratio [2], i.e.,
$$Sim(p', p) = \frac{\sum_{(n_x, n_y) \in p' \cap p} w(n_x, n_y)}{\min\{\ell(p'), \ell(p)\}} \qquad (1)$$
where $p' \cap p$ denotes the set of edges shared by $p$ and $p'$. For the similarity, we have $0 \le Sim(p', p) \le 1$, where $Sim(p', p) = 0$ if path $p'$ shares no edges with $p$, while $Sim(p', p) = 1$ holds if $p' \equiv p$. Since only simple cycle-free paths are considered, the similarity between different paths is strictly lower than 1.
While various measures to compute the similarity between two paths have been proposed [22], we argue that the similarity measure of Eq. 1 is the most suitable one for alternative routing on road networks, as it enables us to disregard needlessly long paths when searching for alternative paths. In practice, there is no value in defining an alternative path p to a path p, if p is shorter than p. The shortest of two paths will always be the first option, and the longer one will be the alternative.
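The overlap ratio can be sketched in a few lines of Python. The sketch assumes that the shared edge weight is divided by the length of the shorter of the two paths, which matches the argument above that the shorter path always acts as the reference. The function name `overlap_ratio` and the toy weights in `W` are hypothetical; the weights were picked so that the printed values coincide with the similarity values 6/8 and 2/10 used in the examples of this paper.

```python
# Hypothetical toy network: directed edge (u, v) -> positive weight.
W = {
    ("s", "n3"): 3, ("n3", "n5"): 3, ("n5", "t"): 2,
    ("n3", "n4"): 5, ("n4", "t"): 2,
    ("s", "n2"): 4, ("n2", "n4"): 5, ("n5", "n4"): 1,
}

def overlap_ratio(p, q, w):
    """Weight of the shared edges divided by the length of the shorter path.

    p and q are simple paths given as node sequences from s to t.
    """
    edges_p = set(zip(p, p[1:]))
    edges_q = set(zip(q, q[1:]))
    shared = sum(w[e] for e in edges_p & edges_q)
    shorter = min(sum(w[e] for e in edges_p), sum(w[e] for e in edges_q))
    return shared / shorter

p1 = ["s", "n3", "n5", "t"]          # length 8
p2 = ["s", "n3", "n4", "t"]          # length 10
p3 = ["s", "n3", "n5", "n4", "t"]    # shares (s,n3) and (n3,n5) with p1
p4 = ["s", "n2", "n4", "t"]          # shares only (n4,t) with p2

print(overlap_ratio(p3, p1, W))  # 0.75
print(overlap_ratio(p4, p1, W))  # 0.0
print(overlap_ratio(p4, p2, W))  # 0.2
```

Because the denominator is the minimum of the two lengths, the measure is symmetric in its arguments.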
Given a similarity threshold θ, path $p'$ is called an alternative path to $p$ if $p'$ is sufficiently dissimilar to $p$, i.e., $Sim(p', p) < \theta$. We also call a path $p'$ alternative to a set of paths $P$ if $p'$ is sufficiently dissimilar to every path in $P$.
Definition 1 (Alternative Path) Let $P$ be a set of paths from $s$ to $t$ and θ be a similarity threshold. A path $p'$ from $s$ to $t$ is alternative to set $P$ iff $\forall p \in P : Sim(p', p) \le \theta$.

Example 1 Consider the road network in Fig. 2. Let set $P = \{p_1, p_2\}$, where $p_1 = \langle (s, n_3), (n_3, n_5), (n_5, t) \rangle$ and $p_2 = \langle (s, n_3), (n_3, n_4), (n_4, t) \rangle$ with $\ell(p_1) = 8$ and $\ell(p_2) = 10$, respectively. Furthermore, assume a similarity threshold θ = 0.5. Path $p_3 = \langle (s, n_3), (n_3, n_5), (n_5, n_4), (n_4, t) \rangle$ shares edges $(s, n_3)$ and $(n_3, n_5)$ with $p_1$, yielding $Sim(p_3, p_1) = 6/8 = 0.75 > \theta$. Therefore, $p_3$ is not an alternative path to $P$. On the contrary, the similarity of path $p_4 = \langle (s, n_2), (n_2, n_4), (n_4, t) \rangle$ to $p_1$ and $p_2$ is $Sim(p_4, p_1) = 0 < \theta$ and $Sim(p_4, p_2) = 2/10 = 0.2 < \theta$, respectively. Therefore, $p_4$ is an alternative path to $P$.

Intuitively, the goal of a kSPwLO($G, s, t, k, \theta$) query is to identify a set of $k$ paths from a source node $s$ to a target node $t$, such that (a) the shortest path $p_{sp}(s \to t)$ is always returned, (b) all returned paths are sufficiently dissimilar to each other with respect to a given similarity threshold θ, and (c) the paths are as short as possible. This is captured in the following definition.
Definition 2 (kSPwLO Problem) Given a road network $G = (N, E)$, a source node $s$ and a target node $t$ both in $N$, a requested number of paths $k$, and a similarity threshold $\theta \in [0, 1]$, find the set $P_{LO}$ of $k$ paths from $s$ to $t$, such that:
(A) all paths in $P_{LO}$ are sufficiently dissimilar to each other, i.e.,
$$\forall p_i, p_j \in P_{LO},\ i \neq j : Sim(p_i, p_j) \le \theta,$$
(B) every path $p \notin P_{LO}$ is either too long or too similar to a shorter path in $P_{LO}$, i.e., one of the following two conditions holds for $p$:
$$\forall p' \in P_{LO} : \ell(p') \le \ell(p) \qquad \text{or} \qquad \exists p' \in P_{LO} : \ell(p') \le \ell(p) \wedge Sim(p, p') > \theta.$$
Condition (A) ensures the dissimilarity of the recommended paths, i.e., all paths are sufficiently dissimilar to each other w.r.t. the given similarity threshold θ. Condition (B) guarantees that each path $p$ in $P_{LO}$ is the shortest possible path that is sufficiently dissimilar to all paths in $P_{LO}$ shorter than $p$. As a result, the shortest path $p_{sp}(s \to t)$ is always part of $P_{LO}$.

[...] Among all other paths from $s$ to $t$, path $p_2$ is the shortest one that is sufficiently dissimilar to $p_1$, i.e., $Sim(p_2, p_1) = 3/8 = 0.375 < \theta$. Subsequently, among all remaining paths from $s$ to $t$, $p_3$ is the shortest one that is sufficiently dissimilar to both $p_1$ and $p_2$, i.e., $Sim(p_3, p_1) = 0 < \theta$ and $Sim(p_3, p_2) = 2/10 = 0.2 < \theta$.

Complexity analysis
Next, we elaborate on the complexity of the kSPwLO problem.

Theorem 1 The kSPwLO problem is weakly NP-hard.
Proof We prove the theorem by reduction from the subset sum problem, a well-known weakly NP-complete problem [12]. Given natural numbers $a_1, \ldots, a_m \in \mathbb{N}$ and $S \in \mathbb{N}$, the subset sum problem asks whether there is an index set $I \subseteq \{1, \ldots, m\}$ such that $\sum_{i \in I} a_i = S$. For reducing this problem to kSPwLO, we fix an instance [...] Since the size of the constructed instance of kSPwLO is polynomial in the size of $(\{a_i\}_{i=1}^{m}, S)$, this proves the theorem; if there was a polynomial-time algorithm for kSPwLO, then we could solve the subset sum problem in polynomial time.
For proving our claim, we first note that the shortest $(n_0, n_m)$-path $p_{sp}$ equals $\{(n_{i-1}, n_i) \mid i = 1, \ldots, m\}$ and has length $\ell(p_{sp}) = A$. Let $p$ be an $(n_0, n_m)$-path with $Sim(p, p_{sp}) \le \theta$ and let $p \cap p_{sp}$ be the intersection of $p$ with $p_{sp}$. Note that $p \cap p_{sp}$ completely defines the path $p$: For each $(n_{i-1}, n_i) \in p_{sp} \setminus p$, we know that $p$ contains the edges $(n_{i-1}, n'_i)$ and $(n'_i, n_i)$. We now make two observations. First, we note that, because of the definition of $Sim$, the choice of θ, and $\ell(p_{sp}) = A$, it holds that $\ell(p \cap p_{sp}) \le S$. Second, by construction, [...]. These two observations directly imply the claim and prove the theorem.

Computing kSPwLO queries
A naïve approach for computing kSPwLO queries is to enumerate all paths from $s$ to $t$ and choose the subset that satisfies Definition 2. This is clearly impractical. A more efficient approach involves the examination of paths in increasing length order. After adding the shortest path $p_{sp}(s \to t)$ to the result set $P_{LO}$, every next path $p$ in length order is constructed using some algorithm for the K-shortest paths [16,23,37]. If $p$ is an alternative to $P_{LO}$, then $p$ is added to the result. This process continues until $P_{LO}$ contains $k$ paths or all paths from $s$ to $t$ have been examined. Despite its simplicity, this approach is not practical even for small road networks. Chondrogiannis et al. [9] introduced the BSL algorithm that captures this approach and showed that the number of constructed paths is prohibitively high. In the worst case, all paths from $s$ to $t$ have to be constructed; counting these paths is a well-known #P-complete problem [33].
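The length-order strategy can be sketched as follows. This is a brute-force illustration, not the BSL algorithm of [9]: instead of a K-shortest paths algorithm it enumerates all simple paths with a depth-first search and sorts them by length, which is only viable on tiny graphs. The adjacency map `ADJ`, the function names, and the assumption that the overlap ratio divides by the length of the shorter path are all hypothetical choices made for this sketch.

```python
# Hypothetical toy network as adjacency map: ADJ[u][v] = weight of edge (u, v).
ADJ = {
    "s": {"n3": 3, "n2": 4},
    "n3": {"n5": 3, "n4": 5},
    "n5": {"t": 2, "n4": 1},
    "n4": {"t": 2},
    "n2": {"n4": 5},
}

def simple_paths(adj, s, t):
    """Enumerate all simple s-t paths with an iterative depth-first search."""
    stack = [[s]]
    while stack:
        path = stack.pop()
        if path[-1] == t:
            yield path
            continue
        for nxt in adj.get(path[-1], {}):
            if nxt not in path:          # keep the path cycle-free
                stack.append(path + [nxt])

def length(path, adj):
    return sum(adj[u][v] for u, v in zip(path, path[1:]))

def sim(p, q, adj):
    """Overlap ratio; the shorter path is assumed to set the denominator."""
    shared = set(zip(p, p[1:])) & set(zip(q, q[1:]))
    return sum(adj[u][v] for u, v in shared) / min(length(p, adj), length(q, adj))

def baseline_kspwlo(adj, s, t, k, theta):
    """Scan paths in increasing length; keep those alternative to the result."""
    result = []
    for p in sorted(simple_paths(adj, s, t), key=lambda q: length(q, adj)):
        if len(result) == k:
            break
        if all(sim(p, q, adj) <= theta for q in result):
            result.append(p)
    return result

print([length(p, ADJ) for p in baseline_kspwlo(ADJ, "s", "t", 3, 0.5)])  # [8, 10, 11]
```

On this toy network the path of length 9 is skipped because it overlaps the shortest path by 6/8, which illustrates why many constructed paths never enter the result.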
To improve computation even further, Liu et al. [22] proposed the FindKSPD algorithm that employs two lower bounds. The first bound is determined using a reverse shortest path tree, while the second is derived from the similarity function of Eq. 1. These bounds prioritize the examination of paths that are more likely to lead to the next shortest alternative path. In practice though, the number of examined paths is still very high. Our experiments in Sect. 9 show that our exact algorithms clearly outperform FindKSPD.

Incomplete solutions
Regardless of the approach, it is important to note that computing an exact solution to a kSPwLO query is not always possible. For instance, consider the query kSPwLO($G, s, t, 5, 0.3$) on our running example in Fig. 2. By examining paths in length order to construct the $P_{LO}$ result, we obtain the set $\{p_1, p_4, p_{11}\}$ that contains fewer than the requested five paths but still satisfies both conditions of Definition 2. We call such a set of paths an incomplete solution. Clearly, if an exact algorithm returns an incomplete result, then an exact solution for the given combination of $k$ and θ does not exist. Nevertheless, as an incomplete result may still be meaningful to the user, our algorithms in Sects. 5 and 6 return the incomplete result if a complete solution does not exist.

Extending kSPwLO
Throughout this paper, we consider a single optimization criterion (i.e., edge weight) and a single constraint (i.e., path overlap) for computing alternative routes. However, our problem definition and our solutions can be adapted to take into account more optimization criteria and/or constraints. A direct approach would be to assign composite weights to the edges of the road network using a linear combination of multiple criteria. Standard multicriteria optimization can also be supported, as the term 'shortest' can be interpreted as 'the best path according to a set of optimization criteria'. This, however, would also increase the complexity of the problem. With regard to additional constraints, despite focusing on the path overlap, our problem definition and all the algorithms we present support any monotonic similarity measure. Nevertheless, our aim is to provide a general purpose solution. Investigating optimization criteria and/or constraints that might be interesting in specific application scenarios is out of the scope of this paper.
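The composite-weight approach mentioned above can be illustrated with a short Python sketch. The two criteria (travel time and toll cost), the coefficients `alpha` and `beta`, and all names are hypothetical; the only requirement taken from the problem definition is that the resulting edge weights stay positive.

```python
def composite_weight(travel_time, toll, alpha=1.0, beta=0.5):
    """Linear combination of two per-edge criteria into a single weight."""
    return alpha * travel_time + beta * toll

# Hypothetical per-edge criteria: edge -> (travel time, toll cost).
CRITERIA = {
    ("s", "n3"): (3, 0),
    ("n3", "n5"): (3, 2),
    ("n5", "t"): (2, 0),
}

# Single-criterion edge weights that any kSPwLO algorithm can consume unchanged.
W = {edge: composite_weight(tt, toll) for edge, (tt, toll) in CRITERIA.items()}

assert all(weight > 0 for weight in W.values())
print(W[("n3", "n5")])  # 4.0
```

Since the combination collapses all criteria into one number per edge, the algorithms themselves need no modification; only the weight function changes.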

Exact algorithms
In this section, we investigate the exact computation of kSPwLO queries, and we propose two label-setting algorithms that traverse the road network examining paths in length order.

The ONEPASS algorithm
OnePass, our first exact algorithm, traverses the road network expanding paths from the source node $s$ while pruning, as early as possible, partially expanded paths that cannot lead to a result. We call such paths infeasible. For this purpose, we first introduce the notion of one-way similarity, which enables the comparison of partially expanded paths $p'(s \to n)$ to paths $p(s \to t)$ already in the tentative result set. Formally:
$$\overrightarrow{Sim}(p', p) = \frac{\sum_{(n_x, n_y) \in p' \cap p} w(n_x, n_y)}{\ell(p)} \qquad (2)$$
Compared to Eq. 1, we observe that Eq. 2 is asymmetric, i.e., $\overrightarrow{Sim}(p', p) \not\equiv \overrightarrow{Sim}(p, p')$. The following lemma follows naturally from the asymmetric nature of Eq. 2.
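Lemma 1 Let $p, p'$ be two paths and $p \cap p' = \{e_1, \ldots, e_m\}$ be the set of their shared edges. The following holds for the one-way similarity of $p$ to $p'$:
$$\overrightarrow{Sim}(p, p') = \sum_{i=1}^{m} \overrightarrow{Sim}(e_i, p'),$$
where $e_i$ is the subpath of $p$ containing only edge $e_i$.
Proof From Eq. 2 we have:
$$\overrightarrow{Sim}(p, p') = \frac{\sum_{i=1}^{m} w(e_i)}{\ell(p')} = \sum_{i=1}^{m} \frac{w(e_i)}{\ell(p')} = \sum_{i=1}^{m} \overrightarrow{Sim}(e_i, p'),$$
thus proving the lemma.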
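The additivity stated by Lemma 1 can be checked numerically with a small Python sketch. It assumes that the one-way similarity divides the shared edge weight by the length of the path given as second argument, in line with the remark that the denominator is the length of the already recommended path. Exact fractions are used to avoid rounding effects; the function name and the toy weights are hypothetical.

```python
from fractions import Fraction

# Hypothetical edge weights.
W = {
    ("s", "n3"): 3, ("n3", "n5"): 3, ("n5", "t"): 2, ("n5", "n4"): 1,
}

def one_way_sim(p_edges, q_edges, w):
    """One-way similarity of p to q: shared weight over the length of q."""
    shared = sum(w[e] for e in set(p_edges) & set(q_edges))
    return Fraction(shared, sum(w[e] for e in q_edges))

p_sp = [("s", "n3"), ("n3", "n5"), ("n5", "t")]        # length 8
partial = [("s", "n3"), ("n3", "n5"), ("n5", "n4")]    # partially expanded path

whole = one_way_sim(partial, p_sp, W)
per_edge = sum(one_way_sim([e], p_sp, W) for e in partial)

print(whole, per_edge)                   # 3/4 3/4
print(one_way_sim(p_sp, partial, W))     # 6/7, the measure is asymmetric
```

The equality of `whole` and `per_edge` is what allows a label-setting algorithm to update the similarity with one addition per newly appended edge.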
Lemma 1 unveils the monotonicity of the one-way similarity that enables the incremental computation of Eq. 1. Given two paths $p$ and $p'$, to compute $\overrightarrow{Sim}(p, p')$ it suffices to accumulate the individual similarities of the edges of the longer path. Formally: [...] Apart from enabling the incremental computation of the similarity measure, Lemma 1 also enables the early pruning of partially expanded paths that cannot lead to a solution. Let $p_{sub}$ be a subpath of $p$. If $p_{sub}$ shares some edges with some $p_i \in P_{LO}$, then $p$ contains all those edges as well. Hence, $\overrightarrow{Sim}(p_{sub}, p_i) \le \overrightarrow{Sim}(p, p_i)$, and if $\overrightarrow{Sim}(p_{sub}, p_i) > \theta$ for some $p_i \in P_{LO}$, then path $p$ is infeasible and can be safely discarded. This pruning criterion is formally captured by the following lemma:

Lemma 2 Let $P_{LO}$ be the tentative result of a kSPwLO($G, s, t, k, \theta$) query and $p \notin P_{LO}$ be a path from $s$ to $t$. If $p$ is an alternative path to $P_{LO}$, then $\overrightarrow{Sim}(p_{sub}, p_i) \le \theta$ holds for every subpath $p_{sub}$ of $p$ and for all paths $p_i \in P_{LO}$.
Proof The proof follows directly from Eq. 2. Let $p_i \in P_{LO}$ be some already recommended path. As for both $\overrightarrow{Sim}(p, p_i)$ and $\overrightarrow{Sim}(p_{sub}, p_i)$ the denominator is the same, i.e., $\ell(p_i)$, the numerator gets the greatest value when all edges of $p_i$ shared by $p$ or $p_{sub}$ are counted for the computation. As every edge that $p_{sub}$ shares with $p_i$ is also contained in $p$, we have $\overrightarrow{Sim}(p_{sub}, p_i) \le \overrightarrow{Sim}(p, p_i) \le \theta$.
As a result of Lemma 2, we observe that, if the one-way similarity of a path $p(s \to n)$ violates the threshold θ, then all of its extensions $p(s \to n) \bullet p(n \to t)$ to target node $t$ are infeasible as they violate the similarity constraint.
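The pruning criterion of Lemma 2 can be illustrated by scanning the prefixes of a path and reporting the first one that violates the threshold. The sketch below is an illustration only: the helper `first_infeasible_prefix` and the toy weights are hypothetical, and the one-way similarity is assumed to divide by the length of the already recommended path.

```python
# Hypothetical edge weights.
W = {
    ("s", "n3"): 3, ("n3", "n5"): 3, ("n5", "t"): 2,
    ("n5", "n4"): 1, ("n4", "t"): 2, ("s", "n2"): 4, ("n2", "n4"): 5,
}

def one_way_sim(p_edges, q_edges, w):
    """Shared edge weight divided by the length of the recommended path q."""
    shared = sum(w[e] for e in set(p_edges) & set(q_edges))
    return shared / sum(w[e] for e in q_edges)

def first_infeasible_prefix(path_edges, result, w, theta):
    """Number of edges of the shortest prefix violating theta, or None."""
    for i in range(1, len(path_edges) + 1):
        prefix = path_edges[:i]
        if any(one_way_sim(prefix, q, w) > theta for q in result):
            return i  # every extension of this prefix is infeasible
    return None

p_sp = [("s", "n3"), ("n3", "n5"), ("n5", "t")]
detour = [("s", "n3"), ("n3", "n5"), ("n5", "n4"), ("n4", "t")]
other = [("s", "n2"), ("n2", "n4"), ("n4", "t")]

print(first_infeasible_prefix(detour, [p_sp], W, 0.5))  # 2
print(first_infeasible_prefix(other, [p_sp], W, 0.5))   # None
```

In the first call the prefix of two edges already reaches a one-way similarity of 6/8, so the remaining edges never need to be expanded.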
Next, we present the OnePass algorithm that traverses the road network, expanding every path from source node $s$ that qualifies the pruning criterion of Lemma 2. Similar to all label-setting algorithms, OnePass maintains a set of labels $\Lambda(n)$, where each label $\langle n, p(s \to n) \rangle$ represents a path from $s$ to $n$. The paths are examined in increasing length order. By doing so, OnePass ensures that the shortest alternative path to each tentative kSPwLO result is computed. Let $P_{LO} = \{p_1, \ldots, p_k\}$ be the result of a kSPwLO query. Every path $p_{i+1}$ is the shortest alternative to each tentative result $P^i_{LO} = \{p_1, \ldots, p_i\}$. Since after the computation of each alternative path more paths are pruned but no new edges are added, every subsequent result path $p_{i+1}$ will be longer than every path in $P^i_{LO}$.
Algorithm 1 illustrates the pseudocode of OnePass. First, the shortest path $p_{sp}(s \to t)$ is retrieved and the result set $P_{LO}$ is initialized with $p_{sp}$ in Line 1. The algorithm uses a min priority queue $Q$ (initialized with label $\langle s, \emptyset \rangle$ in Line 2) to traverse the road network. Between Lines 5-16, OnePass examines the contents of $Q$ until either $P_{LO}$ contains $k$ paths or the queue is depleted. At each round, current label $\langle n, p_n \rangle$ is popped from $Q$ (Line 6). If $n$ is the target node $t$, then $p_n$ is recommended, i.e., added to $P_{LO}$ (Line 8). Next, between Lines 8-10, for each label $\langle n_q, p_q \rangle$ in $Q$, OnePass computes the similarity of $p_q$ to the newly recommended path $p_n$ and determines whether $p_q$ qualifies the pruning criterion of Lemma 2; in particular, if $\overrightarrow{Sim}(p_q, p_n) > \theta$ then $p_q$ can be safely discarded. If node $n$ is not the target $t$, the algorithm expands the current path $p_n$ considering all outgoing edges $(n, n_c)$ (Lines 13-16), provided that the new path $p_c \leftarrow p_n \bullet (n, n_c)$ qualifies the pruning criterion of Lemma 2 (Line 15). Finally, OnePass returns the result set $P_{LO}$ in Line 17.
Note that if Q is depleted before k paths are added to the result, then the result set P LO is incomplete and an exact solution does not exist. Figure 4 exemplifies OnePass for the kSPwLO (s, t, 3, 0.5) query. Initially, the shortest path p sp = (s, n 3 ),

ALGORITHM 1: OnePass
1 Init. result with $p_{sp}$
2 initialize min-priority queue $Q$ with $\langle s, \emptyset \rangle$;
[...]

Complexity analysis Since the pruning criterion of Lemma 2 does not give any guarantee as to how many paths are pruned, in the worst case OnePass has to enumerate all $(s \to t)$ paths. If $K$ is the number of such paths, OnePass runs in $O(\mathrm{poly}(K))$ time. $K$ is vastly superpolynomial, i.e., [...] for random graphs with density $d$ [28], which implies that OnePass is prohibitively expensive.
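The traversal described for OnePass can be sketched in Python as follows. This is a simplified sketch under stated assumptions, not a transcription of Algorithm 1: the shortest path is not computed up front but obtained as the first path that reaches the target, the priority queue is rebuilt after each recommendation, and the similarity test compares the shared weight against θ times the length of the recommended path. The adjacency map `ADJ` and all names are hypothetical.

```python
import heapq
from itertools import count

# Hypothetical toy network as adjacency map: ADJ[u][v] = weight of edge (u, v).
ADJ = {
    "s": {"n3": 3, "n2": 4},
    "n3": {"n5": 3, "n4": 5},
    "n5": {"t": 2, "n4": 1},
    "n4": {"t": 2},
    "n2": {"n4": 5},
}

def one_pass(adj, s, t, k, theta):
    """Expand paths from s in length order, pruning with the test of Lemma 2."""
    result = []                       # entries: (length, path, edge set)
    tie = count()                     # tie-breaker so paths are never compared
    queue = [(0, next(tie), (s,))]

    def feasible(path):
        edges = set(zip(path, path[1:]))
        for res_len, _, res_edges in result:
            shared = sum(adj[u][v] for u, v in edges & res_edges)
            if shared > theta * res_len:   # one-way similarity exceeds theta
                return False
        return True

    while queue and len(result) < k:
        length, _, path = heapq.heappop(queue)
        node = path[-1]
        if node == t:
            result.append((length, path, set(zip(path, path[1:]))))
            # discard queued paths that are too similar to the new result path
            queue = [label for label in queue if feasible(label[2])]
            heapq.heapify(queue)
            continue
        for nxt, w in adj.get(node, {}).items():
            if nxt in path:                # keep paths simple
                continue
            new_path = path + (nxt,)
            if feasible(new_path):
                heapq.heappush(queue, (length + w, next(tie), new_path))
    return [(length, list(path)) for length, path, _ in result]

print([length for length, _ in one_pass(ADJ, "s", "t", 3, 0.5)])  # [8, 10, 11]
print([length for length, _ in one_pass(ADJ, "s", "t", 5, 0.3)])  # [8, 11]
```

The second call returns only two paths: the queue is depleted before five sufficiently dissimilar paths are found, so the result is incomplete.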

The MULTIPASS algorithm
Despite employing the pruning criterion of Lemma 2, OnePass still has to expand and examine a large portion of all possible p(s→t) paths. To address this shortcoming, we introduce MultiPass, our second exact algorithm. In addition to the pruning criterion of Lemma 2, MultiPass employs a second pruning criterion that aims at reducing the search space by avoiding the expansion of non-promising paths.
Let $p_{sp}(s \to t)$ be the shortest path from a source node $s$ to a target $t$ as illustrated in Fig. 5. In addition, let $p_i(s \to n)$ and $p_j(s \to n)$ be two distinct paths from source $s$ to a node $n$ of the shortest path $p_{sp}$ such that $\ell(p_i) < \ell(p_j)$. Assuming that both $p_i, p_j$ are extended to reach $t$ following the same path $p(n \to t)$, any extension of $p_i$ will be shorter than the respective extension of $p_j$. Furthermore, let $\overrightarrow{Sim}(p_i, p_{sp}) \le \overrightarrow{Sim}(p_j, p_{sp})$, i.e., the similarity of $p_i$ with $p_{sp}$ is equal or lower than the similarity of $p_j$ with $p_{sp}$. Due to the monotonicity of the one-way similarity, any extension of $p_i$ to $t$ will have the same or less similarity with $p_{sp}$ compared to the respective extension of $p_j$. As a result, for any extension of $p_j$ there will always be a shorter extension of $p_i$ with less or equal similarity with $p_{sp}$; thus, $p_j$ can be pruned. The same idea can be utilized to prune the search space when computing the shortest alternative path to a set of paths $P$. This pruning criterion is formally captured by the following lemma:

Lemma 3 Let $P$ be a set of paths from a source node $s$ to a target node $t$, and $p_i, p_j$ be two paths from $s$ to some node $n$. If $\ell(p_i) < \ell(p_j)$ and $\forall p \in P : \overrightarrow{Sim}(p_i, p) \le \overrightarrow{Sim}(p_j, p)$ hold then path $p_j$ cannot be part of the shortest alternative path to $P$ and we write $p_i \prec_P p_j$.
Proof We prove the lemma by contradiction. Let $p'_j = \langle (s, *), \ldots, (*, n), \ldots, (*, t) \rangle$ be an extension of $p_j(s \to n)$ to target $t$ that is the shortest alternative path to $P$. Then, we show that an extension $p'_i = \langle (s, *), \ldots, (*, n), \ldots, (*, t) \rangle$ of $p_i(s \to n)$ to target $t$ is also an alternative path and it will be examined and recommended before $p'_j$.
According to the definition of an alternative path, $\overrightarrow{Sim}(p'_j, p) \le \theta$ holds $\forall p \in P$. As extension paths $p'_i$ and $p'_j$ share the same sequence of edges connecting $n$ to target $t$, we deduce that (a) $\overrightarrow{Sim}(p'_i, p) \le \theta$ holds $\forall p \in P$, i.e., $p'_i$ is alternative to $P$ and (b) $\ell(p'_i) < \ell(p'_j)$ which means that $p'_i$ will be examined before $p'_j$.
Lemma 3 can be utilized to compute the shortest alternative to a set of paths as follows. Let $P$ be the set of paths for which we want to compute the shortest alternative path, and $P_n$ the set of paths from $s$ to some node $n$ created during the expansion of all paths from $s$. If $P_n$ contains a path $p'(s \to n)$ such that (a) $p'$ is longer than any path $p_n \in P_n \setminus \{p'\}$, and (b) for every path $p \in P$ the similarity $\overrightarrow{Sim}(p', p)$ is higher than $\overrightarrow{Sim}(p_n, p)$ for all paths $p_n \in P_n \setminus \{p'\}$, then $p'$ can be pruned. Note that the addition of a path in $P_n$ may render Condition (B) of Definition 2 not applicable for another path already in $P_n$. To ensure that set $P_n$ always contains only paths that satisfy Conditions (A) and (B), we have to check whether both conditions still hold every time a new path is added to $P_n$.
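The dominance test of Lemma 3 and the maintenance of the non-dominated paths at a node can be sketched as follows. A label is represented here as a pair of the path length and a tuple holding its one-way similarity to each path in $P$; this representation, the function names `dominates` and `add_label`, and the numeric values are hypothetical choices for illustration.

```python
def dominates(a, b):
    """True if label a makes label b redundant at the same node.

    A label is (length, sims), where sims[x] is the one-way similarity
    to the x-th path of P. The length must be strictly smaller and no
    similarity may be larger.
    """
    len_a, sims_a = a
    len_b, sims_b = b
    return len_a < len_b and all(x <= y for x, y in zip(sims_a, sims_b))

def add_label(labels, new):
    """Add `new` to the labels of a node, keeping only non-dominated ones."""
    if any(dominates(old, new) for old in labels):
        return labels, False                  # new label is pruned
    kept = [old for old in labels if not dominates(new, old)]
    kept.append(new)
    return kept, True

labels = []
labels, _ = add_label(labels, (8, (0.375,)))
labels, _ = add_label(labels, (9, (0.0,)))     # longer but less similar: kept
labels, ok = add_label(labels, (10, (0.5,)))   # dominated by (8, (0.375,))
print(ok, labels)   # False [(8, (0.375,)), (9, (0.0,))]
labels, _ = add_label(labels, (7, (0.0,)))     # dominates both earlier labels
print(labels)       # [(7, (0.0,))]
```

Re-checking the existing labels whenever a new one is added is what keeps the set free of dominated paths.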
We now present MultiPass, an algorithm that employs the pruning criteria of both Lemma 2 and Lemma 3. For each node $n$ of the road network, MultiPass maintains a set of labels $\Lambda(n)$. Each label represents a path from $s$ to $n$ and is of the form $\langle n, p(s \to n) \rangle$. The algorithm examines paths from $s$ in increasing order of their length and expands every path $p(s \to n)$ from $s$ to a node $n$ that satisfies the conditions set by Lemma 2 and Lemma 3. Similar to OnePass, by examining paths from $s$ in length order, MultiPass ensures that the shortest alternative path to each tentative kSPwLO result is computed.
Algorithm 2 illustrates the pseudocode of MultiPass. First, the $P_{LO}$ result set is initialized to the shortest path in Line 1. Before each round, a min priority queue $Q$ is initialized to $\langle s, \emptyset \rangle$ (Line 3) and each node $n$ is associated with an (initially empty) set of labels $\Lambda(n)$ (Lines 4-5). At each round in Lines 5-20, MultiPass pops label $\langle n, p_n \rangle$ for current path $p_n$ in Line 7. If $n$ is the target $t$, then $p_n$ is added to $P_{LO}$ and the round terminates (Lines 8-10). Otherwise, MultiPass expands $p_n$ considering all outgoing edges $(n, n_c)$ (Lines 11-17). Each new path $p_c \leftarrow p_n \bullet (n, n_c)$ (Line 13) is evaluated against the pruning criteria of Lemma 2 (Lines 14-15) and Lemma 3 (Lines 16-17). If $p_c$ qualifies both pruning criteria, MultiPass removes from $Q$ and $\Lambda(n_c)$ every label representing a path $p'_n$ such that $p_c \prec_{P_{LO}} p'_n$ (Line 19). The new label is added to $Q$ (Line 20) and $\Lambda(n_c)$ (Line 21), and the next label is popped from $Q$. The loop terminates when either $k$ paths are added to $P_{LO}$ or $Q$ is empty. Finally, MultiPass returns the result set $P_{LO}$ in Line 22. Note that, similar to OnePass, if the loop terminates before $k$ paths are found, then the result set $P_{LO}$ is incomplete; an exact solution does not exist.

Example 4 Figure 6 demonstrates MultiPass for the kSPwLO($G, s, t, 3, 0.5$) query. Initially, the shortest path $p_{sp}(s \to t) = \langle (s, n_3), (n_3, n_5), (n_5, t) \rangle$ is computed and added to $P_{LO}$. The first path examined by MultiPass is $p_1$. The similarity $\overrightarrow{Sim}(p_1, p_{sp}) = 3/8 = 0.375$ is below the similarity threshold θ = 0.5; hence, $p_1$ is not pruned. The same holds for $p_2$, which is the next path examined by MultiPass. Subsequently, MultiPass examines paths $p_3$, $p_4$ and $p_5$. Path $p_3$ is not pruned as $\overrightarrow{Sim}(p_3, p_{sp}) = 3/8 = 0.375$ does not exceed the similarity threshold. For $p_4$ the similarity $\overrightarrow{Sim}(p_4, p_{sp}) = 0.375$ also does not exceed the similarity threshold. Since node $n_1$ has already been visited by $p_3$ though, we also check Lemma 3, and we have [...]. Therefore, Lemma 3 cannot be applied and $p_4$ is not pruned. On the contrary, for $p_5$ the similarity $\overrightarrow{Sim}(p_5, p_{sp}) = 6/8 = 0.75$ exceeds the similarity threshold and so, $p_5$ is pruned by Lemma 2. MultiPass continues the execution of the current round in the same fashion until the alternative path $p_{14}$ with $\ell(p_{14}) = 10$ is found and subsequently added to $P_{LO}$. Next, MultiPass performs the second round in the same fashion, computes the alternative path $p_{13}$ with $\ell(p_{13}) = 11$ and completes the result set $P_{LO}$.

ALGORITHM 2: MultiPass
1 Init. result with $p_{sp}$;
2 while $|P_{LO}| < k$ and last round updated $P_{LO}$ do
3   initialize min-priority queue $Q$ with $\langle s, \emptyset \rangle$;

Complexity analysis With regard to the complexity of MultiPass, we state the following theorem:
Theorem 2 MultiPass is optimal for the kSPwLO problem.
Proof To determine the complexity of MultiPass, we assume without loss of generality that the edge weights $\ell(u, v)$ are natural numbers. For each node $n_c$ and each iteration $j$ of the main while-loop (Line 2), we define $c_j(n_c)$ as the number of non-dominated labels $\langle n_c, p_c \rangle$ such that $p_c$ respects the similarity constraints for all previously computed paths $p_i \in P_{LO}$. Note that, because of the pruning in Lines 14, 16, and 19, we have $|\Lambda(n_c)| \le c_j(n_c)$ throughout iteration $j$. Furthermore, we know that at most $\sum_{n_c \in N} c_j(n_c)$ labels are added to $Q$ (Line 20). Hence, MultiPass enters the inner while-loop (Line 6) at most $\sum_{n_c \in N} c_j(n_c)$ times.
For upper-bounding $c_j(n_c)$, we observe that, for each previously computed path $p_i \in P_{LO}$, a path $p_c$ that respects the similarity constraint satisfies $\overrightarrow{\mathit{Sim}}(p_c, p_i) \le \theta$. Since the weights are natural numbers, we hence know that $\overrightarrow{\mathit{Sim}}(p_c, p_i)$ can assume at most $\theta \cdot \ell(p_i) + 1$ different values. Now let $C(n_c)$ be a collection of $(s \to n_c)$ paths that respect the similarity constraints for all previously computed paths $p_i \in P_{LO}$ [...] Due to the aforementioned considerations, we know that in the $j$-th iteration of the main while-loop MultiPass enters the inner while-loop starting in Line 6 at most $O(|N| \cdot (\theta \cdot L)^j)$ times. Moreover, if the priority queue $Q$ is implemented as a Fibonacci heap, each iteration of the inner loop runs in $O(|N| \cdot (\theta \cdot L)^j)$ time. By summing over the iterations of the main loop, we obtain the overall runtime complexity of MultiPass, which is pseudo-polynomial; that is, MultiPass is a pseudo-polynomial algorithm. Following on Theorem 1, as answering kSPwLO queries is weakly NP-hard even for constant $k$, MultiPass is optimal in terms of complexity. In other words, unless P = NP, there are no substantially faster algorithms for answering kSPwLO queries.
Note that, in contrast to OnePass, MultiPass traverses the road network multiple times, i.e., $k - 1$ times in total. As the pruning criterion of Lemma 3 can be utilized to compute only a single alternative path to a set of paths, there is no guarantee that a path pruned at a given round of MultiPass using the pruning criterion of Lemma 3 will not lead to a result during a subsequent round. Consider again the example in Fig. 5. Let $p_{sp}$ be the only path in the tentative set $P_{LO}$ of alternative paths. If, during the search for the alternative path $p_1$ to $P_{LO}$, $p_j$ is pruned because $p_i \prec_{P_{LO}} p_j$ holds, $p_j$ cannot be part of the shortest alternative to $P_{LO}$. However, there is no guarantee that $p_j$ will not be part of the shortest alternative to both $p_{sp}$ and $p_1$. If $p_i$ is a subpath of $p_1$, then during the search for the alternative path to $P_{LO} = \{p_{sp}, p_1\}$, $p_i$ may be pruned much earlier by Lemma 2. Consequently, MultiPass needs to restart the traversal to ensure the correctness of the result and may potentially re-examine paths already examined in previous rounds. However, in contrast to the runtime complexity of OnePass, the runtime complexity of MultiPass does not depend on the exponentially large number $K$ of $(s \to t)$ paths. Hence, despite traversing the network $k - 1$ times instead of once, MultiPass is expected to be much faster than OnePass.
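The round-based search described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation evaluated in the paper: the graph is a hypothetical dictionary `graph[u][v] = weight`, the similarity of a (partial) path to a result path is assumed to be the shared edge length divided by the length of the result path, Lemma 2 is assumed to prune any path whose similarity exceeds $\theta$, and Lemma 3 is assumed to prune a path to a node when another path to the same node is no longer and no more similar to every result path. The names `next_alternative` and `multipass` are made up for this example.

```python
import heapq
from itertools import count


def next_alternative(graph, s, t, result, theta):
    """One round: shortest s-t path whose overlap with every path in
    `result` stays within theta, or None if no such path exists."""
    ref_edges = [set(zip(r, r[1:])) for r in result]
    ref_len = [sum(graph[u][v] for u, v in edges) for edges in ref_edges]
    tie = count()                    # tie-breaker, so the heap never compares paths
    start = (0,) * len(result)       # shared length with each result path
    labels = {s: [(0, start)]}       # node -> non-dominated (length, shared) labels
    queue = [(0, next(tie), (s,), start)]
    while queue:
        length, _, path, shared = heapq.heappop(queue)
        n = path[-1]
        if (length, shared) not in labels[n]:
            continue                 # label was removed as dominated (lazy deletion)
        if n == t:
            return list(path)
        for nc, w in graph.get(n, {}).items():
            if nc in path:
                continue             # keep paths simple
            new_len = length + w
            new_shared = tuple(
                sh + (w if (n, nc) in edges else 0)
                for sh, edges in zip(shared, ref_edges)
            )
            # Assumed Lemma 2 test: similarity to some result path exceeds theta.
            if any(sh > theta * rl for sh, rl in zip(new_shared, ref_len)):
                continue
            at_nc = labels.setdefault(nc, [])
            # Assumed Lemma 3 test: an existing label dominates the new one.
            if any(l <= new_len and all(a <= b for a, b in zip(o, new_shared))
                   for l, o in at_nc):
                continue
            # Drop existing labels that the new label dominates.
            at_nc[:] = [(l, o) for l, o in at_nc
                        if not (new_len <= l
                                and all(a <= b for a, b in zip(new_shared, o)))]
            at_nc.append((new_len, new_shared))
            heapq.heappush(queue, (new_len, next(tie), path + (nc,), new_shared))
    return None


def multipass(graph, s, t, k, theta):
    """Restart the traversal once per result path, as MultiPass does."""
    result = []
    while len(result) < k:
        path = next_alternative(graph, s, t, result, theta)
        if path is None:
            break                    # incomplete result: fewer than k paths exist
        result.append(path)
    return result


# Hypothetical toy network (not the network of the paper's figures).
graph = {"s": {"a": 2, "b": 3}, "a": {"t": 2, "b": 2}, "b": {"t": 3}, "t": {}}
print(multipass(graph, "s", "t", 3, 0.5))
```

With an empty result set, the first round degenerates to a plain shortest-path search, which is why the sketch does not need a separate Dijkstra routine. Overlaps are tracked as shared lengths rather than ratios so that the dominance test compares integers when the weights are integers.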

Performance-oriented heuristic algorithms
Despite employing the pruning criteria of Lemma 2 and Lemma 3, the exact algorithms still examine a large number of paths, which renders them impractical for large road networks. In view of this, we investigate three heuristic algorithms to accelerate the computation of kSPwLO queries. Intuitively, the algorithms treat Condition (B) in Definition 2 as a soft constraint, i.e., the alternative paths are sufficiently dissimilar to each other, but not necessarily as short as possible.

The OnePass+ algorithm
Our first heuristic algorithm, denoted by OnePass+, provides a first-cut solution for computing kSPwLO queries. Given a source node $s$ and a target node $t$, OnePass+ traverses the road network expanding every path $p(s \to n)$ from the source to a node $n$ that passes the pruning criteria of both Lemma 2 and Lemma 3. This procedure is the same as one round of MultiPass. In contrast to MultiPass though, each time a new path is added to the result set $P_{LO}$, OnePass+ does not restart the traversal, but continues in a similar fashion to OnePass, thus traversing the network only once. Recall from our discussion of MultiPass, though, that a path which is pruned as non-promising during the current round may be promising during the next round. As such, OnePass+ cannot guarantee that the exact solution is found. However, as this case applies to only a small subset of the examined paths, the result of OnePass+ is expected to be close to the optimal solution in terms of length, a fact which is supported by our experiments in Sect. 9.
Algorithm 3 illustrates the pseudocode of OnePass+. Between Lines 5 and 22, OnePass+ examines the contents of $Q$ until either $k$ paths are added to $P_{LO}$ or $Q$ is depleted. At each iteration, a label $\langle n, p_n \rangle$ is popped from $Q$ (Line 6). If node $n$ is target $t$ (Line 7), then $p_n$ is added to $P_{LO}$ (Line 8) and the same update procedure as in OnePass takes place (Lines 9-11), i.e., all paths $p_h$ with $\overrightarrow{\mathit{Sim}}(p_h, p_c) > \theta$ are discarded. Otherwise, the algorithm expands the current path $p_n$ considering all outgoing edges $(n, n_c)$ (Lines 13-22). OnePass+ checks whether the new path $p_c \leftarrow p_n \circ (n, n_c)$ passes the pruning criteria of both Lemma 2 (Lines 15-16) and Lemma 3 (Lines 17-18) and updates $Q$ and $\Lambda(n_c)$ accordingly. Then, OnePass+ adds a new label for $p_c$ to $Q$ (Line 21) and $\Lambda(n_c)$ (Line 22) and proceeds with popping the next label from $Q$. Finally, the result set $P_{LO}$ is returned in Line 23.
Complexity analysis OnePass+ and MultiPass use the same pruning criteria. Hence, following the complexity analysis of MultiPass in Sect. 5.2, we obtain that OnePass+ enters the main while-loop at most $O(|N| \cdot (\theta \cdot L)^k)$ times and that each of its iterations runs in $O(|N| \cdot (\theta \cdot L)^k)$ time. Therefore, the runtime complexity of OnePass+ is $O(|N|^2 \cdot (\theta \cdot L)^{2k})$. As OnePass+ traverses the network only once, the overall runtime complexity of OnePass+ corresponds to the complexity of one traversal carried out by MultiPass.

The SVP+ algorithm
Our second heuristic algorithm, denoted by SVP+, recommends alternative paths by employing the concept of single-via paths [2]. Given a road network $G = (N, E)$, a source node $s$ and a target node $t$, the single-via path of every node $n \in N$ is the concatenation of the shortest paths $p_{sp}(s \to n)$ and $p_{sp}(n \to t)$. SVP+ aims at finding a set of $k$ single-via paths such that: (a) the shortest single-via path, i.e., the shortest path $p_{sp}(s \to t)$, is always recommended, (b) every single-via path is dissimilar to its predecessors with respect to a similarity threshold $\theta$, and (c) all $k$ single-via paths are as short as possible. The main idea behind SVP+ is similar to the baseline method for computing kSPwLO queries discussed in Sect. 4.2. However, instead of iterating over all possible $(s \to t)$ paths, SVP+ iterates over the much smaller set of single-via paths.
Algorithm 4 illustrates the pseudocode of SVP+. In Line 1, the result set $P_{LO}$ is initialized to the shortest path $p_{sp}(s \to t)$. Then, two shortest path trees are computed, one from $s$ to every node $n$ of $G$ (Line 2) and a reverse one from every node $n$ of $G$ to $t$ (Line 3). During this step, all distances $d(s, n)$ and $d(n, t)$ are computed. The algorithm organizes the nodes of the road network according to the length of their single-via path, i.e., $\ell(p_{svp}(n)) = d(s, n) + d(n, t)$, inside min-priority queue $Q$ (Lines 4-6). At each iteration between Lines 7 and 11, SVP+ pops from the queue the top element representing a node $n$ (Line 8) and retrieves the single-via path $p_n$ for node $n$ (Line 9). The single-via paths are examined in increasing order of their length. In Line 10, SVP+ checks whether $p_n$ is simple (contains no cycles) and sufficiently dissimilar to all paths currently in $P_{LO}$; if so, $p_n$ is added to $P_{LO}$ (Line 11). The algorithm terminates when either $k$ paths have been added to $P_{LO}$ or there exist no more single-via paths to examine, i.e., queue $Q$ is depleted, in which case the result set $P_{LO}$ contains fewer than $k$ paths. Finally, the result set $P_{LO}$ is returned in Line 12.
Example 5 Figure 7 exemplifies SVP+ for the query $k\text{SPwLO}(G, s, t, 3, 0.5)$. First, SVP+ adds the shortest path $p_{sp}(s \to t) = \langle (s, n_3), (n_3, n_5), (n_5, t) \rangle$ to the result set $P_{LO}$. Then, SVP+ iterates over the set of single-via paths in length order. The table in Fig. 7 shows the entire set of single-via paths for the example road network. The first single-via paths examined are $p_{svp}(n_3)$ and $p_{svp}(n_5)$. Both paths are rejected as their similarity to $p_{sp}$ exceeds the similarity threshold $\theta$. Single-via path $p_{svp}(n_4)$ is also rejected as $\mathit{Sim}(p_{svp}(n_4), p_{sp}) = 6/8 = 0.75$ exceeds the similarity threshold. Next, SVP+ examines $p_{svp}(n_2)$ and its similarity to the shortest path [...]

ALGORITHM 4: SVP+
1 Init. result with $p_{sp}$;
2 $T_{s \to N}$ ← shortest path tree from $s$ to all $n \in N$;
3 $T_{N \to t}$ ← shortest path tree from all $n \in N$ to $t$;
4 initialize min-priority queue $Q$ with ∅;
5 foreach $n \in N$ do
6   $Q$.push($\langle n, d(s, n) + d(n, t) \rangle$);
7 while $|P_{LO}| < k$ and $Q$ not empty do
8   $n$ ← $Q$.pop();
9   $p_n$ ← single-via path of $n$, retrieved from $T_{s \to N}$ and $T_{N \to t}$;
10   if $p_n$ is simple and sufficiently dissimilar to all paths in $P_{LO}$ then
11     add $p_n$ to $P_{LO}$;
12 return $P_{LO}$;

Notice that in Example 5, SVP+ fails to find the exact result for the given kSPwLO query. In particular, path $p = \langle (s, n_3), (n_3, n_4), (n_4, t) \rangle$, which is in the exact kSPwLO result, is not a single-via path; hence, $p$ is not examined by SVP+. At a more general level, this example shows that SVP+ is unable to compute the exact solution to a kSPwLO query if a path $p$ is part of the exact result, but is not a single-via path.
Complexity analysis To build the set of single-via paths, SVP+ needs to run Dijkstra's algorithm twice, which requires $O(|E| + |N| \cdot \log|N|)$ time. Since, by definition, there is one single-via path for each node $n \in N \setminus \{s, t\}$, the number of paths that have to be examined by SVP+ is in the worst case $O(|N|)$. Examining whether a given single-via path should be added to the $P_{LO}$ set or not requires $O(k)$ time. Therefore, the overall runtime complexity of SVP+ is $O(|N| \cdot k + |E| + |N| \cdot \log|N|)$.
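The steps of SVP+ can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the graph is a hypothetical dictionary `graph[u][v] = weight`, the similarity of a path to a result path is assumed to be the shared edge length divided by the length of the result path, and sorting the nodes by $d(s, n) + d(n, t)$ stands in for the min-priority queue $Q$. All function names are made up for this example.

```python
import heapq


def dijkstra(graph, source):
    """Shortest path tree from `source`: (distance, parent) dictionaries."""
    dist, parent, queue = {source: 0}, {source: None}, [(0, source)]
    while queue:
        d, n = heapq.heappop(queue)
        if d > dist[n]:
            continue                          # stale queue entry
        for m, w in graph.get(n, {}).items():
            if d + w < dist.get(m, float("inf")):
                dist[m], parent[m] = d + w, n
                heapq.heappush(queue, (d + w, m))
    return dist, parent


def reverse(graph):
    """Graph with every edge flipped, used for the tree towards t."""
    rev = {}
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            rev.setdefault(v, {})[u] = w
    return rev


def walk(parent, n):
    """Nodes from n up to the root of a shortest path tree."""
    nodes = []
    while n is not None:
        nodes.append(n)
        n = parent[n]
    return nodes


def similarity(graph, path, ref):
    """Assumed measure: shared edge length divided by the length of `ref`."""
    ref_edges = set(zip(ref, ref[1:]))
    shared = sum(graph[u][v] for u, v in zip(path, path[1:]) if (u, v) in ref_edges)
    return shared / sum(graph[u][v] for u, v in ref_edges)


def svp_plus(graph, s, t, k, theta):
    d_s, par_s = dijkstra(graph, s)           # tree from s to all nodes
    d_t, par_t = dijkstra(reverse(graph), t)  # tree from all nodes to t
    via_nodes = sorted((n for n in d_s if n in d_t), key=lambda n: d_s[n] + d_t[n])
    result = []
    for n in via_nodes:
        if len(result) == k:
            break
        path = walk(par_s, n)[::-1] + walk(par_t, n)[1:]   # s ... n ... t
        if len(set(path)) < len(path) or path in result:
            continue                          # not simple, or already reported
        if all(similarity(graph, path, r) <= theta for r in result):
            result.append(path)
    return result


# Hypothetical toy network (not the network of Fig. 7).
graph = {"s": {"a": 2, "b": 3}, "a": {"t": 2, "b": 2}, "b": {"t": 3}, "t": {}}
print(svp_plus(graph, "s", "t", 3, 0.5))
```

On this toy network the path through both `a` and `b` is not a single-via path, so the sketch never examines it and returns only two paths, which mirrors the limitation discussed for Example 5.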

The ESX algorithm
Our third heuristic algorithm, denoted by ESX, computes kSPwLO queries by executing shortest path searches while progressively excluding edges from the road network. We identify two important factors that affect the processing of a kSPwLO query with ESX: (1) the order in which edges are removed from the road network and (2) the maintenance of the connectivity of the network. We investigate the former in detail in Sect. 6.4. Regarding the latter, removing an edge from the road network may cause the network to become disconnected. This prevents any subsequent iteration from finding a valid path. To avoid such cases, if the shortest path search fails to find a path from $s$ to $t$ after the removal of an edge $e$ from the road network, ESX re-inserts $e$ in the network and marks it as non-removable. Non-removable edges are never removed from the road network regardless of their priority.
Algorithm 5 illustrates the pseudocode of ESX. The algorithm maintains a heap $H_i$ for each path $p_i(s \to t)$ added to the result set $P_{LO}$. The heap organizes the edges $e_j$ contained in $p_i$ according to their priority $prio(e_j)$; $H_i$ can be either a min-heap or a max-heap depending on the strategy by which the edges are prioritized (see Sect. 6.4). Initially, the shortest path $p_{sp}(s \to t)$ is added to the result and the associated heap $H_{sp}$ is initialized with the edges of $p_{sp}(s \to t)$ (Lines 1-2). The algorithm keeps track of the non-removable edges inside set $E_{DNR}$ (initialized to an empty set in Line 3).
In Lines 4-19, ESX iterates over the already computed paths and their heaps to determine the next alternative path, until either the result set $P_{LO}$ contains exactly $k$ paths or there are no more edges to be removed from the road network. Specifically, ESX first accesses the most recently recommended path, denoted by $p_c$ in Line 5, and then executes the loop in Lines 6-16. At each iteration of this loop, the already computed path $p_i \in P_{LO}$ with the highest similarity to $p_c$ is chosen, as long as $p_i$ contains edges that can be removed from the road network. ESX then considers edge $e$ of path $p_i$, i.e., the top of heap $H_i$ (Line 7). If $e$ is not marked as non-removable, then the algorithm removes the edge from the road network in Line 10 and computes the new shortest path $p_c$ in Line 11. If $p_c$ is null, then the removal of $e$ has rendered the network disconnected. Consequently, $e$ is re-inserted to the road network (Line 13) and is inserted to $E_{DNR}$ (Line 14), and ESX proceeds to the next round. Otherwise, ESX checks whether $p_c$ is an alternative to $P_{LO}$ (Line 16), the result set is updated accordingly in Line 17 and a new heap $H_c$ associated with $p_c$ is initialized with the edges of $p_c$ in Line 18. This process is repeated until either $k$ paths have been added to $P_{LO}$ or there are no more edges that can be removed, in which case the result set $P_{LO}$ contains fewer than $k$ paths. Finally, the result set $P_{LO}$ is returned in Line 19.
Example 6 Figure 8 exemplifies ESX for the query $k\text{SPwLO}(G, s, t, 3, 0.5)$. To determine the priority of an edge, we consider its stretch, shown in the upper table in Fig. 8. Without loss of generality, assume that ESX removes the edge with the smallest stretch first; hence, every $H_i$ is a min-heap. Initially, the shortest path from $s$ to $t$, i.e., $p_{sp}(s \to t) = \langle (s, n_3), (n_3, n_5), (n_5, t) \rangle$, is computed and added to the result set $P_{LO}$. Based on the edge priorities, ESX first removes the $(n_5, t)$ edge of $p_{sp}$ and computes the shortest path on the updated network, $p_2 = \langle (s, n_3), (n_3, n_5), (n_5, n_4), (n_4, t) \rangle$ with $\ell(p_2) = 9$. Path $p_2$ is not an alternative to $P_{LO}$ as $\mathit{Sim}(p_2, p_{sp}) = 0.75 > \theta$. Hence, ESX proceeds by removing edge $(n_3, n_5) \in p_{sp}$ and computing the new shortest path $p_3 = \langle (s, n_3), (n_3, n_4), (n_4, t) \rangle$ with $\ell(p_3) = 10$. We now have $\mathit{Sim}(p_3, p_{sp}) < \theta$ and hence, $p_3$ is added to the result set $P_{LO}$. Subsequently, ESX updates the edge priorities table by computing the priorities of $(n_3, n_4)$ and $(n_4, t)$ and proceeds to the next round. As the current path is $p_3$ (the last path added to the result set) and $e_1$ has the highest stretch, either $(n_3, n_4)$ or $(n_4, t)$ is removed.
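The edge-exclusion loop can be illustrated with the following Python sketch. It is a simplified reading of the description above under stated assumptions, not Algorithm 5 itself: the graph is a hypothetical dictionary `graph[u][v] = weight`, similarity is assumed to be the shared edge length divided by the length of the result path, every heap is a min-heap over a caller-supplied `priority(u, v)` function (negating the priority would give a max-heap variant), and the names `esx`, `shortest_path` and `similarity` are made up for this example.

```python
import heapq


def shortest_path(graph, s, t):
    """Dijkstra; returns the node list of a shortest s-t path, or None."""
    dist, parent, queue = {s: 0}, {s: None}, [(0, s)]
    while queue:
        d, n = heapq.heappop(queue)
        if n == t:
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
        if d > dist[n]:
            continue                      # stale queue entry
        for m, w in graph.get(n, {}).items():
            if d + w < dist.get(m, float("inf")):
                dist[m], parent[m] = d + w, n
                heapq.heappush(queue, (d + w, m))
    return None


def similarity(graph, path, ref):
    """Assumed measure: shared edge length divided by the length of `ref`."""
    ref_edges = set(zip(ref, ref[1:]))
    shared = sum(graph[u][v] for u, v in zip(path, path[1:]) if (u, v) in ref_edges)
    return shared / sum(graph[u][v] for u, v in ref_edges)


def esx(graph, s, t, k, theta, priority):
    work = {u: dict(nbrs) for u, nbrs in graph.items()}   # network edges are removed from

    def edge_heap(path):
        heap = [(priority(u, v), (u, v)) for u, v in zip(path, path[1:])]
        heapq.heapify(heap)
        return heap

    current = shortest_path(work, s, t)
    if current is None:
        return []
    result, heaps, non_removable = [current], [edge_heap(current)], set()
    while len(result) < k:
        candidates = [i for i, heap in enumerate(heaps) if heap]
        if not candidates:
            break                                         # no edges left to remove
        # Result path most similar to the current path that still has edges.
        i = max(candidates, key=lambda j: similarity(graph, current, result[j]))
        _, (u, v) = heapq.heappop(heaps[i])
        if (u, v) in non_removable or v not in work[u]:
            continue
        w = work[u].pop(v)                                # remove the edge
        new_path = shortest_path(work, s, t)
        if new_path is None:                              # network got disconnected
            work[u][v] = w                                # re-insert the edge
            non_removable.add((u, v))
            continue
        current = new_path
        if all(similarity(graph, new_path, r) <= theta for r in result):
            result.append(new_path)
            heaps.append(edge_heap(new_path))
    return result


# Hypothetical toy network; edges are prioritized by weight, smallest first.
graph = {"s": {"a": 2, "b": 3}, "a": {"t": 2, "b": 2}, "b": {"t": 3}, "t": {}}
print(esx(graph, "s", "t", 3, 0.5, lambda u, v: graph[u][v]))
```

Similarities are always measured on the original `graph`, since the working copy may no longer contain the edges of earlier result paths.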

Optimizing ESX
The order in which edges are removed from the road network affects both the result quality and the performance of ESX. Since determining the optimal order is prohibitively expensive, in what follows we describe three strategies to determine which edge to remove at each iteration. Depending on the strategy and the nature of the heap (min or max) that is used to organize the edges of a path, we discuss six variants of ESX.
Smallest/largest edge weight The first strategy uses the edge weight to select which edge to remove at each iteration. We prioritize either edges with small weight (MinW variant) or edges with large weight (MaxW variant). Removing first edges with a large weight causes the next path to be less similar to the already computed paths, thereby enabling the algorithm to terminate sooner. However, there is a higher chance that ESX will miss alternative paths that have small differences in length with the paths already found. Hence, on average the result set is expected to contain longer paths. On the contrary, prioritizing edges with a small weight decreases the chances of missing such paths, but leads to more iterations of the algorithm, thereby increasing the overall runtime of ESX.
Minimum/maximum stretch Our second strategy is to remove edges based on their stretch. Given an edge $e = (n_i, n_j)$, let $p$ be the shortest path from $n_i$ to $n_j$ computed by excluding $e$ from the network. The stretch of an edge $e$ is the difference between the length of $p$ and the length of $e$, i.e., $stretch(e) = |\ell(p) - \ell(e)|$. Similar to MinW/MaxW, the removal of edges with a high stretch (MaxS variant) is more likely to cause a detour, leading to paths that are less similar to the paths already in the result set and, hence, allowing ESX to find a result sooner. In contrast, prioritizing edges with a small stretch (MinS variant) leads to the examination of more paths. This increases the overall runtime, but at the same time, it also increases the chances that on average the result set will contain shorter paths.
Least/most local shortest paths Our third strategy is inspired by edge betweenness. Given an edge $e(a, b)$ on some path $p \in P_{LO}$, let $E_{inc}(a)$ and $E_{out}(b)$ be the set of all incoming edges $e(n_i, a)$ to $a$ from some nodes $n_i \in N \setminus \{b\}$ and the set of all outgoing edges $e(b, n_j)$ from $b$ to some nodes $n_j \in N \setminus \{a\}$, respectively. First, ESX computes the set $P_s$ which contains the shortest paths $p_{sp}(n_i, n_j)$ such that $e(n_i, a) \in E_{inc}(a)$ and $e(b, n_j) \in E_{out}(b)$. Then, ESX defines the set $P'_s \subseteq P_s$ of all paths $p_{sp}(n_i, n_j)$ that cross $e$. Finally, ESX assigns a priority to $e$, denoted by $prio(e)$, which is set to $|P'_s|$. Similar to the previous variants, the intuition behind MaxP is that the more (shortest) paths cross an edge, the higher the chances that removing this edge causes a detour; thus, ESX will find an alternative path sooner. By employing MinP, ESX examines more paths and increases the chance of computing alternative paths that are shorter on average, at the cost of an increased overall runtime.
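As an illustration of the stretch-based strategy, the following Python sketch computes the stretch of a single edge by running a shortest path search that skips that edge. The dictionary-based graph and the function names are assumptions made for this example only; an edge without any detour is given an infinite stretch here, which is a choice of the sketch and not something the text prescribes.

```python
import heapq


def shortest_distance(graph, source, target, skip_edge=None):
    """Dijkstra distance from source to target, ignoring `skip_edge`."""
    dist, queue = {source: 0}, [(0, source)]
    while queue:
        d, n = heapq.heappop(queue)
        if n == target:
            return d
        if d > dist[n]:
            continue                      # stale queue entry
        for m, w in graph.get(n, {}).items():
            if (n, m) == skip_edge:
                continue                  # the edge is excluded from the network
            if d + w < dist.get(m, float("inf")):
                dist[m] = d + w
                heapq.heappush(queue, (d + w, m))
    return float("inf")                   # no detour exists


def stretch(graph, u, v):
    """Length of the shortest detour around (u, v) minus the weight of (u, v)."""
    detour = shortest_distance(graph, u, v, skip_edge=(u, v))
    return abs(detour - graph[u][v])


# Hypothetical toy network.
graph = {"s": {"a": 2, "b": 3}, "a": {"t": 2, "b": 2}, "b": {"t": 3}, "t": {}}
print(stretch(graph, "s", "b"), stretch(graph, "a", "t"))
```

A MinS-style variant would push `stretch(graph, u, v)` as the heap key of each edge, while a MaxS-style variant would push its negation.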

Completeness-oriented heuristic algorithms
Up to this point, we have discussed how to efficiently compute the exact result to kSPwLO queries as well as approximations, where the paths in the result set are not necessarily as short as possible. Since guaranteeing the dissimilarity threshold of the returned paths may lead to incomplete results (cf. Sect. 4.3), we next investigate the approximate computation of kSPwLO queries while treating Condition (A) in Definition 2 as a soft constraint in order to ensure that the result set contains exactly $k$ paths.

Relaxation of θ
A naive solution to deal with an incomplete result set $P_{LO}$ is to execute multiple kSPwLO queries, each time increasing the original similarity threshold $\theta$ manually. Such a trial-and-error approach is impractical unless there is a hint on how much to increase $\theta$. Furthermore, executing multiple queries means that all previously computed intermediate results, e.g., rejected paths, are disregarded. To this end, we aim for a solution that automatically determines the smallest increase of $\theta$ that yields a complete result and computes the complete result without running another query from scratch.
Let $k\text{SPwLO}(G, s, t, k, \theta)$ be a query whose result set $P_{LO}$ is incomplete, i.e., $|P_{LO}| < k$. In addition, let $P_{cand}$ be a set of paths from $s$ to $t$ with $P_{LO} \subseteq P_{cand}$. We will elaborate on the nature of $P_{cand}$ and how it is computed in Sect. 7.2. Without loss of generality, assume for now that $P_{cand}$ contains all possible $p(s \to t)$ paths in road network $G$. By the definition of the kSPwLO problem, for every path $p \in P_{cand} \setminus P_{LO}$, there exists a shorter path $p' \in P_{LO}$ such that $\mathit{Sim}(p, p') > \theta$. Based on this, we define the maximum similarity of a path $p \in P_{cand}$ to result set $P_{LO}$ as:
$$\mathit{Sim}_{max}(p, P_{LO}) = \max_{p' \in P_{LO}} \mathit{Sim}(p, p') \tag{4}$$
Consequently, path $p \in P_{cand}$ cannot be part of the result set $P_{LO}$, as long as either $p'$ is part of the result, or the similarity threshold is greater than $\theta$ and lower than $\mathit{Sim}_{max}(p, P_{LO})$. In fact, even if we use a new similarity threshold equal to or higher than $\mathit{Sim}_{max}(p, P_{LO})$, we cannot guarantee that $p$ will be included in the new result set $P_{LO}$. That is because the new threshold may cause a shorter path $p''$ with $\ell(p'') \le \ell(p)$ to join $P_{LO}$ that keeps $p$ out. In other words, with this new threshold, we can only guarantee that the result set $P_{LO}$ will indeed change. In this context, we next define $\theta_{min}$ as the minimum value for the new similarity threshold such that query $k\text{SPwLO}(G, s, t, k, \theta)$ will return an updated result as:
$$\theta_{min} = \min_{p \in P_{cand} \setminus P_{LO}} \mathit{Sim}_{max}(p, P_{LO}) \tag{5}$$
Essentially, if the new threshold is set inside $[\theta, \theta_{min})$, the $P_{LO}$ result remains unchanged, while a value equal to or higher than $\theta_{min}$ will cause $P_{LO}$ to change.
To ensure a complete result, we need to progressively increase $\theta_{min}$ until $P_{LO}$ contains exactly $k$ paths. This iterative procedure is captured by Complete_kSPwLO illustrated in Function 1. The function receives as input a set of candidate paths $P_{cand}$ out of which the result will be extracted, the number $k$ of requested paths, and an initial similarity threshold $\theta$. The first step is to sort the paths in $P_{cand}$ in increasing length order. Note that if $P_{cand}$ contains fewer than $k$ paths, it is impossible to return a complete result; hence, the function terminates and $P_{LO} \leftarrow P_{cand}$.
Next, between Lines 4 and 15 the function progressively relaxes the similarity threshold $\theta$ until the $P_{LO}$ result is complete. At each round, $\theta_{min}$ and $P_{LO}$ are re-initialized to 1 and the shortest path $p_{sp}$, respectively (Lines 5-6); note that $p_{sp}$ is the first path in $P_{cand}$. Then, between Lines 7 and 14, Complete_kSPwLO examines the paths in $P_{cand}$ in increasing length order. Fix such a path $p$. If $p$ is alternative to the current result set $P_{LO}$, $p$ is added to $P_{LO}$. At this point, the function will terminate if $P_{LO}$ already contains $k$ paths as the result set is now complete. In contrast, if the current path $p$ is not alternative to $P_{LO}$, the current $\theta_{min}$ is checked against $\mathit{Sim}_{max}$ of $p$ to $P_{LO}$ in Line 13 and is updated in Line 14 accordingly using Eqs. 4 and 5. Finally, the value of $\theta$ is relaxed, i.e., increased to $\theta_{min}$ to prepare for the next round. Note that the while loop eventually terminates due to the condition in Line 10, i.e., after $k$ paths are added to $P_{LO}$.
Example 7 Consider the $k\text{SPwLO}(G, s, t, 5, 0.3)$ query on the road network in Fig. 9 and its incomplete result $P_{LO} = \{p_1, p_5, p_{11}\}$. Also assume that $P_{cand}$ contains all 24 possible paths from $s$ to $t$. Figure 10 illustrates the process of relaxing similarity threshold $\theta$ and completing the result set. The paths contained inside the initial $P_{LO}$ are marked with (*).
The first round starts by setting $\theta_{min} = 1$ and $P_{LO} = \{p_1\}$. We then iterate over the paths in $P_{cand}$. Path $p_2$ is more than 30% similar to $p_1$ and so it is ignored, but it enables us to update $\theta_{min} = 0.75$. In the same manner, $p_3$ is ignored but allows us to set $\theta_{min} = 0.375$. The first sufficiently dissimilar path to $p_1$ is $p_4$; hence, $P_{LO}$ is updated to $\{p_1, p_4\}$. All subsequent paths are not alternative to the current $P_{LO}$ until $p_{11}$, which is added to $P_{LO}$. Complete_kSPwLO continues in the same manner until all 24 paths are examined. Note that when $p_{17}$ is examined, $\theta_{min}$ is set to 0.364, which is also the value at the end of the round. Consequently, $\theta$ gets a new value $\theta = \theta_{min} = 0.364$, indicating that in order for $P_{LO}$ to change, paths should be allowed to be at most 36.4% similar to each other instead of the initial 30%. Columns 4-6 in Fig. 10 report all path similarities computed during the first round.
The second round starts by setting $\theta_{min}$ and $P_{LO}$ to 1 and $\{p_1\}$, respectively; remember that $\theta$ is now 0.364. Complete_kSPwLO operates exactly as in the first round until $p_{17}$ is examined. This time, the path is sufficiently dissimilar to the current result set, and $P_{LO}$ is updated to $\{p_1, p_4, p_{11}, p_{17}\}$. At the end of the second round, the similarity threshold is further relaxed to $\theta = 0.375$. For illustration purposes, Fig. 10 reports only the extra path similarities computed in the second round.
Finally, the third round commences by setting again $\theta_{min} = 1$ and $P_{LO} = \{p_1\}$. Due to the new threshold $\theta = 0.375$, Complete_kSPwLO adds to $P_{LO}$ paths $p_1$, $p_3$, $p_4$, $p_6$, and $p_{15}$, in this order. Note that it is possible to add $p_{15}$ because $p_{11}$ is now excluded from the result as $\mathit{Sim}(p_{11}, p_6) = 0.58 > \theta$. The process terminates after adding $p_{15}$, as $P_{LO}$ contains $k = 5$ paths.
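The relaxation loop can be sketched in Python as follows. This is an illustrative sketch of the procedure described above, not Function 1 itself: the graph is a hypothetical dictionary `graph[u][v] = weight`, paths are node lists, similarity is assumed to be the shared edge length divided by the length of the result path, $k \ge 1$ is assumed, and the names `complete_kspwlo`, `path_length` and `similarity` are made up for this example.

```python
def path_length(graph, path):
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


def similarity(graph, path, ref):
    """Assumed measure: shared edge length divided by the length of `ref`."""
    ref_edges = set(zip(ref, ref[1:]))
    shared = sum(graph[u][v] for u, v in zip(path, path[1:]) if (u, v) in ref_edges)
    return shared / path_length(graph, ref)


def complete_kspwlo(graph, candidates, k, theta):
    """Relax theta until k candidates fit; returns (result, final theta).
    Assumes k >= 1 and that `candidates` contains the shortest path."""
    cand = sorted(candidates, key=lambda p: path_length(graph, p))
    if len(cand) < k:
        return cand, theta                     # a complete result is impossible
    while True:
        theta_min, result = 1.0, []
        for p in cand:                         # increasing length order
            # Maximum similarity of p to the current result set.
            sim_max = max((similarity(graph, p, r) for r in result), default=0.0)
            if sim_max <= theta:               # p is an alternative to the result
                result.append(p)
                if len(result) == k:
                    return result, theta
            else:                              # smallest threshold that changes the result
                theta_min = min(theta_min, sim_max)
        theta = theta_min                      # relax and start the next round


# Hypothetical toy network and candidate set (not the example of Fig. 9).
graph = {"s": {"a": 2, "b": 3}, "a": {"t": 2, "b": 2}, "b": {"t": 3}, "t": {}}
candidates = [["s", "a", "b", "t"], ["s", "b", "t"], ["s", "a", "t"]]
print(complete_kspwlo(graph, candidates, 3, 0.3))
```

On the toy input the first round rejects the longest path with a maximum similarity of 0.5, so the threshold is relaxed from 0.3 to 0.5 and the second round returns all three paths.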
Theorem 3 Given a set of paths $P_{cand}$ from $s$ to $t$ and a similarity threshold $\theta$, Complete_kSPwLO determines the lowest value for $\theta_{min} \ge \theta$ such that there is a solution set $P_{LO} \subseteq P_{cand}$ with $|P_{LO}| = k$.
Proof Let $P_{cand}$ be an input set of at least $k$ distinct paths from $s$ to $t$. Also, let Complete_kSPwLO terminate after $I$ iterations of its main while-loop. Let $\theta_i$ denote the value of the similarity threshold and $k_{\theta_i}$ the size of the result set $P_{LO}$ during the $i$-th iteration. We prove by induction on $i \in \{1, \ldots, I\}$ that $k_{\theta'} < k$ holds for all $\theta' \in [\theta, \theta_I)$. If $I = 1$, i.e., the function Complete_kSPwLO terminates after a single iteration, we have $\theta_1 = \theta$, which is by definition the minimum possible value. Now, if the inductive hypothesis holds for $i < I$ but not for $i + 1$, there is a $\theta' \in [\theta, \theta_{i+1})$ such that $k_{\theta'} = k$. We know from the hypothesis that $\theta' \ge \theta_i$. Furthermore, we know from the definition of $\theta_i$ and $\theta_{i+1}$ that all $\theta' \in [\theta_i, \theta_{i+1})$ yield the same result set (cf. Eqs. 4 and 5). Together, these observations imply that $k_{\theta_i} = k$, which contradicts the fact that function Complete_kSPwLO did not terminate at the $i$-th iteration.

Complexity analysis The runtime of function Complete_kSPwLO depends on the size $|P_{cand}|$ of the candidate set. At each round, the number of paths examined by Complete_kSPwLO is $O(|P_{cand}|)$. Determining whether a path should be added to $P_{LO}$ or not requires $O(k)$ time. Furthermore, as for each path we keep at most $k - 1$ similarities with paths in the result set, to add a single path to $P_{LO}$ we need, in the worst case, to run an iteration for all the similarities of all paths, i.e., $O(|P_{cand}| \cdot k)$ iterations. Since we need to fill the result set with $k$ paths, the total number of iterations is $O(|P_{cand}| \cdot k^2)$. Therefore, the overall runtime complexity of Complete_kSPwLO is $O(|P_{cand}|^2 \cdot k^3)$.

The SVP-C and ESX-C algorithms
As discussed in the previous subsection, the Complete_kSPwLO function can operate with any arbitrary set of $(s \to t)$ paths as input. The only requirement dictated by Definition 2 for $P_{cand}$ is to include the shortest path from $s$ to $t$.
Paradigm 1 outlines the completeness-oriented computation of $k\text{SPwLO}(G, s, t, k, \theta)$ queries. Initially, the query is processed by a kSPwLO algorithm. Besides $P_{LO}$, the algorithm also returns the candidate set $P_{cand}$ of paths from $s$ to $t$. If $P_{LO}$ contains $k$ paths, the result is already complete and the computation terminates. Otherwise, the completeness process takes over between Lines 4 and 6. To deliver a complete result set, $P_{cand}$ must contain $k$ or more paths. To this end, if $P_{cand}$ contains fewer than $k$ paths, we add to it the $k$-shortest paths from $s$ to $t$ (Lines 4-5). Finally, $P_{cand}$ is fed to Complete_kSPwLO to produce a complete $P_{LO}$ result in Line 6.
In practice, using all possible paths from $s$ to $t$ as $P_{cand}$ is prohibitively expensive. Therefore, we rely on the kSPwLO algorithms to provide a set of candidate paths. Not all of our kSPwLO algorithms are compatible with Paradigm 1 though. Algorithms that traverse the original network, i.e., the exact OnePass and MultiPass and the heuristic OnePass+, do not qualify for this purpose, as the only $(s \to t)$ paths these algorithms construct are the ones that constitute the result set. On the contrary, this is possible with SVP+ and ESX; recall that SVP+ considers single-via paths as candidate results (cf. Algorithm 4, Line 9) while ESX constructs candidate paths by removing edges (cf. Algorithm 5, Line 11). We denote by SVP-C and ESX-C the algorithms that follow Paradigm 1 and employ SVP+ and ESX, respectively, to compute an initial $P_{LO}$ result and a set of candidate paths $P_{cand}$.

Optimization with lower bounds
To further improve the performance of both our exact and heuristic algorithms, we employ a lower bound $\underline{d}(n, t)$ for the network distance $d(n, t)$ of every node $n$ to the target $t$. Such a lower bound enables algorithms to direct the traversal toward the target and has been employed by various existing works as well [22,30]. Also, such bounds can be computed in a preprocessing phase [32]. However, to enable our algorithms to work on road networks with changing edge weights, we compute the bounds at query time.
To derive tight $\underline{d}(n, t)$ lower bounds, we run Dijkstra's algorithm [14] in reverse from target $t$ to every node $n$ of the road network. By executing such an all-to-one query, we obtain for every node $n$ its exact distance $d(n, t)$ to the target $t$, which is the tightest possible lower bound. This computation takes place at the beginning of the execution of all algorithms that employ this optimization, i.e., OnePass, MultiPass, OnePass+, and ESX. Instead of simply computing the shortest path from $s$ to $t$, we compute the shortest path tree from target $t$ to each node $n$ in the road network.
Optimizing OnePass/MultiPass/OnePass+. In principle, OnePass, MultiPass and OnePass+ utilize lower bounds in the same fashion. Instead of sorting labels into the priority queue based on their distance from the source, each label associated with some node $n$ is sorted based on the total estimated distance $d(s, n) + \underline{d}(n, t)$. Apart from reducing the search space of the traversal, the pruning power of the algorithms is enhanced as well. Paths to nodes that are far away from the target are less likely to share edges with already recommended paths. Instead, paths to nodes that are closer to the target are more likely to share edges with already recommended paths and are therefore more likely to be pruned.
Optimizing ESX. As we explained in Sect. 6.3, ESX computes alternative paths by executing shortest path searches repeatedly. By employing the aforementioned lower bounds, ESX uses A*-search [15]. Since the lower bounds are the tightest possible ones, the search space of the traversal is expected to be small. While the quality of the bounds drops after each iteration as they are not updated after each edge removal, the correctness of the A*-search is still ensured.
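The following Python sketch illustrates how such bounds could be obtained and used. It is a sketch under stated assumptions, not the authors' C++ code: the graph is a hypothetical dictionary `graph[u][v] = weight`, `distances_to_target` runs Dijkstra on the reversed graph to obtain the exact $d(n, t)$ values, and `a_star` uses them as lower bounds while ignoring a set of removed edges, as ESX would after excluding edges. Both function names are made up for this example.

```python
import heapq

INF = float("inf")


def distances_to_target(graph, t):
    """Exact d(n, t) for every node that can reach t (reverse Dijkstra)."""
    rev = {}
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            rev.setdefault(v, {})[u] = w
    dist, queue = {t: 0}, [(0, t)]
    while queue:
        d, n = heapq.heappop(queue)
        if d > dist[n]:
            continue                          # stale queue entry
        for m, w in rev.get(n, {}).items():
            if d + w < dist.get(m, INF):
                dist[m] = d + w
                heapq.heappush(queue, (d + w, m))
    return dist


def a_star(graph, s, t, lower_bound, removed=frozenset()):
    """A* from s to t that skips edges in `removed`.
    Returns (length, path) or None if t is unreachable."""
    cost, parent = {s: 0}, {s: None}
    queue, closed = [(lower_bound.get(s, INF), s)], set()
    while queue:
        _, n = heapq.heappop(queue)
        if n in closed:
            continue
        closed.add(n)
        if n == t:
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return cost[t], path[::-1]
        for m, w in graph.get(n, {}).items():
            if (n, m) in removed or m not in lower_bound:
                continue                      # excluded edge, or m cannot reach t
            if cost[n] + w < cost.get(m, INF):
                cost[m], parent[m] = cost[n] + w, n
                # Queue key: distance from s plus lower bound to t.
                heapq.heappush(queue, (cost[m] + lower_bound[m], m))
    return None


# Hypothetical toy network.
graph = {"s": {"a": 2, "b": 3}, "a": {"t": 2, "b": 2}, "b": {"t": 3}, "t": {}}
bounds = distances_to_target(graph, "t")
print(a_star(graph, "s", "t", bounds))
print(a_star(graph, "s", "t", bounds, removed={("a", "t")}))
```

Removing edges can only increase true distances, so the distances computed on the full network remain valid lower bounds on the reduced network; they merely become less tight, which matches the remark above that the quality of the bounds drops while correctness is preserved.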

Experimental evaluation
In this section, we report the results of our experiments that involve ten real-world road networks obtained from three different sources [1,5,19]. We selected road networks with different structural characteristics. Table 1 shows the number of nodes and edges, and the structure of each road network.
To assess the performance of all algorithms, we measure their average runtime over 1000 kSPwLO queries with randomly selected source and target nodes, while varying the number k of requested paths and the similarity threshold θ . In each experiment, we vary one of the two parameters and set  the other to its default value, i.e., k = 3 and θ = 0.5. For our performance-oriented heuristic algorithms, we also measure the quality, i.e., the shortness of the alternative paths, by comparing their average length to the length of the shortest path for each query, and the completeness of the result set, i.e., the percentage of queries for which an algorithm returns exactly k paths. For our completeness-oriented heuristic algorithms, we identify the maximum pairwise similarity between the results of each query such that the result set contains k paths, and we report the average similarity value for all queries. All algorithms were implemented in C++ 6 , the code was compiled using GNU G++ 8, and the tests run on a machine with 12 Intel Xeon E5-2650 (2.20GHz) processors and 256GB RAM running Ubuntu Linux. Moreover, our implementations of OnePass, MultiPass and OnePass + employ the lower bounds of Sect. 8. Figure 11 reports the runtime of our exact algorithms for processing kSPwLO queries while varying parameter k. As expected, the runtime of all algorithm goes up for increasing values of k since more paths need to be examined. Mul-tiPass clearly outperforms both OnePass and FindKSPD [22] and, in most cases, by a large margin. By utilizing the pruning criterion of Lemma 3, MultiPass is able to significantly reduce the total number of examined paths, even though it scans the network multiple times. Furthermore, while OnePass is always faster than FindKSPD, none of these two algorithms scale. Even for the road networks of Rome and Oldenburg, the smallest networks used in our experiment, both algorithms require in most cases several seconds on average to process kSPwLO queries. 
Figure 12 reports the runtime of our exact algorithms for processing kSPwLO queries while varying parameter θ. As in the experiment varying k, MultiPass is clearly the fastest exact algorithm. We also observe that the runtime of OnePass and FindKSPD increases for decreasing values of θ, while the runtime of MultiPass peaks for θ = 0.3. This result reveals an important trade-off: as θ increases, the pruning power of Lemma 2 deteriorates, and MultiPass constructs more (partial) paths. At the same time, the next path added to the result set is shorter due to the higher similarity threshold, and hence, the algorithm terminates earlier. With a decreasing θ, the pruning power of Lemma 3 also increases and more partial paths are pruned.

Heuristic algorithms
Next, we report the performance, result quality, and completeness of our heuristic algorithms.
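The two quality measures used for the heuristic algorithms, shortness and completeness, can be stated concretely with a small sketch. The function name `evaluate` and the dictionary keys `shortest` and `lengths` are hypothetical and serve only for illustration. The sketch assumes that shortness is averaged only over the queries for which the algorithm under test returned exactly k paths, which is a simplification of the protocol described for the individual figures.

```python
from statistics import mean


def evaluate(queries, k):
    """Summarize result quality over a batch of queries (illustrative only).

    Each query is a dict with two hypothetical keys:
      'shortest': length of the shortest path for that query
      'lengths':  lengths of the paths returned by the algorithm under test
    Returns (shortness, completeness), where completeness is a percentage.
    """
    # Completeness: share of queries with exactly k returned paths.
    complete = [q for q in queries if len(q["lengths"]) == k]
    completeness = 100.0 * len(complete) / len(queries)

    # Shortness: average result length relative to the shortest path,
    # taken here only over queries with a complete result set (assumption).
    ratios = [mean(q["lengths"]) / q["shortest"] for q in complete]
    shortness = mean(ratios) if ratios else float("nan")
    return shortness, completeness


if __name__ == "__main__":
    demo = [
        {"shortest": 10, "lengths": [10, 12, 14]},  # complete for k = 3
        {"shortest": 20, "lengths": [20, 22]},      # incomplete for k = 3
    ]
    print(evaluate(demo, k=3))
```

A shortness value of 1.2 in this sketch would mean that the returned paths are, on average, 20% longer than the shortest path.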

Comparison of ESX variants
Before presenting the results of our experiments for all heuristic algorithms, we analyze the effectiveness of different ESX variants/edge removal strategies (cf. Sect. 6.4) to determine the most efficient one. For this purpose, we present our measurements on the road networks of San Joaquin and Tianjin.
Runtime Figure 13
Regarding the prioritization of edges with low priority (i.e., Min* variants) or high priority (i.e., Max* variants), removing edges with high priority first is, as expected, more efficient.
Figure 15

Performance-oriented heuristic algorithms
Runtime. Figures 16 and 17 report the response time of our performance-oriented heuristic algorithms OnePass+, SVP+ and ESX on four road networks that are clearly larger than the ones we used for measuring the performance of the exact algorithms. We also include the fastest exact algorithm, i.e., MultiPass.
Figure 16 reports the response time of the heuristic algorithms and MultiPass varying the number of requested paths k. While, as expected, the runtime of all algorithms increases with k, the efficiency of MultiPass and OnePass+ deteriorates much faster. For k ≥ 3, MultiPass is approximately three times slower than OnePass+ and more than three orders of magnitude slower than SVP+ and ESX. OnePass+ is also clearly outperformed by SVP+ and ESX on all road networks for k ≥ 3, by approximately three orders of magnitude. In brief, ESX is clearly the fastest algorithm in all cases, SVP+ comes second in all networks for k ≥ 3, while MultiPass and OnePass+ have comparable performance to SVP+ and ESX only for k = 2.
Figure 17 reports the response time of the heuristic algorithms and MultiPass varying the similarity threshold θ. The overall picture is the same as in Fig. 16, i.e., SVP+ and ESX are the clear winners, and MultiPass is approximately three times slower than OnePass+, which in turn is up to two orders of magnitude slower than SVP+ and ESX. An interesting observation, though, is that while the response time of SVP+ and ESX decreases with increasing values of θ, MultiPass and OnePass+ show a local maximum for θ = 0.3. This indicates an important trade-off: as θ increases, the pruning power of Lemma 2 deteriorates, and both MultiPass and OnePass+ construct more (partial) paths. At the same time, the higher similarity threshold causes each next path to be determined faster and the algorithms to terminate earlier. With θ decreasing, the pruning power of Lemma 3 also increases and more partial paths are pruned.
Result shortness. Figure 18 shows the average length difference to the shortest path p_sp of the exact solution and of the results computed by the heuristic algorithms OnePass+, SVP+ and ESX. Note that only queries for which all algorithms returned a complete result set, i.e., a set of k paths, are considered. Naturally, the exact kSPwLO result provides the shortest alternatives. Looking at the heuristic algorithms, OnePass+ produces the shortest alternative paths, which are very close to the paths in the exact solution. Both SVP+ and ESX recommend alternative paths that are up to 15% longer on average than the paths in kSPwLO, with SVP+ returning slightly shorter ones than ESX.
Completeness. As already discussed, the algorithms for kSPwLO queries are not always able to compute all k requested alternative paths. Table 2 reports, for each algorithm, the percentage of queries for which exactly k alternative paths were found. Naturally, the exact solution kSPwLO has the highest completeness ratio. OnePass+ is very close to the exact solution, achieving a completeness ratio of more than 90% in all scenarios. SVP+ and ESX show a similar completeness ratio, i.e., over 90%, in all scenarios apart from the case where k = 3 and θ = 0.1 (i.e., the alternative paths are very dissimilar to each other). In this case, the completeness ratio of SVP+ and ESX is clearly lower than that of the exact solution and OnePass+. Nevertheless, the completeness ratio of ESX is above 80% in all cases and is clearly higher than the completeness ratio of SVP+.
In Fig. 19, we analyze the runtime performance of SVP+ and ESX for large values of k and large road networks, setting θ = 0.5. For k ≤ 10, we observe that the runtime of ESX and the runtime of SVP+ are similar, with ESX being slightly faster. For k > 10 and the road networks of Milan and Florida, ESX is clearly faster than SVP+. However, for k = 20 and the road networks of Chicago and Colorado, SVP+ is faster than ESX. This result is connected to the completeness of SVP+, which we analyze below.
In Fig. 20, we report the average length difference to p_sp of the result paths computed by SVP+ and ESX. For the road networks of Milan and Chicago, we observe that the two algorithms recommend alternative paths of similar length. More specifically, for k ≤ 10 the two algorithms compute alternative paths of similar length, with the ones returned by SVP+ being slightly shorter, while for k > 10, ESX clearly computes shorter alternative paths. On the sparser road networks of Colorado and Florida, for
Table 3 reports on the completeness of the SVP+ and ESX results. For the road networks of Milan and Chicago, we observe that while ESX demonstrates a higher completeness ratio than SVP+, the completeness ratio of both algorithms is in most cases over 90%. For θ < 0.5 though, ESX is clearly better than SVP+, while both algorithms struggle to compute a complete result for θ = 0.
Fig. 21 Performance of SVP+, ESX, SVP-C and ESX-C varying similarity threshold θ (k = 10)

Completeness-oriented heuristic algorithms
Runtime. In Fig. 21, we report the runtime of SVP-C and ESX-C, varying the similarity threshold θ with k = 10, and we compare them to their performance-oriented counterparts, i.e., SVP+ and ESX. In the road networks of Milan and Chicago for θ ≥ 0.5, and in the road networks of Colorado and Florida for θ ≥ 0.7, we observe that all algorithms have similar runtime. For kSPwLO queries where SVP+ and ESX return a complete result, SVP-C and ESX-C do not need to invoke function Complete_kSPwLO. Hence, the runtime of SVP-C and ESX-C is almost the same as that of SVP+ and ESX, respectively. For smaller values of θ though, we observe in all networks that ESX-C clearly outperforms SVP-C. Also, the difference between the runtime of SVP+ and SVP-C is much greater than the difference between the runtime of ESX and ESX-C. As the completeness ratio of SVP+ is fairly low for small values of θ, function Complete_kSPwLO has to be invoked many times to relax θ and compute a complete result.
To gain additional insight into the performance of our completeness-oriented algorithms, in Fig. 22 we report the number of paths in the candidate set of SVP-C and ESX-C for all datasets, for the case where both SVP-C and ESX-C demonstrated the lowest completeness ratio, i.e., k = 10 and θ = 10%. We observe that the P_cand used by SVP-C is around three orders of magnitude larger than the P_cand used by ESX-C. This difference explains the difference in the runtime of the two algorithms, since ESX-C invokes Complete_kSPwLO using a much smaller P_cand as input.
Pairwise Similarity. Figure 23 reports the average relaxed similarity threshold θ_min for which algorithms SVP-C and ESX-C return a complete result, varying the initial similarity threshold θ. The line above each bar indicates the average θ_min only for queries for which the initial θ resulted in an incomplete result. In all road networks, for θ > 0.5, the relaxed similarity threshold θ_min computed by SVP-C and ESX-C is almost the same. For θ ≤ 0.5, SVP-C performs slightly better in the road networks of Chicago and Milan, while the opposite is true in the road networks of Colorado and Florida. However, this result is influenced directly by the completeness ratios of SVP+ and ESX. In contrast, for queries for which the initial θ leads to an incomplete result, SVP-C finds a complete result for a much smaller θ_min than ESX-C. These observations hint that the P_cand constructed by SVP-C is not only larger but also more diverse than the one constructed by ESX-C.
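To illustrate how a relaxed threshold can be obtained from a candidate set, the following sketch repeatedly raises θ until a greedy scan over the candidates yields k paths. It is not the Complete_kSPwLO function itself; the greedy scan, the fixed relaxation step, and all function names are hypothetical. The similarity measure (shared length divided by the length of the shorter path) is likewise an assumption made for the example and may differ from the definition used earlier in the paper.

```python
from itertools import combinations


def edges(path):
    """Directed edges of a simple path given as a list of nodes."""
    return set(zip(path, path[1:]))


def length(path, w):
    """Total weight of a path; w maps (u, v) edges to weights."""
    return sum(w[e] for e in edges(path))


def similarity(p, q, w):
    """Assumed measure: shared length over the length of the shorter path."""
    shared = sum(w[e] for e in edges(p) & edges(q))
    return shared / min(length(p, w), length(q, w))


def max_pairwise_similarity(paths, w):
    """Largest similarity between any two paths of a result set."""
    return max((similarity(p, q, w) for p, q in combinations(paths, 2)),
               default=0.0)


def greedy_select(candidates, k, theta, w):
    """Scan candidates by increasing length and keep a path only if its
    similarity to every path kept so far does not exceed theta."""
    result = []
    for p in sorted(candidates, key=lambda p: length(p, w)):
        if all(similarity(p, r, w) <= theta for r in result):
            result.append(p)
            if len(result) == k:
                break
    return result


def relax_until_complete(candidates, k, theta, w, step=0.1):
    """Raise theta in fixed steps (hypothetical strategy) until the greedy
    scan returns k paths or theta reaches 1. Returns (paths, final theta)."""
    while True:
        result = greedy_select(candidates, k, theta, w)
        if len(result) == k or theta >= 1.0:
            return result, theta
        theta = min(1.0, round(theta + step, 10))


if __name__ == "__main__":
    w = {("s", "a"): 1, ("a", "t"): 1, ("a", "b"): 1,
         ("b", "t"): 1, ("s", "c"): 2, ("c", "t"): 2}
    cands = [["s", "a", "t"], ["s", "a", "b", "t"], ["s", "c", "t"]]
    paths, theta_relaxed = relax_until_complete(cands, k=3, theta=0.3, w=w)
    print(paths, theta_relaxed, max_pairwise_similarity(paths, w))
```

Because θ is raised on a fixed grid, the threshold returned by this sketch is only an upper bound on the maximum pairwise similarity of the selected paths. A larger and more diverse candidate set gives the scan more chances to succeed at a small θ.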

Summary of findings
To sum up, our experimental analysis yields three key findings. First, our tests for the exact computation of kSPwLO are in line with the theoretical analysis on the optimality of MultiPass (cf. Sect. 5.2). MultiPass is the fastest exact algorithm, outperforming both OnePass and FindKSPD [22] by a significant margin. However, MultiPass is practical only for k = 2. For k > 2, MultiPass is practical only on small road networks, as its response time even for mid-sized road networks is prohibitively high.
Second, two out of our three performance-oriented heuristic algorithms manage to address this scalability issue. More specifically, despite being an improvement over MultiPass in terms of runtime and computing a result set that is close to the exact solution, OnePass+ does not scale on large road networks. In contrast, SVP+ and ESX are both significantly faster than OnePass+ and able to scale, but on average they compute slightly longer alternative paths. Overall, ESX is the best choice as it is the fastest performance-oriented heuristic algorithm while recommending more and shorter alternative paths than SVP+.
Finally, in applications where a complete result is required, i.e., exactly k alternative paths must be retrieved, we distinguish between two cases. If the response time matters more than the quality of the complete result set, ESX-C is the algorithm of choice as it inherits the performance advantage of ESX. However, if result quality is more important, SVP-C is preferred, as its candidate set enables Complete_kSPwLO to find a result set of more dissimilar paths than the candidate set of ESX-C does.
To provide additional insights on how our approach could be used in practice, in Fig. 24 we visualize different sets of 3 alternative paths between two locations in the city of Oldenburg, setting the similarity threshold θ = 50%. Figure 24a shows the k-shortest paths, which clearly have little practical value since they are too similar to each other. In contrast, both the kSPwLO shown in Fig. 24b and the results of our heuristic algorithms shown in Fig. 24d-f offer much more attractive alternative paths. As the heuristic algorithms compute results that are fairly similar to the kSPwLO, we expect these results to be satisfactory in applications where response time is important. We also visualize the result of a randomized search to examine how much better or worse the alternative paths would be if they were selected at random. We first add the shortest path to the result set, and then we execute k−1 random walks in order to obtain k paths in total. We repeat this process multiple times and keep the best result in terms of shortness. Figure 24c shows that the alternative paths obtained using this randomization are too long and contain too many needless detours, even after 1000 iterations to improve the result quality.
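The randomized baseline described above can be sketched as follows. Several details are assumptions made for the example and are not specified in the text: the walks choose uniformly among outgoing edges, they may revisit nodes, a walk is abandoned after a step cap, no similarity constraint is checked, and "best in terms of shortness" is read as the smallest total length of the k−1 walks. All function names are hypothetical.

```python
import heapq
import random


def dijkstra_path(graph, s, t):
    """Shortest s-t path in graph = {node: {neighbor: weight}}.
    Assumes t is reachable from s."""
    dist, prev, done = {s: 0.0}, {}, set()
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            break
        for v, weight in graph[u].items():
            nd = d + weight
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]


def random_walk(graph, s, t, rng, max_steps=10_000):
    """Uniform random walk from s; returns None if t is not reached
    within max_steps. Assumes every node has an outgoing edge."""
    path = [s]
    while path[-1] != t and len(path) <= max_steps:
        path.append(rng.choice(sorted(graph[path[-1]])))
    return path if path[-1] == t else None


def path_length(graph, path):
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


def randomized_baseline(graph, s, t, k, iterations=1000, seed=0):
    """Shortest path plus k-1 random walks; keep the shortest attempt."""
    rng = random.Random(seed)
    shortest = dijkstra_path(graph, s, t)
    best, best_total = None, float("inf")
    for _ in range(iterations):
        walks = [random_walk(graph, s, t, rng) for _ in range(k - 1)]
        if any(walk is None for walk in walks):
            continue  # a walk hit the step cap; discard this attempt
        total = sum(path_length(graph, walk) for walk in walks)
        if total < best_total:
            best, best_total = [shortest] + walks, total
    return best


if __name__ == "__main__":
    g = {"s": {"a": 1, "b": 2}, "a": {"s": 1, "t": 1},
         "b": {"s": 2, "t": 2}, "t": {"a": 1, "b": 2}}
    for p in randomized_baseline(g, "s", "t", k=3, iterations=50):
        print(p, path_length(g, p))
```

On a realistic road network, walks of this kind wander for a long time before they reach the target, which is consistent with the long, detour-heavy paths reported for Fig. 24c.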

Conclusions
In this paper, we studied the problem of alternative routing on road networks. Our goal was to recommend k paths that are sufficiently dissimilar to each other and as short as possible. To this end, we proposed kSPwLO, which minimizes the length of each individual path in the result set, and we showed that kSPwLO is weakly NP-hard. For answering kSPwLO queries, we presented two exact algorithms, three performance-oriented heuristic algorithms and two completeness-oriented heuristic algorithms. Through an extensive experimental evaluation, we demonstrated the performance of all algorithms in terms of runtime and result quality, and we identified the use cases for which each algorithm is useful and the trade-offs each algorithm comes with.
In the future, we plan to extend the definition of alternative routing by considering multiple criteria and constraints to match the requirements of a wider range of applications. Moreover, we plan to adapt our algorithms for time-dependent and dynamic traffic-aware road networks. Finally, we plan to investigate the computation of dissimilar paths on different types of networks such as social networks and web graphs.