Four Soviets Walk the Dog: Improved Bounds for Computing the Fréchet Distance

Given two polygonal curves in the plane, there are many ways to define a notion of similarity between them. One popular measure is the Fréchet distance. Since it was proposed by Alt and Godau in 1992, many variants and extensions have been studied. Nonetheless, even more than 20 years later, the original $O(n^2 \log n)$ algorithm by Alt and Godau for computing the Fréchet distance remains the state of the art (here, $n$ denotes the number of edges on each curve). This has led Helmut Alt to conjecture that the associated decision problem is 3SUM-hard. In recent work, Agarwal et al. show how to break the quadratic barrier for the discrete version of the Fréchet distance, where one considers sequences of points instead of polygonal curves. Building on their work, we give a randomized algorithm to compute the Fréchet distance between two polygonal curves in time $O(n^2 \sqrt{\log n}(\log \log n)^{3/2})$ on a pointer machine and in time $O(n^2(\log \log n)^2)$ on a word RAM.
Furthermore, we show that there exists an algebraic decision tree for the decision problem of depth $O(n^{2-\varepsilon})$, for some $\varepsilon > 0$. We believe that this reveals an intriguing new aspect of this well-studied problem. Finally, we show how to obtain the first subquadratic algorithm for computing the weak Fréchet distance on a word RAM.

Keywords Fréchet distance · Word RAM · Pointer machine · Algebraic decision tree · Four Russians trick

Introduction
Shape matching is a fundamental problem in computational geometry, computer vision, and image processing. A simple version can be stated as follows: given a database D of shapes (or images) and a query shape S, find the shape in D that most resembles S. However, before we can solve this problem, we first need to address an issue: what does it mean for two shapes to be "similar"? In the mathematical literature, one can find many different notions of distance between two sets, a prominent example being the Hausdorff distance. Informally, the Hausdorff distance is defined as the maximal distance between two elements when every element of one set is mapped to the closest element in the other. It has the advantage of being simple to describe and easy to compute for discrete sets. In the context of shape matching, however, the Hausdorff distance often turns out to be unsatisfactory: it does not take the continuity of the shapes into account. There are well-known examples where the distance fails to capture the similarity of shapes as perceived by human observers [6].
In order to address this issue, Alt and Godau introduced the Fréchet distance into the computational geometry literature [8,45]. They argued that the Fréchet distance is better suited as a similarity measure, and they described an $O(n^2 \log n)$ time algorithm to compute it on a real RAM or pointer machine. Since Alt and Godau's seminal paper, there has been a wealth of research in various directions, such as extensions to higher dimensions [7,23,26,28,33,46], approximation algorithms [9,10,37], the geodesic and the homotopic Fréchet distance [29,34,38,48], and much more [2,22,25,35,36,51,54,55]. Most known approximation algorithms make further assumptions on the curves, and only an $O(n^2)$-time approximation algorithm is known for arbitrary polygonal curves [24]. The Fréchet distance and its variants, such as dynamic time warping [13], have found various applications, with recent work particularly focusing on geographic applications such as map-matching tracking data [15,63] and moving objects analysis [19,20,47].
Despite the large amount of published research, the original algorithm by Alt and Godau has not been improved, and the quadratic barrier on the running time of the associated decision problem remains unbroken. If we cannot improve on a quadratic bound for a geometric problem despite many efforts, a possible culprit may be the underlying 3SUM-hardness [44]. This situation induced Helmut Alt to make the following conjecture.

Conjecture 1.1 (Alt's conjecture) Let $P$, $Q$ be two polygonal curves in the plane. Then it is 3SUM-hard to decide whether the Fréchet distance between $P$ and $Q$ is at most 1.

Here, 1 can be considered as an arbitrary constant, which can be changed to any other bound by scaling the curves. So far, the best unconditional lower bound for the problem is $\Omega(n \log n)$ steps in the algebraic computation tree model [21].
Recently, Agarwal et al. [1] showed how to achieve a subquadratic running time for the discrete version of the Fréchet distance, running in $O\left(\frac{n^2 \log \log n}{\log n}\right)$ time. Their approach relies on reusing small parts of the solution. We follow a similar approach based on the so-called Four-Russians trick, which precomputes small recurring parts of the solution and uses table lookup to speed up the whole computation. The result by Agarwal et al. is stated in the word RAM model of computation. They ask whether their result can be generalized to the case of the original (continuous) Fréchet distance.

Our Contribution
We address the question by Agarwal et al. and show how to extend their approach to the Fréchet distance between two polygonal curves. Our algorithm requires total expected time $O(n^2 \sqrt{\log n}(\log \log n)^{3/2})$. This is the first algorithm with a running time of $o(n^2 \log n)$ and constitutes the first improvement for the general case since the original paper by Alt and Godau [8]. To achieve this running time, we give the first subquadratic algorithm for the decision problem of the Fréchet distance. We emphasize that these algorithms run on a real RAM/pointer machine and do not require any bit-manipulation tricks. Therefore, our results are more in the line of Chan's recent subcubic-time algorithms for all-pairs shortest paths [30,31] or recent subquadratic-time algorithms for min-plus convolution [16] than the subquadratic-time algorithms for 3SUM due to Baran et al. [12]. If we relax the model to allow constant-time table lookups, the running time can be improved to be almost quadratic, up to $O(\log \log n)$ factors. As in Agarwal et al., our results are achieved by first giving a faster algorithm for the decision version, and then performing an appropriate search over the critical values to solve the optimization problem.
In addition, we show that non-uniformly, the Fréchet distance can be computed in subquadratic time. More precisely, we prove that the decision version of the problem can be solved by an algebraic decision tree [11] of depth $O(n^{2-\varepsilon})$, for some fixed $\varepsilon > 0$. It is, however, not clear how to implement this decision tree in subquadratic time, which hints at a discrepancy between the decision tree and the uniform complexity of the Fréchet problem.
Finally, we consider the weak Fréchet distance, where we are allowed to walk backwards along the curves. In this case, our framework allows us to achieve a subquadratic algorithm on the word RAM. Refer to Table 1 for a comprehensive summary of our results.

Table 1 We distinguish the continuous and the discrete Fréchet distance in the decision (D) and the computation (C) version. We also consider the weak continuous Fréchet distance. The computational models are the pointer machine (PM), the word RAM (WRAM) or the algebraic decision trees (DT). The old bounds are due to Alt and Godau [8] (continuous and weak) and Eiter and Mannila [39] (discrete).

Recent Developments
Recently, Ben Avraham et al. [14] presented a subquadratic algorithm for the discrete Fréchet distance with shortcuts that runs in $O(n^{4/3} \log^3 n)$ time. This running time resembles, at least superficially, our result on algebraic computation trees for the general discrete Fréchet distance. When we initially announced our results, we believed that they provided strong evidence that Alt's conjecture is false. Indeed, for a long time it was conjectured that no subquadratic decision tree exists for 3SUM [57], and an $\Omega(n^2)$ lower bound is known in a restricted linear decision tree model [4,40]. However, in a recent (and in our opinion quite astonishing) result, Grønlund and Pettie showed that if we allow only slightly more powerful algebraic decision trees than in the previous lower bounds, one can decide 3SUM non-uniformly in $O(n^{2-\varepsilon})$ steps, for some fixed $\varepsilon > 0$ [52]. They also show that this leads to a general subquadratic algorithm for 3SUM, a situation very similar to the Fréchet distance as described in the present paper. Thus, despite some interesting developments, the status of Alt's conjecture remains as open as before. However, we can now see that there exists a wide variety of efficiently solvable problems such as (in addition to 3SUM and the Fréchet distance) Sorting X + Y [41], Min-Plus-Convolution [16], or finding the Delaunay triangulation for a point set that has been sorted in two orthogonal directions [27], for which there seems to be a noticeable gap between the decision tree complexity and the uniform complexity.
In our initial announcement, we also asked whether, besides 3SUM-hardness, there may be other reasons to believe that the quadratic running time for the Fréchet distance cannot be improved. Karl Bringmann provided an interesting answer to this question by showing that any algorithm for the Fréchet distance with running time $O(n^{2-\varepsilon})$, for some fixed $\varepsilon > 0$, would violate the strong exponential time hypothesis (SETH) [17]. These results were later refined and improved to show that the lower bound holds in basically all settings (with the notable exception of the one-dimensional continuous Fréchet distance, which is still unresolved) [18]. We believe that these developments show that the Fréchet distance still holds many interesting aspects to be discovered and remains an intriguing object of further study.

Preliminaries and Basic Definitions
Let $P$ and $Q$ be two polygonal curves in the plane, defined by their vertices $p_0, p_1, \dots, p_n$ and $q_0, q_1, \dots, q_n$. Depending on the context, we interpret $P$ and $Q$ either as sequences of $n$ edges each, or as continuous functions $P : [0, n] \to \mathbb{R}^2$ and $Q : [0, n] \to \mathbb{R}^2$. In the latter case, we have $P(i + \lambda) = (1 - \lambda) p_i + \lambda p_{i+1}$ for $i = 0, \dots, n - 1$ and $\lambda \in [0, 1]$, and similarly for $Q$. Let $\Psi$ be the set of all continuous and nondecreasing functions $\alpha : [0, 1] \to [0, n]$ with $\alpha(0) = 0$ and $\alpha(1) = n$. The Fréchet distance between $P$ and $Q$ is defined as

$$d_F(P, Q) := \inf_{\alpha, \beta \in \Psi} \max_{x \in [0, 1]} \| P(\alpha(x)) - Q(\beta(x)) \|,$$

where $\| \cdot \|$ denotes the Euclidean distance.
The classic approach to computing $d_F(P, Q)$ uses the free-space diagram FSD(P, Q). It is defined as

$$\mathrm{FSD}(P, Q) := \{ (x, y) \in [0, n]^2 \mid \| P(x) - Q(y) \| \le 1 \}.$$

In other words, FSD(P, Q) is the subset of the joint parameter space for $P$ and $Q$ where the corresponding points on the curves have distance at most 1, see Fig. 1.
The structure of FSD(P, Q) is easy to describe. Let $R := [0, n] \times [0, n]$ be the ground set. We subdivide $R$ into $n^2$ cells $C(i, j) := [i, i + 1] \times [j, j + 1]$, for $i, j = 0, \dots, n - 1$. The cell $C(i, j)$ corresponds to the edge pair $e_{i+1}$ and $f_{j+1}$, where $e_{i+1}$ is the $(i + 1)$th edge of $P$ and $f_{j+1}$ is the $(j + 1)$th edge of $Q$. Then the set $F(i, j) := \mathrm{FSD}(P, Q) \cap C(i, j)$ represents all pairs of points on $e_{i+1} \times f_{j+1}$ with distance at most 1. Elementary geometry shows that $F(i, j)$ is the intersection of $C(i, j)$ with an ellipse [8]. In particular, the set $F(i, j)$ is convex, and the intersection of FSD(P, Q) with the boundary of $C(i, j)$ consists of four (possibly empty) intervals, one on each side of $\partial C(i, j)$. We call these intervals the doors of $C(i, j)$ in FSD(P, Q). A door is said to be closed if the interval is empty, and open otherwise.
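The "elementary geometry" behind the ellipse can be spelled out in a few lines. The following derivation is only a sketch; the vectors $u$, $v$, $w$ are shorthand introduced here and do not appear elsewhere in the text.

```latex
% Shorthand (introduced for this sketch only):
%   u = p_{i+1} - p_i,   v = q_{j+1} - q_j,   w = p_i - q_j.
\begin{align*}
P(i+\lambda) - Q(j+\mu) &= w + \lambda u - \mu v,
  \qquad \lambda, \mu \in [0,1], \\
\| P(i+\lambda) - Q(j+\mu) \|^2 \le 1
  &\iff \lambda^2 \|u\|^2 - 2\lambda\mu \langle u, v \rangle + \mu^2 \|v\|^2
        + 2\lambda \langle u, w \rangle - 2\mu \langle v, w \rangle
        + \|w\|^2 - 1 \le 0 .
\end{align*}
% The quadratic part has the symmetric matrix
\[
M = \begin{pmatrix} \|u\|^2 & -\langle u, v \rangle \\
                    -\langle u, v \rangle & \|v\|^2 \end{pmatrix},
\qquad
\det M = \|u\|^2 \|v\|^2 - \langle u, v \rangle^2 \ge 0
\quad \text{(Cauchy--Schwarz)} .
\]
```

If the two edges are not parallel, then $\det M > 0$ and the solution set in the $(\lambda, \mu)$-plane is a (possibly empty) filled ellipse. If they are parallel, the set degenerates to a strip between two parallel lines. In both cases it is convex, which is all that the subsequent arguments use.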
A path $\pi$ in FSD(P, Q) is bimonotone if it is both x- and y-monotone, i.e., every vertical and every horizontal line intersects $\pi$ in at most one connected component. Alt and Godau observed that it suffices to decide whether there exists a bimonotone path from $(0, 0)$ to $(n, n)$ inside FSD(P, Q). We define the reachable region reach(P, Q) as the set of points in FSD(P, Q) that are reachable from $(0, 0)$ on a bimonotone path. Then, $d_F(P, Q) \le 1$ if and only if $(n, n) \in$ reach(P, Q), see Fig. 1. It is not necessary to compute all of reach(P, Q): since FSD(P, Q) is convex inside each cell, we only need the intersections reach(P, Q) $\cap\, \partial C(i, j)$. The sets defined by reach(P, Q) $\cap\, \partial C(i, j)$ are subintervals of the doors of the free-space diagram, and they are defined by endpoints of doors in the free-space diagram in the same row or column. We call the intersection of a door with reach(P, Q) a reach-door. The reach-doors can be found in $O(n^2)$ time through a simple breadth-first traversal of the cells [8]. In the next sections, we show how to obtain the crucial information, i.e., whether $(n, n) \in$ reach(P, Q), in $o(n^2)$ time instead.
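For orientation, the quadratic-time propagation of reach-doors can be sketched as follows. This is a minimal illustration in the spirit of the classic decision procedure, not the subquadratic algorithm developed below. The function names and the threshold parameter `eps` (fixed to 1 in the text) are our own choices, curves are given as lists of coordinate pairs, and the cells are visited column by column rather than breadth-first.

```python
def free_interval(p, a, b, eps=1.0):
    """Parameters t in [0, 1] with |a + t*(b - a) - p| <= eps, as (lo, hi);
    None if there are none (the door is closed)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    fx, fy = a[0] - p[0], a[1] - p[1]
    qa = dx * dx + dy * dy
    qb = 2.0 * (fx * dx + fy * dy)
    qc = fx * fx + fy * fy - eps * eps
    if qa == 0.0:                       # degenerate edge: a single point
        return (0.0, 1.0) if qc <= 0.0 else None
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return None
    root = disc ** 0.5
    lo = max(0.0, (-qb - root) / (2.0 * qa))
    hi = min(1.0, (-qb + root) / (2.0 * qa))
    return (lo, hi) if lo <= hi else None


def propagate(door, cross, same):
    """Reach-door on an outgoing side of a cell.
    door:  the full door on that side;
    cross: reach-door on the adjacent incoming side (bottom for the right door);
    same:  reach-door on the opposite incoming side (left for the right door)."""
    if door is None:
        return None
    if cross is not None:               # convexity: the whole door is reachable
        return door
    if same is None:
        return None
    lo = max(door[0], same[0])          # monotonicity: we cannot move backwards
    return (lo, door[1]) if lo <= door[1] else None


def frechet_at_most(P, Q, eps=1.0):
    """Decide whether the Frechet distance of P and Q is at most eps."""
    n, m = len(P) - 1, len(Q) - 1       # numbers of edges
    if n < 1 or m < 1:
        raise ValueError("each curve needs at least one edge")

    def close(p, q):
        ex, ey = p[0] - q[0], p[1] - q[1]
        return ex * ex + ey * ey <= eps * eps

    if not (close(P[0], Q[0]) and close(P[n], Q[m])):
        return False
    # L[i][j]: door on the left side of cell (i, j); B[i][j]: door on its bottom.
    L = [[free_interval(P[i], Q[j], Q[j + 1], eps) for j in range(m)]
         for i in range(n + 1)]
    B = [[free_interval(Q[j], P[i], P[i + 1], eps) for j in range(m + 1)]
         for i in range(n)]
    LR = [[None] * m for _ in range(n + 1)]      # reach-doors, left sides
    BR = [[None] * (m + 1) for _ in range(n)]    # reach-doors, bottom sides
    ok = True
    for j in range(m):                  # left boundary of the diagram
        ok = ok and close(P[0], Q[j])
        if ok and L[0][j] is not None:
            LR[0][j] = (0.0, L[0][j][1])
    ok = True
    for i in range(n):                  # bottom boundary of the diagram
        ok = ok and close(P[i], Q[0])
        if ok and B[i][0] is not None:
            BR[i][0] = (0.0, B[i][0][1])
    for i in range(n):
        for j in range(m):
            LR[i + 1][j] = propagate(L[i + 1][j], BR[i][j], LR[i][j])
            BR[i][j + 1] = propagate(B[i][j + 1], LR[i][j], BR[i][j])
    # The corner (n, m) is free, so it is reachable iff some reach-door
    # of the last cell ending in that corner is nonempty.
    return LR[n][m - 1] is not None or BR[n - 1][m] is not None
```

For instance, with `P = [(0, 0), (4, 0)]` and `Q = [(0, 0), (3, 0), (1, 0), (4, 0)]` the two curves cover the same point set, yet the sketch reports that the threshold 0.9 is too small while 1.1 suffices, because the walk along `P` cannot follow the detour of `Q` backwards.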

Basic Approach and Intuition
In our algorithm for the decision problem, we basically want to compute reach(P, Q). But instead of propagating the reachability information cell by cell, we always group $\tau \times \tau$ cells (with $1 \ll \tau \ll n$) into an elementary box of cells. When processing a box, we can assume that we know which parts of the left and the bottom boundary of the box are reachable. That is, we know the reach-doors on the bottom and left boundary, and we need to compute the reach-doors on the top and right boundary of the elementary box. These reach-doors are determined by the combinatorial structure of the box. More specifically, suppose we know for every row and column the order of the door endpoints (including for the reach-doors on the left and bottom boundary). Then, we can deduce which of these door boundaries determine the reach-doors on the top and right boundary. We call the sequence of these orders the (full) signature of the box.
The total number of possible signatures is bounded by an expression in terms of $\tau$. Thus, if we pick $\tau$ sufficiently small compared to $n$, we can pre-compute for all possible signatures the reach-doors on the top and right boundary, and build a data structure to query these quickly (Sect. 3). Since the reach-doors on the bottom and left boundary are required to make the signature, we initially have only incomplete signatures. In Sect. 4, we describe how to compute these efficiently. The incomplete signatures are then used to preprocess the data structure such that we can quickly find the full signature once we know the reach-doors of an elementary box. After building and preprocessing the data structure, it is possible to determine $d_F(P, Q) \le 1$ efficiently by traversing the free-space diagram elementary box by elementary box, as explained in Sect. 5.

Preprocessing an Elementary Box
Before it considers the input, our algorithm builds a lookup table. As mentioned above, the purpose of this table is to speed up the computation of small parts of the free-space diagram.
Let $\tau \in \mathbb{N}$ be a parameter. The elementary box is a subdivision of $[0, \tau]^2$ into $\tau$ columns and rows, thus $\tau^2$ cells. For $i, j = 0, \dots, \tau - 1$, we write $D(i, j)$ for the cell in column $i$ and row $j$. We denote the left side of the boundary $\partial D(i, j)$ by $l(i, j)$ and the bottom side by $b(i, j)$. Note that $l(i, j)$ coincides with the right side of $\partial D(i - 1, j)$ and $b(i, j)$ with the top of $\partial D(i, j - 1)$. Thus, we write $l(\tau, j)$ for the right side of $D(\tau - 1, j)$ and $b(i, \tau)$ for the top side of $D(i, \tau - 1)$. Figure 2 shows the elementary box.
The door-order $\sigma^r_j$ for a row $j$ is a permutation of $\{s_0, t_0, \dots, s_\tau, t_\tau\}$, having $2\tau + 2$ elements. For $i = 1, \dots, \tau$, the element $s_i$ represents the lower endpoint of the door on $l(i, j)$, and $t_i$ represents the upper endpoint. The elements $s_0$ and $t_0$ are an exception: they describe the reach-door on the boundary $l(0, j)$ (i.e., its intersection with reach(P, Q)). The door-order $\sigma^r_j$ represents the combinatorial order of these endpoints, as projected onto a vertical line, i.e., they are sorted into their vertical order. Some door-orders may encode the same combinatorial structure. In particular, when door $i$ is closed, the exact position of $s_i$ and $t_i$ in a door-order is irrelevant, as long as $t_i$ comes before $s_i$. For a closed door $i$ ($i > 0$), we assign $s_i$ to the upper endpoint of $l(i, j)$ and $t_i$ to the lower endpoint. The values of $s_0$ and $t_0$ are defined by the reach-door, and their relative order is thus a result of computation. We break ties between $s_i$ and $t_i$ by placing $s_i$ before $t_i$, and any other ties are resolved by index. A door-order $\sigma^c_i$ is defined analogously for a column $i$. We write $x <^c_i y$ if $x$ comes before $y$ in $\sigma^c_i$, and $x <^r_j y$ if $x$ comes before $y$ in $\sigma^r_j$. An incomplete door-order is a door-order in which $s_0$ and $t_0$ are omitted (i.e., the intersection of reach(P, Q) with the door is still unknown); see Fig. 3.

Fig. 3 The door-order of a row (the vertical order of the points) encodes the combinatorial structure of the doors. The door-order for the row in the figure is $s_1 s_3 s_4 t_5 t_3 t_0 s_2 t_4 s_0 s_5 t_1 t_2$. Note that $s_0$ and $t_0$ represent the reach-door, which is empty in this case. These are omitted in the incomplete door-order.

We can now define the (full) signature of the elementary box as the aggregation of the door-orders of its rows and columns. Therefore, a signature $\Sigma = (\sigma^c_1, \dots, \sigma^c_\tau, \sigma^r_1, \dots, \sigma^r_\tau)$ consists of $2\tau$ door-orders: one door-order $\sigma^c_i$ for each column $i$ and one door-order $\sigma^r_j$ for each row $j$ of the elementary box. Similarly, an incomplete signature is the aggregation of incomplete door-orders.
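The tie-breaking rules for a door-order can be illustrated by a small sketch. It assumes, purely for illustration, that each door of a row is given as a pair of heights `(s, t)` normalized to the interval from 0 to 1 within the row, that `None` stands for a closed door, and that the reach-door at index 0 is supplied as an explicit pair. The function name `door_order` and the string labels are our own.

```python
def door_order(doors):
    """Return the door-order of one row as a list of labels.

    doors[i] is a pair (s_i, t_i) of heights, or None for a closed door.
    A closed door gets s_i at the upper end (1.0) and t_i at the lower
    end (0.0), so that t_i precedes s_i.
    """
    events = []
    for i, door in enumerate(doors):
        s, t = (1.0, 0.0) if door is None else door
        # Sort key: height first, then index, then s before t.
        events.append((s, i, 0, f"s{i}"))
        events.append((t, i, 1, f"t{i}"))
    events.sort()
    return [label for _, _, _, label in events]


def incomplete_door_order(doors):
    """The same order with the reach-door labels s0 and t0 omitted."""
    return [x for x in door_order(doors) if x not in ("s0", "t0")]
```

For example, `door_order([(0.2, 0.6), None, (0.5, 0.5)])` places `t1` first and `s1` last, since door 1 is closed, and puts `s2` directly before `t2` because their heights coincide.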
For a given signature, we define the combinatorial reachability structure of the elementary box as follows. For each column $i$ and for each row $j$, the combinatorial reachability structure indicates which door boundaries in the respective column or row define the reach-door of $b(i, \tau)$ or $l(\tau, j)$.

Proof We use dynamic programming, very similar to the algorithm by Alt and Godau [8]. For each vertical edge $l(i, j)$ we define a variable $l(i, j)$, and for each horizontal edge $b(i, j)$ we define a variable $b(i, j)$. The $l(i, j)$ are pairs of the form $(s_u, t_v)$, representing the reach-door reach(P, Q) $\cap\, l(i, j)$. If this reach-door is closed, then $t_v <^r_j s_u$ holds. If the reach-door is open, then it is bounded by the lower endpoint of the door on $l(u, j)$ and by the upper endpoint of the door on $l(v, j)$. (Note that in this case we have $v = i$.) Once again $s_0$ and $t_0$ are special and represent the reach-door on $l(0, j)$. The variables $b(i, j)$ are defined analogously.

Fig. 4 The three cases for the recursive definition of $l(i, j)$. If the lower boundary is reachable, we can reach the whole right door (left). If neither the lower nor the left boundary is reachable, the right door is not reachable either (middle). Otherwise, the lower boundary is the maximum of $l(i - 1, j)$ and the lower boundary of the right door (right).

Now we can compute $l(i, j)$ and $b(i, j)$ recursively as follows: first, we set $l(0, j)$ and $b(i, 0)$ to the pair $(s_0, t_0)$ of the respective row or column, i.e., to the given reach-doors on the left and bottom boundary of the box. Next, we describe how to find $l(i, j)$ given $l(i - 1, j)$ and $b(i - 1, j)$, see Fig. 4. If $b(i - 1, j)$ is open, the whole door on $l(i, j)$ is reachable from below, and we set $l(i, j) := (s_i, t_i)$. If both $b(i - 1, j)$ and $l(i - 1, j)$ are closed, then $l(i, j)$ is not reachable either, and its reach-door is closed. Finally, if $b(i - 1, j)$ is closed and $l(i - 1, j)$ is open, we may be able to reach $l(i, j)$ via $l(i - 1, j)$. Let $s_u$ be the lower endpoint of $l(i - 1, j)$. We need to pass $l(i, j)$ above $s_u$ and $s_i$ and below $t_i$, and therefore set $l(i, j) := (\max(s_u, s_i), t_i)$, where the maximum is taken according to the order $<^r_j$. The recursion for the variable $b(i, j)$ is defined similarly. We can implement the recursion in time $O(\tau^2)$ for any given signature, for example by traversing the elementary box column by column, while processing each column from bottom to top.
There are at most $((2\tau + 2)!)^{2\tau} = \tau^{O(\tau^2)}$ distinct signatures for the elementary box. We choose $\tau = \lambda \sqrt{\log n / \log \log n}$ for a sufficiently small constant $\lambda > 0$, so that this number becomes $o(n)$. Thus, during the preprocessing stage we have time to enumerate all possible signatures and determine the corresponding combinatorial reachability structure inside the elementary box. This information is then stored in an appropriate data structure.
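The claim that this choice of $\tau$ makes the number of signatures $o(n)$ can be checked with a short calculation. The following is a sketch with logarithms to base 2; the constant $c$ is introduced here only for bookkeeping.

```latex
\begin{align*}
\log\bigl( ((2\tau+2)!)^{2\tau} \bigr)
  &= 2\tau \log\bigl((2\tau+2)!\bigr)
   \le 2\tau (2\tau+2) \log(2\tau+2)
   \le c\, \tau^2 \log \tau
   && \text{for a constant } c > 0 \text{ and } \tau \ge 2, \\
\tau^2 \log \tau
  &\le \lambda^2 \frac{\log n}{\log\log n} \cdot \tfrac{1}{2} \log\log n
   = \tfrac{1}{2} \lambda^2 \log n
   && \text{since } \tau \le \sqrt{\log n} \text{ for } \lambda \le 1, \\
((2\tau+2)!)^{2\tau}
  &\le 2^{(c/2) \lambda^2 \log n} = n^{c \lambda^2 / 2} = o(n)
   && \text{whenever } c \lambda^2 < 2 .
\end{align*}
```

So any $\lambda$ with $c\lambda^2 < 2$ is "sufficiently small" in the sense used above.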

Building the Data Structure
Before we describe this data structure, we first explain how the door-orders are represented. This depends on the computational model. By our choice of $\tau$, there are $o(n)$ distinct door-orders. On the word RAM, we represent each door-order and incomplete door-order by an integer between 1 and $(2\tau)!$. This fits into a word of $\log n$ bits. On the pointer machine, we create a record for each door-order and incomplete door-order; we represent an order by a pointer to the corresponding record.
The data structure has two stages. In the first stage, we assume we know the incomplete door-order for each row and for each column of the elementary box, and we wish to determine the incomplete signature. In the second stage we have obtained the reach-doors for the left and bottom sides of the elementary box, and we are looking for the full signature. The details of our method depend on the computational model. One way uses table lookup and requires the word RAM; the other way works on the pointer machine, but is a bit more involved.

Word RAM
We organize the lookup table as a large tree $T$. In the first stage, each level of $T$ corresponds to a row or column of the elementary box. Thus, there are $2\tau$ levels. Each node has $(2\tau)!$ children, representing the possible incomplete door-orders for the next row or column. Since we represent door-orders by positive integers, each node of $T$ may store an array for its children; we can choose the appropriate child for a given incomplete door-order in constant time. Thus, determining the incomplete signature for an elementary box requires $O(\tau)$ steps on a word RAM.
For the second stage, we again use a tree structure. Now the tree has $O(\tau)$ layers, each with $O(\log \tau)$ levels. Again, each layer corresponds to a row or column of the elementary box. The levels inside each layer then implement a balanced binary search tree that allows us to locate the endpoints of the reach-door within the incomplete signature. Since there are $2\tau$ endpoints, this requires $O(\log \tau)$ levels. Thus, it takes $O(\tau \log \tau)$ time to find the full signature of a given elementary box.
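One possible way to turn an order into an array index is the lexicographic rank of a permutation. The sketch below is only an illustration of that idea; the text does not prescribe a particular encoding, the function name `permutation_rank` is our own, and the rank is 0-based here (add 1 to obtain a positive integer).

```python
def permutation_rank(perm):
    """Lexicographic rank of perm, a permutation of 0..m-1, in range(m!)."""
    m = len(perm)
    remaining = list(range(m))          # elements not yet used, ascending
    rank = 0
    for k, x in enumerate(perm):
        pos = remaining.index(x)        # number of smaller unused elements
        rank = rank * (m - k) + pos     # mixed-radix (factorial) digits
        remaining.pop(pos)
    return rank
```

A node of the tree could then keep its children in a plain list and follow `children[permutation_rank(order)]` in one step once the rank is known. This sketch computes the rank in quadratic time in the length of the permutation; it is not meant to reflect the cost model of the word RAM.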

Pointer machine
Unlike in the word RAM model, we are not allowed to store a lookup table on every level of the tree T , and there is no way to quickly find the appropriate child for a given door-order. Instead, we must rely on batch processing to achieve a reasonable running time.
Thus, suppose that during the first stage we want to find the incomplete signatures for a set B of m elementary boxes, where again for each box in B we know the incomplete door-order for each row and each column. Recall that we represent the door-order by a pointer to the corresponding record. With each such record, we store a queue of elementary boxes that is empty initially.
We now simultaneously propagate the boxes in $B$ through $T$, proceeding level by level. In the first level, all of $B$ is assigned to the root of $T$. Then, we go through the nodes of one level of $T$, from left to right. Let $v$ be the current node of $T$. We consider each elementary box $b$ assigned to $v$. We determine the next incomplete door-order for $b$, and we append $b$ to the queue for this incomplete door-order. The queue is addressed through the corresponding record, so all elementary boxes with the same next incomplete door-order end up in the same queue. Next, we go through the nodes of the next level, again from left to right. Let $v'$ be the current node. The node $v'$ corresponds to a next incomplete door-order $\sigma$ that extends the known signature of its parent. We consider the queue stored at the record for $\sigma$. By construction, the elementary boxes that should be assigned to $v'$ appear consecutively at the beginning of this queue. We remove these boxes from the queue and assign them to $v'$. After this, all the queues are empty, and we can continue by propagating the boxes to the next level. During this procedure, we traverse each node of $T$ a constant number of times, and in each level of $T$ we consider all the boxes in $B$. Since $T$ has $o(n)$ nodes, the total running time is $O(n + m\tau)$.
For the second stage, the data structure works just as in the word RAM case, because no table lookup is necessary. Again, we need $O(\tau \log \tau)$ steps to process one box. After the second stage, we obtain the combinatorial reachability structure of the box in constant time since we precomputed this information for each box (Lemma 3.1). Thus, we have shown the following lemma, independently of the computational model.
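The level-by-level propagation through queues can be sketched as follows. This is a toy illustration under simplifying assumptions: Python objects stand in for pointer-machine records, the tree is built explicitly over a tiny set of records, and the class and function names (`Record`, `Node`, `build_tree`, `propagate_boxes`) are our own.

```python
from collections import deque


class Record:
    """One record per (incomplete) door-order; it owns a queue of boxes."""
    def __init__(self, name):
        self.name = name
        self.queue = deque()


class Node:
    def __init__(self):
        self.children = []   # (record, child) pairs in a fixed order
        self.boxes = []      # boxes currently assigned to this node


def build_tree(records, depth):
    """Complete tree with one child per record at every inner node."""
    root = Node()
    level = [root]
    for _ in range(depth):
        nxt = []
        for v in level:
            for r in records:
                child = Node()
                v.children.append((r, child))
                nxt.append(child)
        level = nxt
    return root


def propagate_boxes(root, boxes, depth):
    """boxes: sequences of `depth` records (the door-orders of one box).
    Returns the leaves from left to right, each holding its boxes."""
    root.boxes = list(boxes)
    level = [root]
    for d in range(depth):
        # Pass 1: every box joins the queue of its next door-order.
        for v in level:
            for b in v.boxes:
                b[d].queue.append((v, b))
        # Pass 2: each child takes the boxes of its parent from the
        # front of the queue; they appear there consecutively.
        nxt = []
        for v in level:
            for r, child in v.children:
                while r.queue and r.queue[0][0] is v:
                    child.boxes.append(r.queue.popleft()[1])
                nxt.append(child)
        level = nxt
    return level
```

No node ever indexes into an array of children by a computed number. The only addressing happens through the pointer from a box to the record of its next door-order, which is the restriction the pointer machine imposes.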

Preprocessing a Given Input
Next, we perform a second preprocessing phase that considers the input curves $P$ and $Q$. Our eventual goal is to compute the intersection of reach(P, Q) with the cell boundaries, taking advantage of the data structure from Sect. 3. For this, we aggregate the cells of FSD(P, Q) into (concrete) elementary boxes consisting of $\tau \times \tau$ cells. There are $n^2/\tau^2$ such boxes. We may avoid rounding issues by either duplicating vertices or handling a small part of FSD(P, Q) without lookup tables. The goal is to determine the signature for each elementary box $S$. At this point, this is not quite possible yet, since the signature depends on the intersection of reach(P, Q) with the lower and left boundary of $S$. Nonetheless, we can find the incomplete signature, in which the positions of $s_0$, $t_0$ (the reach-door) in the (incomplete) door-orders $\sigma^r_i$, $\sigma^c_j$ are still to be determined. We aggregate the columns of FSD(P, Q) into vertical strips, each corresponding to a single column of elementary boxes (i.e., $\tau$ consecutive columns of cells in FSD(P, Q)). See Fig. 5.
Let $A$ be such a strip. It corresponds to a subcurve $P'$ of $P$ with $\tau$ edges. The following lemma implies that we can build a data structure for $A$ such that, given any edge of $Q$, we can efficiently determine the incomplete door-order of the corresponding row in $A$.

Lemma 4.1 Given a subcurve $P'$ with $\tau$ edges, we can compute in $O(\tau^6)$ time a data structure that requires $O(\tau^6)$ space and that allows us to determine the incomplete door-order of any line segment on $Q$ in time $O(\log \tau)$.
Proof Consider the arrangement $A$ of unit circles whose centers are the vertices of $P'$ (see Fig. 6). The incomplete door-order of a line segment $s$ is determined by the intersections of $s$ with the arcs of $A$ (and, for a circle not intersecting $s$, by whether $s$ lies inside or outside of the circle). Let $\ell_s$ be the line spanned by the line segment $s$. Suppose we wiggle $\ell_s$. The order of intersections of $\ell_s$ and the arcs of $A$ changes only when $\ell_s$ moves over a vertex of $A$ or if $\ell_s$ leaves or enters a circle. We use the standard duality transform that maps a line $\ell : y = ax + b$ to the point $\ell^* : (a, -b)$, and vice versa. Consider a unit circle $C$ in $A$ with center $(c_x, c_y)$. Elementary geometry shows that the set of all lines that are tangent to $C$ from above dualizes to the curve $t^*_a(C) : y = c_x x - c_y - \sqrt{1 + x^2}$. Similarly, the lines that are tangent to $C$ from below dualize to the curve $t^*_b(C) : y = c_x x - c_y + \sqrt{1 + x^2}$. Let $C^*$ be the set of these curves for all circles in $A$. Since any pair of distinct circles $C_1$, $C_2$ has at most four common tangents, one for each choice of above/below $C_1$ and above/below $C_2$, it follows that any two curves in $C^*$ intersect at most once.
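The dual curves of the tangent lines follow from the point-line distance formula. The derivation below is a sketch; it reads "tangent from above" as "the circle lies below the line", which is our interpretation of the phrase.

```latex
% A line \ell : y = ax + b touches the unit circle with center (c_x, c_y)
% iff the center has distance exactly 1 from \ell:
\[
\frac{| a c_x - c_y + b |}{\sqrt{1 + a^2}} = 1 .
\]
% Tangent from above (center below the line, i.e. a c_x + b > c_y):
\[
b = c_y - a c_x + \sqrt{1 + a^2}
\quad\Longrightarrow\quad
\ell^* = (a, -b) = \bigl( a,\; c_x a - c_y - \sqrt{1 + a^2} \bigr).
\]
% Tangent from below (center above the line):
\[
b = c_y - a c_x - \sqrt{1 + a^2}
\quad\Longrightarrow\quad
\ell^* = (a, -b) = \bigl( a,\; c_x a - c_y + \sqrt{1 + a^2} \bigr).
\]
% Writing (x, y) for the dual coordinates gives the two curves
\[
y = c_x x - c_y - \sqrt{1 + x^2}
\qquad\text{and}\qquad
y = c_x x - c_y + \sqrt{1 + x^2}.
\]
```

Two curves of the same type differ by a linear function of $x$, so for circles with different $c_x$ they cross exactly once, in line with the counting of common tangents.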
Let $V$ be the set of vertices in $A$, and let $V^*$ be the lines dual to the points in $V$ (note that $|V| = O(\tau^2)$). Since for any vertex $v \in V$ and any circle $C \in A$ there are at most two tangents through $v$ on $C$, each line in $V^*$ intersects each curve in $C^*$ at most once. Thus, the arrangement $B$ of the curves in $V^* \cup C^*$ is an arrangement of pseudolines with complexity $O(\tau^4)$. Furthermore, it can be constructed in the same expected time, together with a point location structure that finds the containing cell in $B$ of any given point in time $O(\log \tau)$ [60, Chap. 6.6.1].
Now consider a line segment $s$ and the supporting line $\ell_s$. As observed in the first paragraph, the combinatorial structure of the intersection between $\ell_s$ and $A$ is completely determined by the cell of $B$ that contains the dual point $\ell_s^*$. Thus, for every cell $f(s) \in B$, we construct a list $L_{f(s)}$ that represents the combinatorial structure of $\ell_s \cap A$. We can compute $L_{f(s)}$ by traversing the zone of $\ell_s$ in $A$. Since circles intersect at most twice and since a line intersects any circle at most twice, the zone has complexity $O(\tau 2^{\alpha(\tau)})$, where $\alpha(\cdot)$ denotes the inverse Ackermann function [60, Thm. 5.11]. Given the list $L_{f(s)}$, the incomplete door-order of $s$ is determined by the position of the endpoints of $s$ in $L_{f(s)}$. There are $O(\tau^2)$ possible ways for this, and we build a table $T_{f(s)}$ that represents them. For each entry in $T_{f(s)}$, we store a representative for the corresponding incomplete door-order. As described in the previous section, the representative is a positive integer in the word RAM model and a pointer to the appropriate record on a pointer machine.
The total size of the data structure is $O(\tau^6)$ and it can be constructed in the same time. A query works as follows: given $s$, we can compute

Proof By building and using the data structure from Lemma 4.1, we determine the incomplete door-order for each row in each vertical $\tau$-strip in total time proportional to

$$\frac{n}{\tau} \left( \tau^6 + n \log \tau \right) = n \tau^5 + \frac{n^2 \log \tau}{\tau}.$$

We repeat the procedure with the horizontal strips. Now we know for each elementary box in FSD(P, Q) the incomplete door-order for each row and each column. We use the data structure of Lemma 3.2 to combine these. As there are $n^2/\tau^2$ boxes, the number of steps is $O(n^2/\tau + n) = O(n^2/\tau)$. Hence, the incomplete signature for each elementary box is found in $O(n\tau^5 + n^2(\log \tau)/\tau)$ steps.

Solving the Decision Problem
With the data structures and preprocessing from the previous sections, we have all ingredients in place to determine whether $d_F(P, Q) \le 1$. We know for each elementary box its incomplete signature and we have a data structure to derive its full signature (and with it, the combinatorial reachability structure) when its reach-doors are known. What remains to be shown is that we can efficiently process the free-space diagram to determine whether $(n, n) \in$ reach(P, Q). This is captured in the following lemma.

Proof We go through all elementary boxes of FSD(P, Q), processing them one column at a time, going from bottom to top in each column. Initially, we know the full signature for the box $S$ in the lower left corner of FSD(P, Q). We use the signature to determine the intersections of reach(P, Q) with the upper and right boundary of $S$. There is a subtlety here: the signature gives us only the combinatorial reachability structure, and we need to map the resulting $s_i$, $t_j$ back to the corresponding vertices on the curves. On the word RAM, this can be done easily through table lookups. On the pointer machine, we use representative records for the $s_i$, $t_i$ elements and use $O(\tau)$ time before processing the box to store a pointer from each representative record to the appropriate vertices on $P$ and $Q$.
We proceed similarly for the other boxes. By the choice of the processing order of the elementary boxes we always know the incoming reach-doors on the bottom and left boundary when processing a box. Given the incoming reach-doors, we can determine the full signature and find the structure of the outgoing reach-doors in total time $O(\tau \log \tau)$, using Lemma 3.2. Again, we need $O(\tau)$ additional time on the pointer machine to establish the mapping from the abstract $s_i$, $t_i$ elements to the concrete vertices of $P$ and $Q$. In total, we spend $O(\tau \log \tau)$ time per box. Thus, it takes time $O(n^2(\log \tau)/\tau)$ to process all boxes, as claimed.
As a result, we obtain the following theorem for the pointer machine (and, by extension, for the real RAM model). For the word RAM model, we can obtain an even faster algorithm (see Sect. 6).

Theorem 5.2
There is an algorithm that solves the decision version of the Fréchet problem in O(n 2 (log log n) 3/2 / √ log n) time on a pointer machine.
Proof Set $\tau = \lambda \sqrt{\log n / \log\log n}$, for a sufficiently small constant λ > 0. The theorem follows by applying Lemmas 3.2, 4.2, and 5.1 in sequence.

to a constant factor). However, we change a number of things. "Signatures" are represented differently and the data structure to obtain combinatorial reachability structures is changed accordingly. Furthermore, we aggregate elementary boxes into clusters and determine "incomplete door-orders" for multiple boxes at the same time. Finally, we walk the free-space diagram based on the clusters to decide d F (P, Q) ≤ 1.

Clusters and Extended Signatures
We introduce a second level of aggregation in the free-space diagram (see Fig. 7): a cluster is a collection of $\tau \times \tau$ elementary boxes, that is, $\tau^2 \times \tau^2$ cells in FSD(P, Q). Let R be a row of cells in FSD(P, Q) of a certain cluster. As before, the row R corresponds to an edge e on Q and a subcurve $P'$ of P with $\tau^2$ edges. We associate with R an ordered set $Z = \langle e_0, z'_0, z_1, z'_1, z_2, z'_2, \dots, z_k, z'_k, e_1 \rangle$ with $2k + 3$ elements. Here k is the number of intersections of e with the unit circles centered at the $\tau^2$ vertices of $P'$ (all but the very first). Hence, k is bounded by $2\tau^2$ and $|Z|$ is bounded by $4\tau^2 + 3$. The order of Z indicates the order of these intersections with e directed along Q. Elements $e_0$ and $e_1$ represent the endpoints of e and take a special role. In particular, these are used to represent closed doors and snap open doors to the edge e. The elements $z'_i$ are placeholders for the positions of the endpoints of the reach-doors: $z'_0$ represents a possible reach-door endpoint between $e_0$ and $z_1$, the element $z'_1$ represents an endpoint between $z_1$ and $z_2$, etc.

Consider a row $R'$ of an elementary box inside the row R of a cluster, corresponding to an edge e of Q. The door-index of $R'$ is an ordered set $\langle s_0, t_0, \dots, s_\tau, t_\tau \rangle$ of size $2\tau + 2$. Similar to a door-order, elements $s_0$ and $t_0$ represent the reach-door at the leftmost boundary of $R'$; the elements $s_i$ and $t_i$ ($1 \le i \le \tau$) represent the door at the right boundary of the ith cell in $R'$. However, instead of rearranging the set to indicate relative positions, the elements $s_i$ and $t_i$ simply refer to elements in Z. If the door is open, they refer to the corresponding intersections with e (possibly snapped to $e_0$ or $e_1$). If the door is closed, $s_i$ is set to $e_1$ and $t_i$ is set to $e_0$. The elements $s_0$ and $t_0$ are special, representing the reach-door, and they refer to one of the elements $z'_i$. An incomplete door-index is a door-index without $s_0$ and $t_0$.
The advantage of a door-index over a door-order is that the reach-door is always at the start. Hence, completing

Preprocessing a Given Input
During the preprocessing for a given input P, Q, we use superstrips consisting of τ strips. That is, a superstrip is a column of clusters and consists of $\tau^2$ columns of the free-space diagram. Lemma 4.1 still holds, albeit with a larger constant c in place of 6. The data structure gets as input a query edge e, and it returns in O(log τ) time a word that contains τ fields. Each field represents the incomplete door-index for e in the corresponding elementary box and thus consists of O(τ log τ) bits. Hence, the word size is $O(\tau^2 \log \tau) = O(\log n)$ by our choice of τ. Thus, the total time for building a data structure for each superstrip and for processing all rows is $O((n/\tau^2)(\tau^c + n \log \tau)) = O(n^2 (\log \tau)/\tau^2)$.

We now have parts of the incomplete indexed signature for each elementary box packed into different words. To obtain the incomplete indexed signature, we need to rearrange the information such that the incomplete door-indices of the rows in one elementary box are in a single word. This corresponds to computing a transpose of a matrix, as is illustrated in Fig. 8. For this, we need the following lemma, which can be found, in slightly different form, in Thorup [62, Lem. 9].

Lemma 6.1 Let X be a sequence of τ words that contain τ fields each, so that X can be interpreted as a τ × τ matrix. Then we can compute in time O(τ log τ) on a word RAM a sequence Y of τ words with τ fields each that represents the transpose of X.
Proof The algorithm is recursive and solves a more general problem: let X be a sequence of a words that represents a sequence M of b different a × a matrices, such that the ith word in X contains the fields of the ith row of each matrix in M from left to right. Compute a sequence of words Y that represents the sequence of the transposed matrices of M.
The recursion works as follows: if a = 1, there is nothing to be done. Otherwise, we split X into the sequence X 1 of the first a/2 words and the sequence X 2 of the remaining words. X 1 and X 2 now each represent a sequence of 2b matrices of size (a/2) × (a/2), which we transpose recursively. After the recursion, we put the (a/2) × (a/2) submatrices back together in the obvious way. To finish, we need to exchange the off-diagonal submatrices. This can be done simultaneously for all matrices in time O(a), by using appropriate bit-operations (or table lookup). Since every level of the recursion handles a words in total and there are O(log a) levels, the running time is O(a log a), as desired.
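The recursive structure of this proof can be illustrated on ordinary Python lists. The sketch below is only an illustration of the quadrant recursion: it assumes the side length is a power of two, handles a single matrix, and does not model the packed words or the bit-parallel exchange step that give the word RAM bound. The function name `transpose_blockwise` is hypothetical.

```python
def transpose_blockwise(rows):
    """Transpose an a x a matrix (a a power of two) by recursive quadrant splitting.

    Illustration only: plain lists stand in for packed words, so this does
    not achieve the O(a log a) word RAM running time of the lemma.
    """
    a = len(rows)
    if a == 1:
        return [list(rows[0])]
    h = a // 2
    top, bottom = rows[:h], rows[h:]
    # Recursively transpose the four (a/2) x (a/2) quadrants.
    tl = transpose_blockwise([r[:h] for r in top])
    tr = transpose_blockwise([r[h:] for r in top])
    bl = transpose_blockwise([r[:h] for r in bottom])
    br = transpose_blockwise([r[h:] for r in bottom])
    # Reassemble with the off-diagonal quadrants exchanged:
    # [[A, B], [C, D]]^T = [[A^T, C^T], [B^T, D^T]].
    upper = [x + y for x, y in zip(tl, bl)]
    lower = [x + y for x, y in zip(tr, br)]
    return upper + lower
```

In the packed setting, the exchange of the off-diagonal quadrants is the step that is carried out for all matrices at once with word operations.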
By applying the lemma to the words that represent τ consecutive rows in a superstrip, we obtain the incomplete door-indices of the rows for each elementary box. Since there are $n/\tau^2$ superstrips, each with $n/\tau$ groups of τ consecutive rows, this takes total time proportional to $(n/\tau^2) \cdot (n/\tau) \cdot \tau \log \tau = n^2 (\log \tau)/\tau^2$.

We repeat this procedure for the horizontal superstrips. By using an appropriate lookup table to combine the incomplete door-indices of the rows and columns, we obtain the incomplete indexed signature for each elementary box in total time $O(n^2 (\log \tau)/\tau^2)$.

The Actual Computation
We traverse the free-space diagram cluster by cluster (recall that a cluster consists of τ × τ elementary boxes). The clusters are processed column by column from left to right, and inside each column from bottom to top. Before processing a cluster, we walk along the left and lower boundary of the cluster to determine the incoming reach-doors. This is done by performing a binary search for each box on the boundary, and determining the appropriate elements z i which correspond to the incoming reach-doors. Using this information, we assemble the appropriate words that represent the incoming information for each elementary box. Since there are $n^2/\tau^4$ clusters, this step requires time $O((n^2/\tau^4)\,\tau^2 \log \tau) = O(n^2 (\log \tau)/\tau^2)$. We then process the elementary boxes inside the cluster, in a similar fashion. Now, however, we can process each elementary box in constant time through a single table lookup, so the total time is $O(n^2/\tau^2)$. Hence, the total running time of our algorithm is $O(n^2 (\log \tau)/\tau^2)$. By our choice of $\tau = \lambda \sqrt{\log n / \log\log n}$ for a sufficiently small λ > 0, we obtain the following theorem.
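As a worked check of this last substitution, the following short derivation plugs the stated choice of τ into the bound. It treats λ as a constant and uses only quantities from the text; the simplification itself is our own arithmetic.

```latex
\begin{align*}
\tau^2 &= \lambda^2 \, \frac{\log n}{\log\log n},
\qquad
\log \tau = \Theta(\log\log n), \\
\frac{n^2 \log \tau}{\tau^2}
&= \Theta\!\left( n^2 \log\log n \cdot \frac{\log\log n}{\log n} \right)
 = \Theta\!\left( \frac{n^2 (\log\log n)^2}{\log n} \right).
\end{align*}
```

Multiplying this by the logarithmic factor incurred in Sect. 7 is consistent with the word RAM bound stated in Theorem 7.2.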

Computing the Fréchet Distance
The optimization version of the Fréchet problem, i.e., computing the Fréchet distance, can be done in O(n 2 log n) time using parametric search with the decision version as a subroutine [8]. We showed that the decision problem can be solved in o(n 2 ) time. However, this does not directly yield a faster algorithm for the optimization problem: if the running time of the decision problem is T (n), parametric search gives an O((T (n) + n 2 ) log n) time algorithm [8]. There is an alternative randomized algorithm by Har-Peled and Raichel [49]. Their algorithm also needs O((T (n) + n 2 ) log n) time, but below we adapt it to obtain the following lemma.

If we also include vertex-vertex-edge tuples with no intersection, we can sample a critical value uniformly at random in constant time. The algorithm now works as follows (see Har-Peled and Raichel [49] for more details): first, we sample a set S of K = 4n 2 critical values uniformly at random. Next, we find a, b ∈ S such that the Fréchet distance lies between a and b and such that [a, b] contains no other value from S. In the original algorithm this is done by sorting S and performing a binary search using the decision version. Using median-finding instead, this step can be done in O(K + T (n) log K ) time. Alternatively, the running time of this step could be reduced by picking a smaller K. However, this does not improve the final bound, since it is dominated by an $O(n^2 2^{\alpha(n)})$ term. The interval [a, b] with high probability contains only a small number of the remaining critical values. More precisely, for K = 4n 2 the probability that [a, b] has more than 2cn ln n critical values is at most 1/n c [49, Lem. 6.2].
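The median-finding step can be sketched as follows. This is a minimal illustration, not the authors' implementation: `decide` is a hypothetical stand-in for the decision procedure and is assumed to be monotone (false below the true distance, true from it onward), and randomized quickselect is used for the median, so the O(K) bound for the selection work holds only in expectation here.

```python
import random

def select(values, k):
    """Return the k-th smallest element (0-based) by randomized quickselect."""
    vals = list(values)
    while True:
        pivot = random.choice(vals)
        lows = [v for v in vals if v < pivot]
        highs = [v for v in vals if v > pivot]
        n_equal = len(vals) - len(lows) - len(highs)
        if k < len(lows):
            vals = lows
        elif k < len(lows) + n_equal:
            return pivot
        else:
            k -= len(lows) + n_equal
            vals = highs

def bracket(samples, decide):
    """Find consecutive sample values (lo, hi] that enclose the optimum.

    decide(d) is assumed to return True exactly when the optimum is <= d.
    Each round halves the candidate set, so decide is called O(log K) times
    and the total selection and filtering work forms a geometric series.
    """
    lo, hi = float("-inf"), float("inf")
    candidates = list(samples)
    while candidates:
        m = select(candidates, len(candidates) // 2)
        if decide(m):
            hi = m
            candidates = [x for x in candidates if x < m]
        else:
            lo = m
            candidates = [x for x in candidates if x > m]
    return lo, hi
```

For example, with samples `[0.5, 1.0, 2.0, 3.5, 4.0, 7.0]` and an oracle for a true value of 3.0, `bracket` returns `(2.0, 3.5)`.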
The remainder of the algorithm proceeds as follows: first, we find all critical values of type vertex-vertex and vertex-edge that lie inside the interval For this, take an edge e of P and the vertices of Q. Conceptually, we start with circles of radius a around the vertices of Q, and we increase the radii until b. During this process, we observe the evolution of the intersection points between the circle arcs and e. Because all vertex-vertex and vertex-edge events have been eliminated, each circle intersects e in either 0 or 2 points, and this does not change throughout the process. A critical value of vertex-vertex-edge type corresponds to the event that two different circles intersect e in the same point, i.e., that two intersection points meet while growing the circles. Two intersection points can meet at most once, and when they do, they exchange their order along e.
This suggests the following algorithm: let A a be the arrangement of circles with radius a around the vertices of Q, and let A b be the concentric arrangement of circles with radius b. We determine the ordered sequence I a of the intersection points of the circles in A a with e, and we number them in their order along e. Next, we find the ordered sequence of intersection points I b between e and the circles in A b . We assign to each point in I b the number of the corresponding intersection point in I a .
Since |I a | = |I b |, this gives a permutation of {1, . . . , |I a |}. Two intersection points change their order from I a to I b exactly if there is a vertex-vertex-edge event in [a, b], so these events correspond to the inversions of the resulting permutation. Given that there are k such inversions, we can find them in time O(|I a | + k) using insertion sort. Thus, the overall running time to find the critical events in [a, b], ignoring the time for computing I a and I b , is O(n 2 + K ).
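The inversion-reporting step can be illustrated with a few lines of Python. This is a minimal sketch: the permutation is given as a list of labels, and the function name `report_inversions` is hypothetical. Every shift in the inner loop corresponds to exactly one inversion, which is why the running time is linear in the input length plus the number of inversions.

```python
def report_inversions(perm):
    """Return all inversions of perm as (smaller, larger) label pairs.

    Insertion sort moves an element left once for every inversion it is
    part of, so the total work is O(len(perm) + number of inversions).
    """
    a = list(perm)
    pairs = []
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            pairs.append((x, a[j]))  # x and a[j] appear in swapped order
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x
    return pairs
```

In the setting above, each reported pair would identify two intersection points whose order along e changes between radius a and radius b.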
It remains to show that we can quickly find I a and I b . We describe the algorithm for I a . First, compute the arrangement A a of circles with radius a around the vertices of Q. This takes O(n 2 ) time [32]. To find the intersection order, traverse in A a the zone of the line spanned by e. The time for the traversal is bounded by the complexity of the zone. Since the circles pairwise intersect at most twice and the line intersects each circle at most twice, the complexity of the zone is $O(n 2^{\alpha(n)})$ [60, Thm. 5.11]. Summing over all edges e, this adds a total of $O(n^2 2^{\alpha(n)})$ to the running time. To find I b , we proceed similarly with A b . Thus the overall time is $O(T(n) \log n + n^2 2^{\alpha(n)} + K)$. The event K > 8n ln n has probability less than 1/n 4 , and we always have K = O(n 3 ). Thus, this case adds o(1) to the expected running time. Given K ≤ 8n ln n, the running time is O(n log n). Lemma 7.1 follows. Theorem 7.2 now results from Lemma 7.1, Theorem 5.2, and Theorem 6.2.

Theorem 7.2 The Fréchet distance of two polygonal curves with n edges each can be computed by a randomized algorithm in time $O(n^2 \sqrt{\log n}\,(\log\log n)^{3/2})$ on a pointer machine and in time $O(n^2 (\log\log n)^2)$ on a word RAM.

Discrete Fréchet Distance on the Pointer Machine
As mentioned in the introduction, Agarwal et al. [1] give a subquadratic algorithm for finding the discrete Fréchet distance between two point sequences, using the word RAM. In this section, we explain how their algorithm for the decision version of the problem can be adapted to the pointer machine. This shows that, at least for the decision version, the speed-up does not come from bit-manipulation tricks but from a deeper understanding of the underlying geometric structure. Our presentation is slightly different from Agarwal et al. [1], in order to allow for a clearer comparison with our continuous algorithm.
We recall the problem definition: we are given two sequences P = p 1 , p 2 , . . . , p n and Q = q 1 , q 2 , . . . , q n of n points in the plane. For δ > 0, we define a directed graph G δ with vertex set P × Q. In G δ , there is an edge between two vertices ( p i , q j ), ( p i , q j+1 ) if and only if both d( p i , q j ) ≤ δ and d( p i , q j+1 ) ≤ δ. The condition is similar for an edge between vertices ( p i , q j ) and ( p i+1 , q j ), and vertices ( p i , q j ) and ( p i+1 , q j+1 ). There are no further edges in G δ . The discrete Fréchet distance between P and Q is the smallest δ for which G δ has a path from ( p 1 , q 1 ) to ( p n , q n ). In the decision version of the problem, we are given δ > 0, and we need to decide whether there is a path from ( p 1 , q 1 ) to ( p n , q n ) in G δ .

[Figure: The discrete Fréchet distance: (left) two point sequences P (disks) and Q (crosses) with 5 points each; (middle) the associated free-space matrix F (white = 1, gray = 0); (right) the resulting reachability matrix M. Since M 55 = 1, the discrete Fréchet distance is at most 1.]
We now describe a subquadratic pointer machine algorithm for the decision version. Thus, let point sequences P, Q be given, and suppose without loss of generality that δ = 1. The discrete analogue of the free-space diagram is an n × n Boolean matrix F where F i j = 1, if d( p i , q j ) ≤ 1, and F i j = 0, otherwise, for i, j = 1, . . . , n. We call F the free-space matrix. Similarly, the discrete analogue of the reachable region is an n × n Boolean matrix M that is defined recursively as follows: $M_{11} = F_{11}$, and for $(i, j) \neq (1, 1)$ we have $M_{ij} = 1$ if and only if $F_{ij} = 1$ and at least one of $M_{i-1,j}$, $M_{i,j-1}$, $M_{i-1,j-1}$ equals 1 (entries with an index 0 count as 0).

Adapting the method of Agarwal et al. [1], we show how to use preprocessing and table lookup in order to decide whether M nn = 1 in o(n 2 ) steps on a pointer machine. Let τ = λ log n, for a suitable constant λ > 0. We subdivide the rows of M into k = O(n/τ ) strips, each consisting of τ consecutive rows: the first strip L 1 consists of rows 1, . . . , τ , the second strip L 2 consists of rows τ + 1, . . . , 2τ , and so on. Each strip L i , i = 1, . . . , k, corresponds to a contiguous subsequence P i of τ points on P. Let A i be the arrangement of disks obtained by drawing a unit disk around each vertex in P i . The arrangement A i has O(τ 2 ) faces.
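For reference, the free-space matrix F and the reachability matrix M can be computed directly by the straightforward quadratic dynamic program that the subquadratic algorithm improves upon. The sketch below is this baseline only; the function name `discrete_frechet_at_most` and the `delta` parameter are our own, and points are assumed to be coordinate tuples.

```python
import math

def discrete_frechet_at_most(P, Q, delta=1.0):
    """Decide whether the discrete Frechet distance of P and Q is <= delta.

    Baseline O(|P| * |Q|) dynamic program over the free-space matrix.
    """
    n, m = len(P), len(Q)
    # free[i][j] is True iff the pair (p_i, q_j) is within distance delta.
    free = [[math.dist(p, q) <= delta for q in Q] for p in P]
    reach = [[False] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if not free[i][j]:
                continue
            if i == 0 and j == 0:
                reach[i][j] = True
            else:
                # A cell is reachable via a step in P, in Q, or in both.
                reach[i][j] = (
                    (i > 0 and reach[i - 1][j])
                    or (j > 0 and reach[i][j - 1])
                    or (i > 0 and j > 0 and reach[i - 1][j - 1])
                )
    return reach[n - 1][m - 1]
```

The strips and elementary boxes introduced next replace the cell-by-cell inner loops with table lookups on precomputed signatures.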
Next, let ρ = λ log n/ log log n, with λ > 0 as above. We subdivide each strip L i , i = 1, . . . , k into l = O(n/ρ) elementary boxes, each consisting of ρ consecutive columns in L i . We label the elementary boxes as B i j , for i = 1, . . . , k and j = 1, . . . , l. As above, an elementary box B i j has corresponding contiguous subsequences P i of τ vertices on P and Q j of ρ vertices on Q. Now, the incomplete signature of an elementary box B i j consists of (i) the index i of the strip that contains it; and (ii) the sequence f 1 , f 2 , . . . , f ρ of faces in the disk arrangement A i that contain the ρ vertices of Q j , in that order. The full signature of an elementary box B i j consists of its incomplete signature plus a sequence of ρ + τ bits that represent the entries in the reach matrix M directly above and to the left of B i j . We call these bits the reach bits.

For each incomplete signature (i, f 1 , . . . , f ρ ), we build a lookup table that encodes for each possible setting of the reach bits the resulting reach bits at the bottom and the right boundary of the elementary box. There are $2^{\rho+\tau} \le n^{1/3}$ possible settings of the reach bits, by our choice of τ and ρ and for λ small enough. We enumerate all of them and organize them as a complete binary tree of depth ρ + τ . For each setting of the reach bits, we use the information of the incomplete signature to determine the result through a straightforward dynamic programming algorithm [1,18,39] in $O(\tau \cdot \rho) = O(\log^2 n)$ time, and we store the result as a linked list of length ρ + τ − 1 at the leaf for the corresponding reach bits; see Fig. 11. Thus, the total time for this part of the preprocessing phase is $O(n^{5/3} \log n)$.
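The table for a single box can be sketched as follows. Several choices here are assumptions made for illustration: the box is described directly by its block of the free-space matrix rather than by a face sequence, the incoming bits include an explicit corner bit (the text counts ρ + τ bits), and a Python dictionary replaces the binary tree with linked lists that the pointer machine requires. The name `build_reach_table` is hypothetical.

```python
from itertools import product

def build_reach_table(free):
    """Map every setting of the incoming reach bits of a box to its outgoing bits.

    free is the 0/1 block of the free-space matrix covered by the box.
    A key is (corner, bits above the box..., bits left of the box...);
    a value is (bits of the bottom row, bits of the rightmost column).
    """
    rows, cols = len(free), len(free[0])
    table = {}
    for bits in product((0, 1), repeat=1 + cols + rows):
        corner, above, left = bits[0], bits[1:1 + cols], bits[1 + cols:]
        # Pad the box with one row above and one column to the left.
        M = [[0] * (cols + 1) for _ in range(rows + 1)]
        M[0] = [corner, *above]
        for i in range(rows):
            M[i + 1][0] = left[i]
        for i in range(1, rows + 1):
            for j in range(1, cols + 1):
                reachable = M[i - 1][j] or M[i][j - 1] or M[i - 1][j - 1]
                M[i][j] = int(bool(free[i - 1][j - 1] and reachable))
        bottom = tuple(M[rows][1:])
        right = tuple(M[i][cols] for i in range(1, rows + 1))
        table[bits] = (bottom, right)
    return table
```

Once such a table exists for every incomplete signature, processing a box during the traversal reduces to one lookup with the current reach bits.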
Next, we determine for each elementary box B i j its incomplete signature. For this, we use the point location structure for A i to determine for each vertex in Q j the face of A i that contains it. There are $O(n^2/(\tau\rho))$ elementary boxes, each Q j has ρ vertices, and one point location query takes O(log τ ) time, so the total time for this step is $O((n^2/(\tau\rho)) \cdot \rho \cdot \log \tau) = O((n^2/\log n) \log\log n)$. Using this information, we can store with each elementary box a pointer to the lookup table for the corresponding incomplete signature.

Remark Agarwal et al. [1] further describe how to get a faster algorithm for the decision version by aggregating the elementary boxes into larger clusters, similar to the method given in Sect. 6. This improved algorithm finally leads to a subquadratic algorithm for computing the discrete Fréchet distance. Unfortunately, as in Sect. 6, it seems that this improvement crucially relies on constant time table lookup, so it does not directly translate to the pointer machine.
The reader may also notice that in this section we could choose τ, ρ ≈ log n, whereas in the previous sections we had $\tau \approx \sqrt{\log n}$. This is due to the slightly different definition of signature: in the discrete case, once the subsequence P i is fixed, there are only $\tau^{O(\rho)}$ possible ways how the subsequence Q j might interact with P i . In the continuous case, this does not seem to be so clear, and we work with the weaker bound of $\tau^{O(\tau^2)}$ possible interactions.

Decision Trees
Our results also have implications for the decision-tree complexity of the Fréchet problem. Since in that model we account only for comparisons between input elements, the preprocessing comes for free, and hence the size of the elementary boxes can be increased. Before we consider the continuous Fréchet problem, we first note that a similar result can be obtained easily for the discrete Fréchet problem.

Theorem 9.1 The discrete Fréchet problem has an algebraic computation tree of depth O(n 4/3 ).
Proof First, we consider the decision version: we are given two sequences P = p 1 , . . . , p n and Q = q 1 , . . . , q n of n points in the plane, and we would like to decide whether the discrete Fréchet distance between P and Q is at most 1. Katz and Sharir [53] showed that we can compute a representation of the set of pairs ( p i , q j ) with $\|p_i - q_j\| \le 1$ in O(n 4/3 ) steps. This information suffices to complete the reachability matrix without further comparisons. As shown by Agarwal et al. [1], one can then solve the optimization problem at the cost of another O(log n)-factor, which is absorbed into the O-notation.
Given our results above, we prove an analogous statement for the continuous Fréchet distance.

Theorem 9.2 There exists an algebraic decision tree for the Fréchet problem (decision version) of depth O(n 2−ε ), for a fixed constant ε > 0.
Proof We reconsider the steps of our algorithm. The only phases that actually involve the input are the second preprocessing phase and the traversal of the elementary boxes. The reason for our choice of τ was to keep the time for the first preprocessing phase small. This is no longer a problem. By Lemmas 4.2 and 5.1, the remaining cost is bounded by $O(n\tau^5 + n^2 (\log \tau)/\tau)$. Choosing $\tau = n^{1/6}$, we get a decision tree of depth $n \cdot n^{5/6} + n^{2-1/6} \log n$. This is $O(n^{2-1/6} \log n) = O(n^{2-\varepsilon})$, for any fixed $0 < \varepsilon < 1/6$.
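The choice of τ in this proof balances the two terms of the cost bound. The short derivation below makes the balancing explicit; it ignores the logarithmic factor when equating the terms, which is our simplification.

```latex
\begin{align*}
n\tau^5 = \frac{n^2}{\tau}
\;&\Longleftrightarrow\; \tau^6 = n
\;\Longleftrightarrow\; \tau = n^{1/6}, \\
\left. n\tau^5 + \frac{n^2 \log \tau}{\tau} \,\right|_{\tau = n^{1/6}}
&= n^{11/6} + \tfrac{1}{6}\, n^{11/6} \log n
 = O\!\left(n^{2 - 1/6} \log n\right).
\end{align*}
```

Any larger τ makes the first term dominate, and any smaller τ makes the second term dominate, so the exponent 2 − 1/6 is the best this particular trade-off yields.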

Weak Fréchet Distance
The weak Fréchet distance is a variant of the Fréchet distance where we are allowed to walk backwards along the curves [8]. More precisely, let P and Q be two polygonal curves, each with n edges, and let Φ be the set of all continuous functions α : [0, 1] → [0, n] with α(0) = 0 and α(1) = n. The weak Fréchet distance between P and Q is defined as
$$d_{wF}(P, Q) = \inf_{\alpha, \beta \in \Phi} \; \max_{x \in [0,1]} \| P(\alpha(x)) - Q(\beta(x)) \|.$$
Compared to the regular Fréchet distance, the set Φ now also contains non-monotone functions. The weak Fréchet distance was also introduced by Alt and Godau [8], who showed how to compute it in O(n 2 log n) worst-case time. We will now use our framework to obtain an algorithm that runs in o(n 2 ) expected time on a word RAM.

A Decision Algorithm for the Pointer Machine
As usual, we start with the decision version: given two polygonal curves P and Q, each with n edges, decide whether d wF (P, Q) ≤ 1. This has an easy interpretation in

Let τ, ρ ∈ N be parameters, to be determined later. We subdivide the cells into k = O(n/τ ) vertical strips L 1 , . . . , L k , each consisting of τ consecutive columns. Each strip L i corresponds to a subcurve P i of P with τ edges. For each such subcurve P i , we define two arrangements A i and B i . To obtain A i , we take for each edge e of P i the "stadium" $c_e$ of points with distance exactly 1 from e, and we compute the resulting arrangement. Since two distinct curves $c_e$, $c_{e'}$ cross in O(1) points, the complexity of A i is $O(\tau^2)$; see Fig. 13. The arrangement B i is the arrangement B described in the proof of Lemma 4.1, i.e., the arrangement of the curves dual to the tangent lines for the unit circles around the vertices of P i .
Next, we subdivide each strip into ℓ = O(n/ρ) elementary boxes, each consisting of ρ consecutive rows. We label the elementary boxes as B i j , with 1 ≤ i ≤ k, 1 ≤ j ≤ ℓ. The rows of an elementary box B i j correspond to a subcurve Q j of Q with ρ edges. The signature of B i j consists of (i) the index i of the corresponding strip; (ii) for each vertex of Q j the face of A i that contains it; and (iii) for each edge e of Q j the face of B i that contains the point that is dual to the supporting line of e, plus two indices a, b ∈ {1, . . . , τ } that indicate the first and the last unit circle around a vertex of P i that e intersects, as we walk from one endpoint to another.
Given an elementary box B, the connection graph G B of B has τρ vertices, one for each cell in B, and an edge between two cells $C$, $C'$ of B if and only if $C$ and $C'$

Finally, given the signature, we can build the connection graph G B i j in O(τρ) time, assuming that the arrangements A i and B i provide suitable data structures. With G B i j at hand, the connection list can be found in O(τρ) steps, using breadth first search.
As usual, our strategy now is to preprocess all possible signatures and to determine the signature of each elementary box. Using this information, we can then process the elementary boxes quickly in our main algorithm. The next lemma describes the preprocessing steps. With the information from the preprocessing phase, we can easily solve the decision problem with a union-find data structure. The following theorem summarizes our algorithm for the decision problem.

Next, we group the elementary boxes into clusters. A cluster consists of log n vertically adjacent elementary boxes from a single strip. The first set of clusters comes from the bottommost log n elementary boxes, the second set of clusters from the following log n elementary boxes, etc. The boundary and the connectivity list of a cluster are defined analogously as for an elementary box. Below, in Lemma 10.6, we show that we can compute the (pointer-based) connectivity list of a cluster in time $O((\tau + \rho)(\log\log n)^4)$. Then, the lemma follows: there are $O(n^2/(\tau\rho \log n))$ clusters, so the total time to find connectivity lists for all clusters is $O((n^2/(\rho \log n))(\log\log n)^4 + (n^2/(\tau \log n))(\log\log n)^4) = O((n^2/\log^2 n)(\log\log n)^5)$.

Proof We adapt usual techniques for packed integer sorting [5]. We precompute a merge-operation that receives two packed words, each with their entries sorted according to a given field, and returns two packed words that represent the sorted sequence of all entries. We also precompute an operation that receives one packed word and returns a packed word with the same entries, sorted according to a given field. Then, we can perform a merge sort in O(μ log μ) time, because in each level of the recursion, the total time for merging is O(μ). Once we are down to a single word, we can sort its entries in one step. More details can be found in the literature on packed sorting [5,27].

Lemma 10.2 We can determine for each elementary box B i j a pointer to its connectivity list in total time O(nτ
Now we describe the strategy of the main algorithm. As explained above, a cluster consists of log n vertically adjacent elementary boxes. We number the boxes $B_1, \dots, B_{\log n}$, from bottom to top. For $i = 1, \dots, \log n$, we denote by $G_i$ the connection graph for the boxes $B_1, \dots, B_i$. That is, $G_i$ is obtained by taking the union $\bigcup_{j=1}^{i} G_{B_j}$ of the individual connection graphs and adding edges for adjacent cells in neighboring boxes that share an open door. Our algorithm proceeds in two phases. In the first phase, we propagate the connectivity information upwards. That is, for $i = 1, \dots, \log n$, we compute a sequence $W_i$ of $O((\tau + \rho) \log\log n / \log n)$ packed words. Each entry in $W_i$ corresponds to a cell C on the boundary of $B_i$. The first field stores a unique identifier for the cell, and the second field stores an identifier of the connected component of C in $G_i$. In the second phase, we go in the reverse direction. For $i = \log n, \dots, 1$, we compute a second sequence of $O((\tau + \rho) \log\log n / \log n)$ packed words that store for each cell C on the boundary of $B_i$ an identifier of the connected component of C in $G_{\log n}$. Once this information is available, the (pointer-based) connectivity list for the cluster boundary can be extracted in O(τ + ρ) time, using appropriate precomputed operations on packed words, see Fig. 14.
We begin with the first phase. The upper boundary of an elementary box are the τ cells in the topmost row of the box, the lower boundary are the τ cells in the bottommost row. From the preprocessing phase, we have a pointer to the packed connectivity lists for all elementary boxes B 1 , . . . , B log n . We make local copies of   / log n)) log log n) words whose entries represent the edges of H , where each entry stores the component identifiers of the two endpoints of the edge. We are now ready to implement the algorithm of Hirschberg, Chandra, and Sarwate. The main steps of the algorithm are as follows, see Fig. 15.
Step 1: Find for each vertex of H the neighbor with the smallest and with the largest identifier. This can be done in $O((\tau/\log n)(\log\log n)^2)$ time by sorting the edge lists twice, once in lexicographic order and once in reverse lexicographic order of the identifiers of the endpoints. From these sorted lists, we can extract the desired information in the claimed time, using appropriate word operations.
Step 2: Let $V_H$ be the vertices of H with at least one neighbor, and let $V'_H \subseteq V_H$ be the vertices $v \in V_H$ having a neighbor with a smaller identifier than the identifier of v. If $|V'_H| \ge |V_H|/2$, we set the successor of each $v \in V'_H$ to the neighbor of v with the smallest identifier. Otherwise, at least half of the nodes in $V_H$ have a neighbor with a larger identifier. In this case, we let $V'_H$ be the set of these nodes, and we set the successor of each $v \in V'_H$ to the neighbor of v with the largest identifier. The successor relation defines a directed forest F on $V_H$ such that at least half of the vertices in $V_H$ are not a root in F. Given the information available from Step 1 and appropriate word operations, this step can be carried out in $O((\tau/\log n) \log\log n)$ time.
Step 3: Use pointer jumping to determine for each vertex $v \in V_H$ the identifier of the root of the tree in F that contains v. For this, we set the successor of each $v \in V_H$ that does not yet have a successor to v itself. Then, for $\log |V_H| = O(\log\log n)$ rounds, we set simultaneously for each $v \in V_H$ the new successor of v to the old successor of the old successor of v (pointer jumping). Each round at least halves the distance of v to its root in F, so it takes $O(\log |V_H|)$ rounds until each vertex in $V_H$ has found the root of its tree in F. Each round can be implemented in $O((\tau/\log n)(\log\log n)^2)$ time by sorting the vertices according to their successors. Thus, this step takes $O((\tau/\log n)(\log\log n)^3)$ time in total.
Step 4: Contract each tree of F into a single vertex whose identifier is the smallest identifier in the tree. Maintain for the original vertices of H a list that gives the identifier of the contracted node that represents it. Again, this step can be carried out in $O((\tau/\log n)(\log\log n)^2)$ time using sorting.
After Steps 1-4, the number of non-singleton components in H has at least halved. Thus, by repeating the steps $O(\log |V_H|) = O(\log\log n)$ times, we can identify the connected components of H. The total time of the algorithm is $O((\tau/\log n)(\log\log n)^4)$, as claimed.
Given the connected components of H, we can find the desired sequence $W_{i+1}$ for the boundary of $B_{i+1}$. Indeed, the procedure from Claim 10.8 outputs a sequence of $O((\tau/\log n) \log\log n)$ packed words that gives for each vertex in H an identifier of the component in H that contains it. We can use this list as a lookup table to update the identifiers of the components in the connectivity list of $B_{i+1}$. This takes $O(((\tau + \rho)/\log n)(\log\log n)^2)$ time, using sorting.
In summary, since we consider log n elementary boxes, the total time for the first phase is $O((\tau + \rho)(\log\log n)^4)$. The second phase is much easier. For i = log n , . . . , 2, we propagate the connectivity information from B i+1 to B i . For this, we need to update the identifiers of the connected components for the cells on the upper boundary of B i using the identifiers of the connected components on the lower boundary of B i+1 , and then adjust the connectivity list B i+1 with these new indices. Again, this takes $O(((\tau + \rho)/\log n)(\log\log n)^2)$ time, using sorting.
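Steps 1-4 can be simulated sequentially to see how the hooking and pointer-jumping rounds interact. The sketch below is an illustration under clear assumptions: vertices are the integers 0 to n−1, dictionaries replace the packed words, and nothing here reproduces the word-level parallelism behind the stated time bounds. The function name `connected_components` is hypothetical.

```python
def connected_components(n, edges):
    """Label each vertex 0..n-1 with the smallest vertex of its component.

    Sequential simulation of the hooking / pointer-jumping / contraction
    rounds described in Steps 1-4; dictionaries stand in for packed words.
    """
    label = list(range(n))
    edges = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    while edges:
        # Step 1: smallest and largest neighbor of every non-isolated vertex.
        smallest, largest = {}, {}
        for u, v in edges:
            for a, b in ((u, v), (v, u)):
                smallest[a] = min(smallest.get(a, b), b)
                largest[a] = max(largest.get(a, b), b)
        verts = list(smallest)
        # Step 2: hook at least half of the vertices onto a neighbor.
        down = [v for v in verts if smallest[v] < v]
        if 2 * len(down) >= len(verts):
            succ = {v: smallest[v] for v in down}
        else:
            succ = {v: largest[v] for v in verts if largest[v] > v}
        for v in verts:
            succ.setdefault(v, v)  # vertices without a successor are roots
        # Step 3: pointer jumping; all successors are updated simultaneously.
        for _ in range(len(verts).bit_length()):
            succ = {v: succ[succ[v]] for v in verts}
        # Step 4: contract every tree to its smallest identifier.
        rep = {}
        for v in verts:
            rep[succ[v]] = min(rep.get(succ[v], v), v)
        new_id = {v: rep[succ[v]] for v in verts}
        label = [new_id.get(x, x) for x in label]
        edges = {(min(new_id[u], new_id[v]), max(new_id[u], new_id[v]))
                 for u, v in edges if new_id[u] != new_id[v]}
    return label
```

Because successors always have strictly smaller (or, in the other case, strictly larger) identifiers, the successor relation cannot contain a cycle, which is what makes the pointer-jumping loop terminate at the roots.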

Computing the Weak Fréchet Distance
To actually compute the weak Fréchet distance, we use a simplified version of the procedure from Sect. 7. In particular, for the weak Fréchet distance, there are only critical values of the type vertex-vertex and vertex-edge, i.e., there are only O(n 2 ) critical values. However, we aim for a subquadratic running time, so we need to perform the sampling procedure in a slightly different way.

Theorem 10.9
Suppose we can answer the decision problem for the weak Fréchet distance in time T (n), for input curves P and Q with n edges each. Then, we can compute the weak Fréchet distance of P and Q in expected time O(n 3/2 log c n + T (n) log n), for some fixed constant c > 0.
Proof First, we sample a set S of K = 6n 1/2 critical values uniformly at random. Then, we find a, b such that the weak Fréchet distance lies between a and b and such that the interval [a, b] contains no other element from S. This takes O(K +T (n) log n) time, using median finding.
Similarly to Har-Peled and Raichel [49, Lem. 6.2], we see that the probability, that the interval [a, b] has more than 2γ n 3/2 ln n critical values is at most 1/n γ . Indeed, there are at most n 2 vertex-vertex and at most 2n 2 vertex-edge events. Thus, the total number of critical values is at most L ≤ 3n 2 . Let U + be the next γ n 3/2 ln n larger critical values after d wF (P, Q). Then, the probability that S contains no value from U + is Analogously, the probability that S contains none of the next γ n 3/2 ln n smaller critical values is also at most 1/2n γ , so the claim follows. For γ > 0 large enough, the contribution of this event to the expected running time is negligible.
Next, we find all critical values in the interval $[a, b]$. For this, we must determine all vertex-vertex and vertex-edge pairs with distance in $[a, b]$. We report for every vertex $p$ of $P$ or $Q$ the set of vertices of the other curve that lie in the annulus with radii $a$ and $b$ around $p$. Furthermore, we report for every edge $e$ of $P$ or $Q$ the set of vertices of the other curve that lie in the stadium with radii $a$ and $b$. This can be done efficiently with a range-searching structure for semi-algebraic sets. Agarwal, Matoušek and Sharir [3] show that we can preprocess the vertices of $P$ and the vertices of $Q$ into a data structure that can answer our desired range reporting queries in time $O(n^{1/2} \log^c n + k)$, where $k$ is the output size and $c > 0$ is some fixed constant. The expected preprocessing time is $O(n^{1+\varepsilon})$, where $\varepsilon > 0$ can be made arbitrarily small. We perform $O(n)$ queries, and the total expected output size of our queries is $O(n^{3/2} \ln n)$, so it takes expected time $O(n^{3/2} \log^c n)$ to find the critical values in $[a, b]$. Finally, we perform a binary search on these critical values to compute the weak Fréchet distance. This takes $O(T(n) \log n)$ time.
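The three steps of the proof (sample, bracket, search) can be sketched as follows. This is only an illustration under simplifying assumptions and not the algorithm of the theorem: the critical values are given as an explicit list (the proof avoids enumerating them), sorting the sample stands in for median finding, and a linear scan stands in for the range-searching structure. The names `first_true`, `optimum_by_sampling` and the oracle `decide` are hypothetical; `decide(d)` is assumed to be monotone and to accept at least one critical value.

```python
import random

def first_true(sorted_vals, decide):
    """Index of the first value accepted by the monotone oracle (len if none)."""
    lo, hi = 0, len(sorted_vals)
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(sorted_vals[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo

def optimum_by_sampling(critical_values, decide, sample_size, rng=None):
    """Smallest critical value d for which decide(d) is True."""
    rng = rng or random.Random()
    # Step 1: sample K critical values uniformly at random (with replacement).
    sample = sorted(rng.choice(critical_values) for _ in range(sample_size))
    # Step 2: bracket the optimum between two neighbouring sample values.
    i = first_true(sample, decide)
    a = sample[i - 1] if i > 0 else float("-inf")
    b = sample[i] if i < len(sample) else float("inf")
    # Step 3: collect the critical values in (a, b]. Assumption: a linear
    # scan replaces the annulus and stadium range queries of the proof.
    candidates = sorted(v for v in critical_values if a < v <= b)
    # Step 4: binary search among the remaining candidates.
    return candidates[first_true(candidates, decide)]
```

With $K$ samples, the oracle is called $O(\log K)$ times in step 2 and $O(\log m)$ times in step 4, where $m$ is the number of candidates. The sketch does not reproduce the subquadratic bound, since it touches every critical value in step 3.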
The following theorem summarizes our results on the weak Fréchet distance.

Theorem 10.10
The weak Fréchet distance of two polygonal curves, each with $n$ edges, can be computed by a randomized algorithm in time $O(n^2 \alpha(n) \log\log n)$ on a pointer machine and in time $O((n^2/\log n)(\log\log n)^5)$ on a word RAM.

Conclusion
We have broken the long-standing quadratic upper bound for the decision version of the Fréchet problem. Moreover, we have shown that this problem has an algebraic decision tree of depth $O(n^{2-\varepsilon})$, for some $\varepsilon > 0$, where $n$ is the number of vertices of the polygonal curves. We have shown how our faster algorithm for the decision version can be used to obtain a faster algorithm for computing the Fréchet distance. If we allow constant-time table-lookup, we obtain a running time in close reach of $O(n^2)$. This leaves us with intriguing open research questions. Can we devise a quadratic or even a slightly subquadratic algorithm for the optimization version? Can we devise such an algorithm on the word RAM, that is, with constant-time table-lookup? What can be said about approximation algorithms?

… in polynomial time [58]. Therefore, we usually have only a restricted floor function at our disposal.

Word RAM
The word RAM is essentially a real RAM without support for real numbers. However, on a real RAM, the integers are usually treated as atomic, whereas the word RAM allows for powerful bit-manipulation tricks [42]. More precisely, the word RAM represents the data as a sequence of $w$-bit words, where $w = \Omega(\log n)$. Data can be accessed arbitrarily, and standard operations, such as Boolean operations (and, xor, shl, ...), addition, or multiplication take constant time. There are many variants of the word RAM, depending on precisely which instructions are supported in constant time [27,42]. The general consensus seems to be that any function in $AC^0$ is acceptable. However, it is always preferable to rely on a set of operations as small, and as non-exotic, as possible. Note that multiplication is not in $AC^0$ [43], but nevertheless is often included in the word RAM instruction set [42].
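To illustrate the kind of table lookup this model permits, the following sketch precomputes the answers for all 8-bit patterns and then processes a word chunk by chunk with shifts, masks, and array accesses. It is a generic example and is not taken from the algorithms of this paper; the names `CHUNK_BITS`, `TABLE` and `popcount` are hypothetical, and Python integers only simulate fixed-width machine words.

```python
CHUNK_BITS = 8  # assumed chunk size; the table has 2**CHUNK_BITS entries

# Preprocessing: the number of set bits for every possible chunk value.
TABLE = [bin(x).count("1") for x in range(1 << CHUNK_BITS)]

def popcount(word, word_bits=64):
    """Count the set bits of a word_bits-bit word using table lookups."""
    mask = (1 << CHUNK_BITS) - 1
    total = 0
    for shift in range(0, word_bits, CHUNK_BITS):
        # Shift and mask isolate one chunk; the lookup takes constant time.
        total += TABLE[(word >> shift) & mask]
    return total
```

For a fixed ratio between word size and chunk size, the loop performs a constant number of lookups per word. On a pointer machine, the indexed access `TABLE[...]` is not available.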

Pointer Machine
The pointer machine model disallows the use of constant-time table lookup, and is therefore a restriction of the (real) RAM model [59,61]. The data structure is modeled as a directed graph $G$ with bounded out-degree. Each node in $G$ represents a record, with a bounded number of pointers to other records and a bounded number of (real or integer) data items. The algorithm can access data only by following pointers from the inputs (and a bounded number of global entry records); random access is not possible. The data can be manipulated through the usual real RAM operations, but without support for the floor function, for reasons mentioned above.

Algebraic Computation Tree
Algebraic computation trees (ACTs) [11] are the computational geometry analogue of binary decision trees, and like these they are mainly used for proving lower bounds. Let $x_1, \dots, x_n \in \mathbb{R}$ be the inputs. An ACT is a binary tree with two kinds of nodes: computation and branch nodes. A computation node $v$ has one child and is labeled with an expression of the type $y_v = y_u \oplus y_w$, where $\oplus \in \{+, -, *, /, \sqrt{\cdot}\}$ is an operator and each of $y_u, y_w$ is either an input variable $x_1, \dots, x_n$ or corresponds to a computation node that is an ancestor of $v$. A branch node has degree 2 and is labeled by $y_u = 0$ or $y_u > 0$, where again $y_u$ is either an input or a variable for an ancestor. A family of algebraic computation trees $(T_n)_{n \in \mathbb{N}}$ solves a computational problem (like Delaunay triangulation or convex hull computation), if for each $n \in \mathbb{N}$, the tree $T_n$ accepts inputs of size $n$, and if for any such input $x_1, \dots, x_n$ the corresponding path in $T_n$ (where the children of the branch nodes are determined according to the conditions they represent) constitutes a computation that represents the answer in the variables $y_v$ encountered along the path.
Algebraic decision trees are defined as follows: we allow only branch nodes. Each branch node is labeled with a predicate of the form $p(x_1, \dots, x_n) = 0$ or $p(x_1, \dots, x_n) > 0$. The leaves are labeled yes or no. Fix some $r \in \{1, \dots, n\}$. If $p$ is restricted to be of the form $p(x_1, \dots, x_n) = \sum_{i=1}^{n} a_i x_i - b$, with at most $r$ coefficients $a_i \ne 0$, we call the decision tree $r$-linear. Erickson [40] showed that any 3-linear decision tree for 3SUM has depth $\Omega(n^2)$. However, Grønlund and Pettie showed that there is a 4-linear decision tree of depth $O(n^{3/2} \sqrt{\log n})$ for the problem. In geometric problems, linear predicates are often much too restrictive. For example, there is no $r$-linear decision tree for the Fréchet problem, no matter the choice of $r$: with $r$-linear decision trees, we cannot even decide whether two given points $p$ and $q$ have Euclidean distance at most 1.
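As a small illustration of the definition, the following sketch evaluates an ACT that decides whether two points $p$ and $q$ have Euclidean distance at most $\delta$, where $\delta \ge 0$ is treated as a fifth input. The class names `Leaf`, `Compute`, `Branch` and the functions `run` and `within_distance_tree` are hypothetical, and the sketch supports only the four binary operators (the square root is omitted, since comparing squared distances suffices here).

```python
import operator
from dataclasses import dataclass

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

@dataclass
class Leaf:
    answer: bool

@dataclass
class Compute:          # y_name = y_left (op) y_right, then continue at child
    name: str
    op: str
    left: str
    right: str
    child: object

@dataclass
class Branch:           # test is ">0" or "=0", applied to the variable var
    var: str
    test: str
    if_true: object
    if_false: object

def run(tree, inputs):
    """Follow the root-to-leaf path determined by the given inputs."""
    env, node = dict(inputs), tree
    while not isinstance(node, Leaf):
        if isinstance(node, Compute):
            env[node.name] = OPS[node.op](env[node.left], env[node.right])
            node = node.child
        else:
            holds = env[node.var] > 0 if node.test == ">0" else env[node.var] == 0
            node = node.if_true if holds else node.if_false
    return node.answer

def within_distance_tree():
    """ACT for |p - q| <= delta: tests t = |p - q|^2 - delta^2 > 0."""
    tree = Branch("t", ">0", Leaf(False), Leaf(True))
    steps = [("dx", "-", "px", "qx"), ("dy", "-", "py", "qy"),
             ("dx2", "*", "dx", "dx"), ("dy2", "*", "dy", "dy"),
             ("s", "+", "dx2", "dy2"), ("dd", "*", "delta", "delta"),
             ("t", "-", "s", "dd")]
    for name, op, left, right in reversed(steps):   # build bottom-up
        tree = Compute(name, op, left, right, tree)
    return tree
```

For example, `run(within_distance_tree(), dict(px=0, py=0, qx=3, qy=4, delta=5))` follows seven computation nodes and one branch node. Every operand is either an input or the result of an ancestor, as the definition requires.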