Fast Algorithm for Finding Maximum Distance with Space Subdivision in E2
Abstract
Finding the exact maximum distance between two points in a given set is a fundamental computational problem that arises in many applications. This paper presents a fast, robust and simple-to-implement algorithm for finding this maximum distance in E2. The algorithm is based on a polar subdivision followed by division of the remaining points into a uniform grid. The main idea is to eliminate as many input points as possible before finding the maximum distance. The proposed algorithm gives a significant speed-up compared to the standard algorithm.
Keywords
Maximum distance, Polar space subdivision, Uniform 2D grid, Points reduction
1 Introduction
Finding the maximum distance between two points in a given data set is a fundamental computational problem, and its solution is needed in many applications. A standard brute force (BF) algorithm with \( O\left( {N^{2} } \right) \) complexity is usually used, where \( N \) is the number of points in the input dataset. For large point sets, the BF algorithm performs very poorly. Datasets in computer graphics typically contain \( 10^{5} \) or more points, so the processing time of the BF algorithm for such sets is unacceptable.
However, our main goal is to find the maximum distance, not all the pairs of two points having a maximum distance. Therefore the complexity of this algorithm should be lower.
Various approaches to finding the maximum distance are described in [9]. Other algorithms for finding the maximum distance of two points can be found in [1, 7].
1.1 Brute Force Algorithm
The standard BF algorithm for finding the maximum distance in a set of points uses two nested loops. Algorithms of this type can be found in many books dealing with fundamental algorithms and data structures, e.g. [4, 6]. In general, the BF algorithm can be described by Algorithm 1.
The complexity of Algorithm 1 is clearly \( O\left( {N^{2} } \right) \), and thus the run time increases significantly with the size of the input dataset.
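The two nested loops can be sketched as follows. This is a minimal illustration rather than the authors' listing of Algorithm 1: the function name `max_distance_brute_force` is hypothetical, and comparing squared distances (taking a single square root at the end) is an assumption made here for convenience.

```python
import math


def max_distance_brute_force(points):
    """Return the largest Euclidean distance between any two points.

    Every unordered pair is visited exactly once, so the running time
    is O(N^2) in the number of points.
    """
    best_sq = 0.0
    n = len(points)
    for i in range(n - 1):
        xi, yi = points[i]
        for j in range(i + 1, n):
            dx = points[j][0] - xi
            dy = points[j][1] - yi
            d_sq = dx * dx + dy * dy
            if d_sq > best_sq:
                best_sq = d_sq
    # One square root at the end instead of one per pair.
    return math.sqrt(best_sq)
```

For example, `max_distance_brute_force([(0, 0), (3, 4), (1, 1)])` compares three pairs and returns the distance between `(0, 0)` and `(3, 4)`.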
In practice, we can expect that the points in the input set are not organized in a very specific manner and are more or less uniformly distributed. In this case, we can use “output sensitive” algorithms, which lead to efficient solutions. We propose such an algorithm in Sect. 2.
2 Proposed Algorithm
In this section, we introduce a new algorithm for finding the maximum distance of two points in a given dataset in E2. The main idea of this algorithm is to eliminate as many input points as possible in \( O\left( N \right) \) time using space subdivision, and then to determine the maximum distance for the remaining points with \( O\left( {k^{2} } \right) \) complexity, where \( k \ll N \) is the number of remaining points. We use polar space subdivision for this elimination of points.
This section is organized as follows. In Sect. 2.1, we present the first step of the algorithm which is an axis aligned bounding box (AABB) and an initial convex polygon construction followed by the location of points inside the initial convex polygon. Section 2.2 describes how to divide the points into non-overlapping \( 2D \) triangular shape sectors. Section 2.3 presents reduction of the points [2] which have absolutely no influence on the value of maximum distance. In Sect. 2.4, we describe the division of remaining points into uniform \( 2D \) grid. Finally, the finding of the maximum distance of two points is made in Sect. 2.5.
2.1 Location of Points Inside Initial Polygon
An important property is that the two points with maximal distance lie on the convex hull of the given set of points [10]. This is apparent if we consider the opposite case, in which at least one of the two points with the largest distance is not part of the convex hull. It is then obvious that there is another pair of points with a larger distance, which is a contradiction. We also know that the most extreme point on any axis is part of the convex hull. These properties are used to significantly speed up the proposed algorithm for finding the exact maximal distance.
At the beginning of the proposed algorithm, we need to find the exact extremal points on both axes, i.e. the axis aligned bounding box (AABB) of the given dataset. The time complexity of this step is \( O\left( N \right) \). This generally yields four distinct extremal points, or fewer when one point is extremal on both axes.
Location of AABB and initial testing polygon for \( 10^{4} \) points: (a) uniform points in ellipse, (b) uniform points in rectangle, (c) Gauss points.
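A minimal sketch of this first step is given below, assuming the initial polygon is the quadrilateral formed by the four extremal points. The function names `extremal_points`, `cross` and `strictly_inside` are hypothetical, and the orientation test by cross product is one common way to implement the location test, not necessarily the one used by the authors.

```python
def extremal_points(points):
    """Return the extremal points in counter-clockwise order:
    leftmost, bottommost, rightmost, topmost. One O(N) pass per axis."""
    left = min(points, key=lambda p: p[0])
    bottom = min(points, key=lambda p: p[1])
    right = max(points, key=lambda p: p[0])
    top = max(points, key=lambda p: p[1])
    return [left, bottom, right, top]


def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive if b is left of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def strictly_inside(polygon, p):
    """True if p lies strictly inside a convex counter-clockwise polygon.

    Repeated vertices (fewer than four distinct extremal points) are
    dropped first; a polygon with fewer than three vertices has no interior.
    """
    vertices = [v for i, v in enumerate(polygon) if v != polygon[i - 1]]
    if len(vertices) < 3:
        return False
    return all(
        cross(vertices[i - 1], vertices[i], p) > 0
        for i in range(len(vertices))
    )
```

A point for which `strictly_inside` returns `True` cannot lie on the convex hull, so it cannot be an endpoint of the maximum distance and can be discarded.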
2.2 Division of Points into Polar Sectors
Non-overlapping sectors for division and uniform distribution of simplified angle on AABB. Angle \( \varphi \in [0,8) \) instead of \( [0,2\pi ) \).
However, calculating the function \( {\text{arctg2}} \) takes a lot of computing time. Therefore, we use a simplified calculation of an approximated angle. To determine this angle, we have to locate the exact sector (half of a quadrant for a square AABB) in which the point lies, and then calculate the intersection with the given edge of the AABB, which is easy. The distribution of the simplified angle can be seen in Fig. 2. Calculating the simplified angle is faster than evaluating formula (2).
With this procedure for calculating the simplified angle, we are able to assign each point to the sector to which it belongs.
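One way to obtain such a simplified angle in \( [0,8) \) without trigonometric functions is sketched below. It is an assumption made for illustration, not the authors' exact formula: the offsets `dx`, `dy` are taken relative to the centre of the AABB and are assumed to be already scaled so that the AABB is a square, and the names `simplified_angle` and `sector_index` are hypothetical.

```python
def simplified_angle(dx, dy):
    """Map a direction (dx, dy) to a pseudo-angle in [0, 8).

    The value grows monotonically with the true polar angle, with each
    octant covering one unit, so it can replace atan2 whenever only the
    ordering of angles (or a sector index) is needed.
    """
    if dx == 0 and dy == 0:
        raise ValueError("direction is undefined for the zero vector")
    if abs(dx) >= abs(dy):
        # The ray hits the right or left edge of the square.
        t = dy / abs(dx)
        angle = t if dx > 0 else 4.0 - t
    else:
        # The ray hits the top or bottom edge of the square.
        t = dx / abs(dy)
        angle = 2.0 - t if dy > 0 else 6.0 + t
    return angle % 8.0


def sector_index(angle, n_sectors):
    """Index of the sector containing a simplified angle."""
    return int(angle * n_sectors / 8.0) % n_sectors
```

Only one division and a few comparisons are needed per point, which is the reason for preferring such a mapping over the arctangent.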
Visualization of initial \( R_{i}^{min} \) points (red dots on the edges of the initial polygon) (Color figure online).
All minimum points \( \varvec{R}_{i}^{min} \) are connected into a polygon with vertices \( \varvec{R}_{1}^{min} \), …, \( \varvec{R}_{8}^{min} \).
Visualization of test lines \( l^{ - } \) and \( l^{ + } \).
In the next step, we check whether the processed point lies over or under the test line segments \( l^{ - } \) and \( l^{ + } \). We can compare the angle of the point with the angle of point \( \varvec{R}_{i}^{min} \). If the angle is smaller, then we have to use the line \( l^{ - } \), otherwise we have to use the line \( l^{ + } \). If the point lies under the test line, it can be eliminated, because such a point has no influence on the value of maximum distance. Otherwise we add this point into the sector with index \( i \).
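The test against \( l^{ - } \) and \( l^{ + } \) can be sketched with a signed-area (cross product) check. The sketch assumes that \( l^{ - } \) joins the minimum point of the previous sector to \( \varvec{R}_{i}^{min} \), that \( l^{ + } \) joins \( \varvec{R}_{i}^{min} \) to the minimum point of the next sector, and that sectors are numbered counter-clockwise; these are assumptions, as are the names `cross` and `keep_point`.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive if b is left of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def keep_point(p, angle_p, r_prev, r_cur, r_next, angle_cur):
    """Decide whether point p of the current sector must be kept.

    r_prev, r_cur, r_next are the (assumed) minimum points of the
    previous, current and next sector in counter-clockwise order;
    angle_p and angle_cur are the simplified angles of p and r_cur.
    """
    if angle_p < angle_cur:
        a, b = r_prev, r_cur   # test line l-
    else:
        a, b = r_cur, r_next   # test line l+
    # For a counter-clockwise polygon the interior is on the left of
    # each edge; a point strictly on the left lies under the test line.
    return cross(a, b, p) <= 0
```

A point for which `keep_point` returns `False` lies under the test line and is eliminated; the others are added to the sector with index \( i \).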
2.3 Reduction of Points for Testing
Visualization of test lines for rechecking all remaining points
In this step, we check whether the processed point lies over or under the line segments \( l^{ - - } \), \( l^{ - } \), \( l^{ + } \) and \( l^{ + + } \), see Fig. 5(b). Again, we select the appropriate test line according to the angle.
Remaining points (red dots) which have influence on the maximum distance (\( 10^{4} \) input points): (a) uniform points in ellipse, (b) uniform points in rectangle, (c) Gauss points.
2.4 Division of Remaining Points into Uniform Grid
Uniform grid of the AABB. The value \( D^{cell} \) represents the largest distance between two cells and \( d^{cell} \) represents the shortest distance between two cells.
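Dividing the remaining points into a uniform grid over their AABB can be sketched as follows. The function name `bin_points`, the dictionary-of-lists layout and the clamping of points on the upper boundary into the last cell are assumptions made for this illustration.

```python
from collections import defaultdict


def bin_points(points, k):
    """Distribute points into a uniform k x k grid over their AABB.

    Returns a dict mapping (column, row) to the list of points in that
    cell; only nonempty cells are stored.
    """
    x_min = min(p[0] for p in points)
    x_max = max(p[0] for p in points)
    y_min = min(p[1] for p in points)
    y_max = max(p[1] for p in points)
    # Guard against a degenerate AABB (all points on one line).
    width = (x_max - x_min) or 1.0
    height = (y_max - y_min) or 1.0
    cells = defaultdict(list)
    for x, y in points:
        # Points on the upper boundary are clamped into the last cell.
        i = min(int((x - x_min) / width * k), k - 1)
        j = min(int((y - y_min) / height * k), k - 1)
        cells[(i, j)].append((x, y))
    return dict(cells)
```

Storing only nonempty cells keeps the later pairwise step proportional to the number of occupied cells rather than to \( k^{2} \).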
After the previous step, all possible pairs of nonempty cells are known. Moreover, for each pair of nonempty cells, we determine the shortest distance \( d_{ij}^{cell} \), i.e. the distance between the nearest corners of the cells, and the largest distance \( D_{ij}^{cell} \), i.e. the distance between the farthest corners of the cells, see Fig. 7.
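For cells of a uniform grid, both corner distances follow directly from the cell indices. The sketch below assumes cells are addressed by integer `(column, row)` pairs and have size `cell_w` by `cell_h`; the function name `cell_distance_bounds` is hypothetical.

```python
import math


def cell_distance_bounds(cell_a, cell_b, cell_w, cell_h):
    """Return (d, D) for two grid cells given by (column, row) indices.

    d is the distance between the nearest corners of the two cells and
    D the distance between their farthest corners, so every pair of
    points (one from each cell) has a distance in the interval [d, D].
    """
    di = abs(cell_a[0] - cell_b[0])
    dj = abs(cell_a[1] - cell_b[1])
    # Nearest corners: one cell less than the index difference apart.
    d_min = math.hypot(max(di - 1, 0) * cell_w, max(dj - 1, 0) * cell_h)
    # Farthest corners: one cell more than the index difference apart.
    d_max = math.hypot((di + 1) * cell_w, (dj + 1) * cell_h)
    return d_min, d_max
```

For a cell paired with itself or with a neighbour the lower bound is zero, so such pairs can only be discarded through their upper bound.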
2.5 Find Maximal Distance of Two Points
The maximum distance of two points in the given dataset can now be found by the following steps. We determine the maximum value \( d_{max}^{cell} \) of the shortest distances \( d_{ij}^{cell} \), which were calculated for all pairs of nonempty cells. When this value is known, we can eliminate all pairs of nonempty cells for which the largest distance \( D_{ij}^{cell} \) is smaller than \( d_{max}^{cell} \), because no pair of points in such cells can be farther apart than \( d_{max}^{cell} \).
For each remaining pair of nonempty cells, the maximum distance \( D_{ij} \) between points in these cells is determined, i.e. we calculate all distances from points in one cell to points in the second cell and determine their maximum. Finally, we find the maximum of these maximum distances \( D_{ij} \).
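The pruning and the final search can be sketched together. The sketch assumes that `cells` maps integer `(column, row)` indices to lists of points (only nonempty cells are present) and that a cell may be paired with itself; the function name `max_distance_in_grid` is hypothetical.

```python
import math
from itertools import combinations_with_replacement


def max_distance_in_grid(cells, cell_w, cell_h):
    """Maximum point distance using cell-pair bounds for pruning."""
    bounds = {}
    for a, b in combinations_with_replacement(list(cells), 2):
        di, dj = abs(a[0] - b[0]), abs(a[1] - b[1])
        # Shortest and largest corner distance of the two cells.
        lower = math.hypot(max(di - 1, 0) * cell_w, max(dj - 1, 0) * cell_h)
        upper = math.hypot((di + 1) * cell_w, (dj + 1) * cell_h)
        bounds[(a, b)] = (lower, upper)

    # Largest guaranteed distance over all pairs of nonempty cells.
    d_max_cell = max(lower for lower, _ in bounds.values())

    best = 0.0
    for (a, b), (_, upper) in bounds.items():
        if upper < d_max_cell:
            continue  # no point pair here can reach d_max_cell
        for p in cells[a]:
            for q in cells[b]:
                best = max(best, math.dist(p, q))
    return best
```

The pair of cells that attains \( d_{max}^{cell} \) is never pruned, since its own largest distance is at least its shortest distance, so the result is exact.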
3 Experimental Results
- CPU: Intel® Core™ i7-2600 (4 × 3.40 GHz)
- Memory: 16 GB RAM
- Operating system: Microsoft Windows 7, 64-bit
3.1 Distribution of Points
The proposed algorithm for finding the maximum distance of two points has been tested using different datasets. These datasets have different types of distributions of points. For our experiments, we used well-known distributions such as randomly distributed uniform points in an ellipse, uniform points in a rectangle or points with a Gaussian distribution. Other distributions used were Halton points and Gauss ring points. Both of these distributions are described in the following text.
Halton Points.
\( 2D \) Halton points generated by \( Halton\left( {2,3} \right) \) (left) and \( 2D \) random points in a rectangle with uniform distribution (right). Number of points is \( 10^{3} \) in both cases.
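For reference, the standard construction of \( Halton\left( {2,3} \right) \) points in the unit square uses the radical inverse in bases 2 and 3. The sketch below shows that textbook construction; the function names are hypothetical, and any scaling to the tested rectangle is an assumption left to the caller.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` about the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_2d(count, base_x=2, base_y=3):
    """First `count` points of the 2D Halton sequence in the unit square."""
    return [
        (radical_inverse(i, base_x), radical_inverse(i, base_y))
        for i in range(1, count + 1)
    ]
```

Unlike pseudo-random uniform points, consecutive Halton points fill the square evenly, which is visible in the left part of the figure.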
Gauss Ring Points.
2D Gauss ring points. Number of points is \( 10^{3} \).
3.2 Optimal Size of Grid
In the proposed approach, the points remaining after elimination by polar division are divided into a uniform \( k \times k \) grid. The size of the grid has a significant influence on the number of pairs of points for which the mutual distance is determined. At the same time, the time complexity increases with the size of the grid. Therefore, we need an estimate of the optimal grid size, which depends on the distribution of points and on the number of points, and we have to measure it for each type of input points separately.
The time performance of the algorithm for finding the maximum distance of two points for different point distributions and different grid sizes. The size of the grid denotes the number of cells along one axis. The number of input points is \( 10^{7} \). The distributions of points are: (a) uniform points in ellipse, (b) uniform points in rectangle, (c) Halton points, (d) Gauss points, (e) Gauss ring points.
Optimal grid size of the algorithm for finding the maximum distance of two points for different point distributions and different numbers of points. The size of the grid denotes the number of cells along one axis. The distributions of points are: (a) uniform points in ellipse, (b) uniform points in rectangle, (c) Halton points, (d) Gauss points, (e) Gauss ring points.
Evaluating experimental results for different distributions of points and different numbers of input points, i.e. \( 10^{6} \), \( \sqrt {10} \cdot 10^{6} \), \( 10^{7} \), \( \sqrt {10} \cdot 10^{7} \) and \( 10^{8} \), including results from Figs. 10 and 11, we came to the following conclusion.
The optimal size of the grid depends on the number of input points; more precisely, it depends on the number of suspicious points. The size of the grid has to increase with the number of points.
3.3 Time Performance
The time performance of convex hull for different number of input points and different distributions of points.
| Number of points | Uniform ○ [ms] | Uniform □ [ms] | Halton [ms] | Gauss [ms] | GaussRing [ms] |
|---|---|---|---|---|---|
| 1E+5 | 32.9 | 11.5 | 11.0 | 9.0 | 8.8 |
| \( \sqrt {10} \)E+5 | 137.6 | 37.4 | 36.3 | 30.6 | 29.8 |
| 1E+6 | 466.2 | 119.1 | 113.5 | 93.3 | 93.4 |
| \( \sqrt {10} \)E+6 | 1 745.5 | 367.8 | 355.8 | 315.0 | 296.0 |
| 1E+7 | 5 631.3 | 1 203.9 | 1 158.0 | 1 009.2 | 954.9 |
| \( \sqrt {10} \)E+7 | 17 976.5 | 3 596.6 | 3 579.0 | 3 221.5 | 3 057.9 |
| 1E+8 | 56 769.0 | 11 154.0 | 11 505.0 | 12 004.0 | 9 680.0 |
The time performance of the algorithm for finding the maximum distance of two points for different numbers of input points and different distributions of these points.
It can be seen that the best time performance is obtained for the Gauss ring points. The time performance for Halton points and for uniformly distributed points inside a rectangle is similar. Overall, the running time is practically the same for all tested distributions of input points except uniform points in an ellipse. This is expected behavior, because most of the points are eliminated during the polar division phase, so only a few points and nonempty cells of the uniform grid remain for finding the maximum distance. The worst time performance was obtained for uniform points in an ellipse.
3.4 Comparison with Other Algorithms
We compared our proposed algorithm for finding the exact maximum distance of two points in a given dataset with the BF algorithm, whose time complexity is \( O\left( {N^{2} } \right) \), and with the algorithm proposed in [8], which has expected time complexity \( O\left( N \right) \), where \( N \) is the number of input points. It should be noted that the results for the algorithm in [8] are based on the ratio of the BF algorithm to that algorithm.
The speed-up of our proposed algorithm for uniformly distributed points with respect to BF algorithm for the same datasets.
The speed-up of our proposed algorithm for uniformly distributed points with respect to algorithm in [8] for the same datasets.
It can be seen that the speed-up of the proposed algorithm with respect to the BF algorithm is significant and grows with the number of points processed. Moreover, our algorithm is on average 1.5 times faster than the algorithm in [8].
4 Conclusion
A new fast algorithm for finding the exact maximum distance of two points in \( E^{2} \) with \( {\mathcal{O}}_{\text{expected}} (N) \) complexity has been presented. The algorithm uses a space division technique and can process a large number of points. Its advantages are simple implementation and robustness. Moreover, the algorithm can be easily extended to \( E^{3} \) by a simple modification.
As future work, the algorithm can be easily parallelized, as most of its steps are independent. The second direction is to extend the algorithm to \( E^{3} \).
Acknowledgments
The authors would like to thank their colleagues at the University of West Bohemia, Plzen, for their discussions and suggestions, and anonymous reviewers for their valuable comments and hints provided. The research was supported by MSMT CR projects LH12181 and SGS 2013-029.
References
- 1. Clarkson, K.L., Shor, P.W.: Applications of random sampling in computational geometry, II. Discrete Comput. Geom. 4(1), 387–421 (1989)
- 2. Dobkin, D.P., Snyder, L.: On a general method for maximizing and minimizing among certain geometric problems. In: Proceedings of the 20th Annual Symposium on the Foundations of Computer Science, pp. 9–17 (1979)
- 3. Fasshauer, G.E.: Meshfree Approximation Methods with MATLAB. World Scientific Publishing Co., Inc., Singapore (2007)
- 4. Hilyard, J., Teilhet, S.: C# Cookbook. O’Reilly Media Inc., Sebastopol (2006)
- 5. Liu, G., Chen, C.: A new algorithm for computing the convex hull of a planar point set. J. Zhejiang Univ. Sci. A 8(8), 1210–1217 (2007)
- 6. Mehta, D.P., Sahni, S.: Handbook of Data Structures and Applications. CRC Press, Boca Raton (2004)
- 7. O’Rourke, J.: Computational Geometry in C. Cambridge University Press, Cambridge (1998)
- 8. Skala, V.: Fast Oexpected (N) algorithm for finding exact maximum distance in E2 instead of O (N2) or O (N lgN). In: AIP Conference Proceedings, no. 1558, pp. 2496–2499 (2013)
- 9. Snyder, W.E., Tang, D.A.: Finding the extrema of a region. IEEE Trans. Pattern Anal. Mach. Intell. 2(3), 266–269 (1980)
- 10. Vince, J.: Geometric Algebra for Computer Graphics. Springer, Berlin (2008)














