Abstract
This contribution describes a simple and efficient algorithm for the closest-pair problem in \(E^1\). It uses preprocessing based on exponent bucketing and exponent windowing that respects the accuracy of the floating-point representation. The preprocessing has \(O(N)\) complexity. Experiments with uniformly distributed data showed a significant speedup. The proposed approach is also applicable to the \(E^2\) case.
Supported by the University of West Bohemia - Institutional research support. Martinez, D.E., Martinez, A.E., Moreno, F.H.: students who contributed during their Erasmus ACG course at the University of West Bohemia.
Notes
- 1.
A data set where \(N \ggg 10^6\). Note that only \(2~147~483~648 \doteq 2.147 \cdot 10^9\) distinct unsigned values can be represented in single precision.
- 2.
It actually saves \(O(N^2)\) computations of the \(\sqrt{*}\) function.
- 3.
Consider a sorted sequence \(1.01, 1.05, \ldots , 10.0001, 10.0005\); the minimum distance is then 0.0004, not 0.04.
- 4.
Note: 130–120 denotes the exponent interval \([-120, \ldots , -111]\).
- 5.
The binary exponent is shifted, i.e. a value \(2^{-128}\) has the shifted exponent 0.
- 6.
Array lists, i.e. extensible arrays, were used in the actual implementation.
- 7.
The value 33 is obtained by solving \(10^{10} \cdot 2^k = 1\), which gives \(k = -10 \log_2 10 \doteq -33\) and the shifted exponent \(Ex = 128 - 33 = 95\).
- 8.
In the case of \(10^6\) values, more than \(10^3\) bucket extensions were performed; with an additional \(20\%\) memory allocation, only 7 extensions were needed.
- 9.
Sort-CPP - standard Shell sort in C++.
- 10.
Note that \(p\) depends on the floating-point precision used.
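The shifted exponent in note 5 and the arithmetic in note 7 can be illustrated with a short sketch. It is not the authors' implementation: the function names `shifted_exponent` and `bucket_by_exponent` and the constant `SHIFT` are hypothetical, the input is assumed to be positive and finite, and Python floats are double precision rather than the single precision discussed in note 1. The only convention taken from the notes is that \(2^{-128}\) maps to the shifted exponent 0.

```python
import math
from collections import defaultdict

# Assumption: the shift of 128 follows note 5 (2**-128 has shifted exponent 0).
SHIFT = 128


def shifted_exponent(x: float) -> int:
    """Return floor(log2(x)) + SHIFT for a positive, finite x."""
    if not (x > 0.0 and math.isfinite(x)):
        raise ValueError("x must be positive and finite")
    # math.frexp returns (m, e) with x == m * 2**e and 0.5 <= m < 1,
    # so floor(log2(x)) equals e - 1.
    _, e = math.frexp(x)
    return (e - 1) + SHIFT


def bucket_by_exponent(values):
    """Group values by shifted exponent in a single O(N) pass."""
    buckets = defaultdict(list)
    for v in values:
        buckets[shifted_exponent(v)].append(v)
    return buckets


# Note 7: solve 10**10 * 2**k = 1 for k, then shift the rounded result.
k = -10.0 * math.log2(10.0)   # about -33.2
ex_limit = SHIFT + round(k)   # 128 - 33

if __name__ == "__main__":
    print(shifted_exponent(2.0 ** -128))
    print(round(k), ex_limit)
    print(dict(bucket_by_exponent([1.0, 1.5, 2.0, 0.25])))
```

Values that share a shifted exponent lie within a factor of two of each other, which is why a single pass over the data is enough to place each value in its bucket.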
Acknowledgments
The author would like to thank colleagues at the University of West Bohemia, Plzen, for their comments, suggestions, and hints, especially Martin Cervenka and Lukas Rypl for additional counter tests. Thanks also go to the anonymous reviewers for their critical views and recommendations, which helped to improve this manuscript.
Responsibilities
Skala, V.: theoretical part, algorithm design, algorithm implementation and verification, manuscript preparation; Esteban Martinez, A., Esteban Martinez, D., Hernandez Moreno, F.: algorithm implementation and experimental verification.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Skala, V., Martinez, A.E., Martinez, D.E., Moreno, F.H. (2023). A New Algorithm for the Closest Pair of Points for Very Large Data Sets Using Exponent Bucketing and Windowing. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14074. Springer, Cham. https://doi.org/10.1007/978-3-031-36021-3_40
DOI: https://doi.org/10.1007/978-3-031-36021-3_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36020-6
Online ISBN: 978-3-031-36021-3
eBook Packages: Computer Science, Computer Science (R0)