A New Algorithm for the Closest Pair of Points for Very Large Data Sets Using Exponent Bucketing and Windowing

  • Conference paper
Computational Science – ICCS 2023 (ICCS 2023)

Abstract

This contribution describes a simple and efficient algorithm for the closest-pair problem in \(E^1\). It uses preprocessing based on exponent bucketing and exponent windowing that respects the accuracy of the floating-point representation. The preprocessing has \(O(N)\) complexity. Experiments with uniformly distributed data showed a significant speedup. The proposed approach is also applicable to the \(E^2\) case.
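The idea of bucketing by binary exponent can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `closest_pair_1d`, the restriction to positive finite values, and the use of `math.frexp` and a dictionary of lists are assumptions made only for illustration, and the exponent windowing step is not modelled. The sketch relies on the fact that all positive values sharing a binary exponent lie in one interval \([2^{e-1}, 2^e)\), so the closest pair is either inside one bucket or between the largest value of one non-empty bucket and the smallest value of the next.

```python
import math
from collections import defaultdict


def closest_pair_1d(values):
    """Return (distance, a, b) for the closest pair among positive floats.

    Illustrative sketch only: buckets are keyed by the binary exponent
    reported by math.frexp; each bucket is sorted independently.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")

    # Preprocessing: one pass over the data, O(1) work per value.
    buckets = defaultdict(list)
    for v in values:
        if not (v > 0.0 and math.isfinite(v)):
            raise ValueError("this sketch assumes positive finite values")
        _, exponent = math.frexp(v)  # v = m * 2**exponent with 0.5 <= m < 1
        buckets[exponent].append(v)

    best = (math.inf, None, None)
    previous = None  # largest value of the preceding non-empty bucket
    for exponent in sorted(buckets):
        bucket = sorted(buckets[exponent])
        if previous is not None:
            # Also compare across the boundary between neighbouring buckets.
            bucket.insert(0, previous)
        for a, b in zip(bucket, bucket[1:]):
            if b - a < best[0]:
                best = (b - a, a, b)
        previous = bucket[-1]
    return best
```

Calling `closest_pair_1d([1.01, 1.05, 10.0001, 10.0005])` returns the pair `10.0001` and `10.0005`. The bucket `[1.01, 1.05]` alone would suggest a minimum distance of 0.04, but the bucket with the larger exponent holds a closer pair, so every non-empty bucket has to be examined in this simplified version.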

Supported by the University of West Bohemia - Institutional research support.

Martinez, D.E., Martinez, A.E., Moreno, F.H.: students who contributed during their Erasmus ACG course at the University of West Bohemia.

Notes

  1. Data set where \(N \ggg 10^6\). Note that only \(2\,147\,483\,648 \doteq 2.147 \cdot 10^9\) distinct unsigned values can be represented in single precision.

  2. It actually saves \(O(N^2)\) computations of the \(\sqrt{*}\) function.

  3. Consider a sorted sequence \(1.01, 1.05, \ldots , 10.0001, 10.0005\); the minimum distance is then 0.0004, not 0.04.

  4. Note: 130–120 means the exponent interval [-120,...,-111].

  5. The binary exponent is shifted, i.e. a value \(2^{-128}\) has the shifted exponent 0.

  6. Array lists, i.e. extensible arrays, were used in the actual implementation.

  7. The value \(k \doteq -33\) is obtained by solving \(10^{10} \cdot 2^k=1\), i.e. \(k = -10 \log_2 10 \doteq -33.2\); the shifted exponent is then \(Ex=128-33=95\).

  8. In the case of \(10^6\) values, the bucket extension was called over \(10^3\) times, but with the \(20\%\) additional memory allocation it was called only 7 times.

  9. Sort-CPP - standard Shell sort in C++.

  10. Note that p depends on the FP precision used.
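The arithmetic behind the shifted exponent in the notes can be checked with a short sketch. The constant name `SHIFT` and the helper `shifted_exponent` are hypothetical names introduced here; only the offset 128 and the equation \(10^{10} \cdot 2^k = 1\) come from the notes.

```python
import math

SHIFT = 128  # offset from the notes: the value 2**-128 has shifted exponent 0


def shifted_exponent(k):
    """Map a binary exponent k to its shifted (non-negative) form."""
    return k + SHIFT


# Solve 10**10 * 2**k = 1 for k, i.e. k = -10 * log2(10).
k_exact = -10 * math.log2(10)
k = round(k_exact)          # nearest integer exponent
Ex = shifted_exponent(k)    # shifted exponent used as a bucket index
```

Under these assumptions `k_exact` is about -33.2, `k` is -33 and `Ex` is 95, which agrees with \(Ex = 128 - 33 = 95\).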

Acknowledgments

The author would like to thank colleagues at the University of West Bohemia, Plzen for their comments, suggestions, and hints, especially Martin Cervenka and Lukas Rypl for some additional counter tests. Thanks also go to the anonymous reviewers for their critical view and recommendations, which helped to improve this manuscript.

Author information

Corresponding author

Correspondence to Vaclav Skala.

Ethics declarations

Responsibilities

Skala, V.: theoretical part, algorithm design, algorithm implementation and verification, manuscript preparation; Esteban Martinez, A., Esteban Martinez, D., Hernandez Moreno, F.: algorithm implementation and experimental verification.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Skala, V., Martinez, A.E., Martinez, D.E., Moreno, F.H. (2023). A New Algorithm for the Closest Pair of Points for Very Large Data Sets Using Exponent Bucketing and Windowing. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14074. Springer, Cham. https://doi.org/10.1007/978-3-031-36021-3_40

  • DOI: https://doi.org/10.1007/978-3-031-36021-3_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36020-6

  • Online ISBN: 978-3-031-36021-3

  • eBook Packages: Computer Science (R0)
