Selecting a subset of diverse points based on the squared Euclidean distance

Abstract

We consider two closely related problems of selecting a diverse subset of points with respect to squared Euclidean distance. Given a set of points in Euclidean space, the first problem is to find a subset of a specified size M that maximizes the sum of squared Euclidean distances between the chosen points. The second problem asks for a minimum-cardinality subset of points subject to a constraint on the sum of squared Euclidean distances between them. We analyze the computational complexity of both problems and propose exact dynamic programming algorithms for the case of integer input data. If the dimension of the Euclidean space is bounded by a constant, these algorithms run in pseudo-polynomial time. We also develop a fully polynomial-time approximation scheme (FPTAS) for the special case of the first problem in which the dimension of the Euclidean space is bounded by a constant.
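To make the two problem statements concrete, here is a minimal brute-force sketch in Python. It is not the dynamic programming algorithm or the FPTAS from the paper; it simply enumerates subsets, so it runs in exponential time and is only meant to illustrate the objective on tiny inputs. The function names are hypothetical, and the second function assumes the constraint is a lower bound on the sum of squared distances, which the abstract does not state explicitly.

```python
from itertools import combinations


def pairwise_sq_dist_sum(points):
    """Sum of squared Euclidean distances over all unordered pairs of points."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in combinations(points, 2)
    )


def max_diversity_subset(points, m):
    """First problem (illustrative brute force): a size-m subset with maximum pairwise sum."""
    return max(combinations(points, m), key=pairwise_sq_dist_sum)


def min_cardinality_subset(points, bound):
    """Second problem (illustrative brute force).

    Assumption: the constraint requires the pairwise sum to be at least `bound`.
    Returns a smallest such subset, or None if no subset qualifies.
    """
    for size in range(2, len(points) + 1):
        for subset in combinations(points, size):
            if pairwise_sq_dist_sum(subset) >= bound:
                return subset
    return None


# Small integer instance in the plane (made-up data for illustration only).
points = [(0, 0), (1, 0), (0, 3), (4, 5)]
best = max_diversity_subset(points, 3)
print(best, pairwise_sq_dist_sum(best))
print(min_cardinality_subset(points, 40))
```

On this instance, the squared distances are 1, 9, 41, 10, 34 and 20, so the best triple is (0, 0), (1, 0), (4, 5) with a sum of 76, and a single pair, (0, 0) and (4, 5), already meets a bound of 40.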

Acknowledgements

The authors thank Yulia Kovalenko for her helpful comments. The study presented in Section 3 was supported by RFBR grant 19-01-00308; the study presented in Section 6 was carried out within the state task of the Sobolev Institute of Mathematics SB RAS.

Author information

Corresponding author

Correspondence to Anton V. Eremeev.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Eremeev, A.V., Kel’manov, A.V., Kovalyov, M.Y. et al. Selecting a subset of diverse points based on the squared euclidean distance. Ann Math Artif Intell (2021). https://doi.org/10.1007/s10472-021-09773-z

Keywords

  • Euclidean space
  • Subset of points
  • Given size
  • Maximum variance
  • Strong NP-hardness
  • Integer instance
  • Exact algorithm
  • Fixed space dimension
  • Pseudo-polynomial time

Mathematics Subject Classification (2010)

  • 62H30
  • 90C09
  • 68W25