An FPTAS for a vector subset search problem

  • A. V. Kel’manov
  • S. M. Romanchenko


We study the strongly NP-hard problem of finding, in a finite set of vectors in Euclidean space, a subset of a given size that minimizes the sum of squared distances from the elements of the subset to its center. The center of a subset is defined as the mean of its elements. It is proved that, unless P=NP, the general case of the problem admits no fully polynomial time approximation scheme (FPTAS). Such a scheme is provided for the case when the dimension of the space is fixed.
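
To make the objective concrete, the following minimal Python sketch evaluates the sum of squared distances to the center for a candidate subset and finds an optimal subset by exhaustive search. It is only an illustration of the problem statement, not the approximation scheme of the paper: the search enumerates every subset of the required size and therefore takes exponential time. The function names `centroid`, `scatter`, and `best_subset_bruteforce` and the sample points are assumptions introduced here.

```python
from itertools import combinations


def centroid(vectors):
    """Return the coordinate-wise mean of a non-empty sequence of equal-length vectors."""
    n = len(vectors)
    return [sum(coords) / n for coords in zip(*vectors)]


def scatter(vectors):
    """Return the sum of squared Euclidean distances from each vector to the centroid."""
    c = centroid(vectors)
    return sum(sum((x - m) ** 2 for x, m in zip(v, c)) for v in vectors)


def best_subset_bruteforce(points, size):
    """Return a subset of `size` points with minimum scatter.

    Exhaustive search over all subsets of the given size (hypothetical
    helper for tiny instances only; this is not the FPTAS).
    """
    if not 1 <= size <= len(points):
        raise ValueError("size must be between 1 and the number of points")
    return min(combinations(points, size), key=scatter)


# Illustrative data: three nearby points and two distant ones.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
best = best_subset_bruteforce(points, 3)
print(best, scatter(best))
```

On this sample input the search selects the three points near the origin, since any subset containing a distant point has a much larger sum of squared distances to its center.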


finding a vector subset, Euclidean space, minimum of the sum of squared distances, NP-hardness, fully polynomial time approximation scheme





Copyright information

© Pleiades Publishing, Ltd. 2014

Authors and Affiliations

  1. Sobolev Institute of Mathematics, Novosibirsk, Russia
  2. Novosibirsk State University, Novosibirsk, Russia
