On Complexity of Searching a Subset of Vectors with Shortest Average Under a Cardinality Restriction

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 661)


In this paper, we study the computational complexity of the following subset search problem in a set of vectors. Given a set of N Euclidean q-dimensional vectors and an integer M, choose a subset of at least M vectors that minimizes the Euclidean norm of the arithmetic mean of the chosen vectors. This problem is induced, in particular, by the problem of clustering a set of points into two clusters, where one of the clusters consists of points whose mean is close to a given point. Without loss of generality, the given point may be assumed to be the origin.
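To make the problem statement concrete, the following is a minimal exhaustive-search sketch. It is an illustration only, not an algorithm from the paper; the function names `mean_norm` and `brute_force` are hypothetical, and the code assumes 1 ≤ M ≤ N. Its running time is exponential in N, so it is suitable only for tiny instances.

```python
from itertools import combinations
from math import sqrt


def mean_norm(vectors):
    """Euclidean norm of the arithmetic mean of a non-empty list of vectors."""
    k = len(vectors)
    q = len(vectors[0])
    return sqrt(sum((sum(v[j] for v in vectors) / k) ** 2 for j in range(q)))


def brute_force(vectors, m):
    """Try every subset with at least m vectors; return (best value, indices).

    Assumes 1 <= m <= len(vectors). Exponential time: illustration only.
    """
    n = len(vectors)
    best_value, best_subset = float("inf"), None
    for k in range(m, n + 1):
        for idx in combinations(range(n), k):
            value = mean_norm([vectors[i] for i in idx])
            if value < best_value:
                best_value, best_subset = value, idx
    return best_value, best_subset


if __name__ == "__main__":
    points = [(1, 0), (-1, 0), (0, 3)]
    print(brute_force(points, 2))  # the first two vectors cancel out
```

Note that enlarging the subset beyond M vectors can either help or hurt, which is why all cardinalities from M to N are examined.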

We show that the considered problem is NP-hard in the strong sense and that it does not admit any approximation algorithm with a guaranteed performance bound, unless P = NP. An exact algorithm with pseudo-polynomial time complexity is proposed for the special case of the problem in which the dimension q of the space is bounded from above by a constant and the input data are integer.
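One natural way to obtain a pseudo-polynomial bound in this special case is dynamic programming over reachable vector sums. The sketch below is our own illustration under that assumption and is not claimed to be the algorithm proposed in the paper; the name `min_mean_norm_dp` is hypothetical. If every coordinate is an integer of absolute value at most b, then each coordinate of a subset sum lies in [-Nb, Nb], so for each cardinality there are at most (2Nb + 1)^q distinct sums, which is polynomial in N and b when q is a constant.

```python
from math import sqrt


def min_mean_norm_dp(vectors, m):
    """Dynamic programming over (cardinality, sum) states for integer vectors.

    states[k] maps each reachable sum of exactly k vectors to one tuple of
    indices attaining it. Assumes integer coordinates and 1 <= m <= N.
    """
    n = len(vectors)
    q = len(vectors[0])
    states = [dict() for _ in range(n + 1)]
    states[0][(0,) * q] = ()

    for i, v in enumerate(vectors):
        # Go through cardinalities in decreasing order so that vector i
        # is added to any subset at most once.
        for k in range(i, -1, -1):
            for s, idx in list(states[k].items()):
                t = tuple(a + b for a, b in zip(s, v))
                states[k + 1].setdefault(t, idx + (i,))

    best_value, best_subset = float("inf"), None
    for k in range(m, n + 1):
        for s, idx in states[k].items():
            value = sqrt(sum(c * c for c in s)) / k  # norm of the mean
            if value < best_value:
                best_value, best_subset = value, idx
    return best_value, best_subset


if __name__ == "__main__":
    points = [(1, 0), (-1, 0), (0, 3)]
    print(min_mean_norm_dp(points, 2))
```

Keeping the cardinality as part of the state is what allows the final step to divide each sum by the correct subset size and to enforce the lower bound M.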


Vectors sum; Subset selection; Euclidean norm; NP-hardness; Pseudo-polynomial time



This research is supported by RFBR, projects 15-01-00462, 16-01-00740 and 15-01-00976.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Omsk Branch of Sobolev Institute of Mathematics, Siberian Branch of Russian Academy of Sciences, Omsk, Russia
  2. Omsk State University n.a. F.M. Dostoevsky, Omsk, Russia
  3. Sobolev Institute of Mathematics, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
  4. Novosibirsk State University, Novosibirsk, Russia
