Computational Complexity of the Problem of Choosing Typical Representatives in a \(\boldsymbol{2}\)-Clustering of a Finite Set of Points in a Metric Space

Journal of Applied and Industrial Mathematics

Abstract

We study the computational complexity of an extremal problem: choosing a subset of \(p\) points from a given \(2\)-clustering of a finite set in a metric space. The chosen subset must describe the given clusters as well as possible with respect to a geometric criterion. The problem formalizes an applied data mining task: finding a subset of typical representatives of a dataset composed of two classes, based on the function of rival similarity. We prove that the problem is NP-hard by giving a polynomial-time reduction from the \(p\)-median problem, a well-known problem that is NP-hard in the strong sense.
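For readers unfamiliar with the \(p\)-median problem used in the reduction, the following is a minimal sketch of its standard objective on a finite point set: choose \(p\) of the given points so that the sum of distances from every point to its nearest chosen point is minimal. The Euclidean metric, the sample points, and the function names `p_median_cost` and `brute_force_p_median` are assumptions made for illustration only; the sketch does not reproduce the article's criterion based on the function of rival similarity, nor its reduction.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def p_median_cost(points, centers):
    """Sum over all points of the distance to the nearest chosen center."""
    return sum(min(dist(x, c) for c in centers) for x in points)


def brute_force_p_median(points, p):
    """Try every p-subset of the points and keep the cheapest one.

    Exhaustive search is exponential in general; it is used here only
    to make the objective concrete on a tiny instance.
    """
    best = min(combinations(points, p),
               key=lambda cs: p_median_cost(points, cs))
    return list(best), p_median_cost(points, best)


# Hypothetical toy instance: two visually separated groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (5.0, 2.0)]
centers, cost = brute_force_p_median(points, 2)
print(centers, cost)
```

On this toy instance the search picks one representative from each group. The exhaustive enumeration is consistent with the hardness result only in the sense that no polynomial-time exact method is expected in general.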

Funding

The author was supported by the State Task to the Sobolev Institute of Mathematics (project no. 0314–2019–0015).

Author information

Correspondence to I. A. Borisova.

Additional information

Translated by Ya.A. Kopylov

Cite this article

Borisova, I.A. Computational Complexity of the Problem of Choosing Typical Representatives in a \(2\)-Clustering of a Finite Set of Points in a Metric Space. J. Appl. Ind. Math. 14, 242–248 (2020). https://doi.org/10.1134/S1990478920020039

  • DOI: https://doi.org/10.1134/S1990478920020039
