Determination of Similarity Threshold in Clustering Problems for Large Data Sets

Sánchez-Díaz, Guillermo; Martínez-Trinidad, José F.

doi:10.1007/978-3-540-24586-5_75


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2905)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition

Abstract

A new automatic method, based on an intra-cluster criterion, is proposed for obtaining a similarity threshold that generates a well-defined clustering (or one close to it) for large data sets. The method uses the connected component criterion, and it neither calculates nor stores the similarity matrix of the objects in main memory. It is developed within the unsupervised Logical Combinatorial Pattern Recognition approach. Experiments with the new method on large data sets are also presented.
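The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the connected component criterion for a threshold that is already known. It is not the authors' method for determining that threshold. Two objects are linked when their similarity is at least the threshold, and clusters are the connected components of the resulting graph. The names `connected_components` and `toy_similarity`, the toy data, and the threshold value are all assumptions made for illustration. Similarities are evaluated pair by pair and discarded, so no similarity matrix is kept in memory, although this naive version still visits every pair.

```python
from itertools import combinations
from typing import Callable, List, Sequence


def connected_components(
    objects: Sequence,
    similarity: Callable[[object, object], float],
    threshold: float,
) -> List[List[int]]:
    """Group object indices into connected components of the threshold graph.

    Hypothetical helper: objects i and j are linked when
    similarity(objects[i], objects[j]) >= threshold.
    """
    # Union-find forest: only O(n) integers are stored, never an n x n matrix.
    parent = list(range(len(objects)))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each similarity is computed on demand and then discarded.
    for i, j in combinations(range(len(objects)), 2):
        if similarity(objects[i], objects[j]) >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Collect indices by root to obtain the clusters.
    clusters = {}
    for i in range(len(objects)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def toy_similarity(a: float, b: float) -> float:
    """Assumed similarity in (0, 1]: closer numbers are more similar."""
    return 1.0 / (1.0 + abs(a - b))


data = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
clusters = connected_components(data, toy_similarity, threshold=0.8)
print(clusters)
```

Lowering the threshold merges components and raising it splits them, which is why the choice of threshold determines how well defined the resulting clustering is.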

Author information

Authors and Affiliations

Center of Technologies Research on Information and Systems, The Autonomous University of the Hidalgo State, Carr. Pachuca Tulancingo Km. 4.5, C.U 42084, Pachuca, Hgo, Mexico
Guillermo Sánchez-Díaz
National Institute of Astrophysics, Optics and Electronics, Luis Enrique Erro No. 1, Sta. María Tonantzintla, 72840, Puebla, Mexico
José F. Martínez-Trinidad

Editor information

Editors and Affiliations

Dept. System Engineering and Automation, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
Alberto Sanfeliu
Advanced Technologies Applications Center, MINBAS, Cuba
José Ruiz-Shulcloper

About this paper

Cite this paper

Sánchez-Díaz, G., Martínez-Trinidad, J.F. (2003). Determination of Similarity Threshold in Clustering Problems for Large Data Sets. In: Sanfeliu, A., Ruiz-Shulcloper, J. (eds) Progress in Pattern Recognition, Speech and Image Analysis. CIARP 2003. Lecture Notes in Computer Science, vol 2905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24586-5_75

DOI: https://doi.org/10.1007/978-3-540-24586-5_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20590-6
Online ISBN: 978-3-540-24586-5
eBook Packages: Springer Book Archive
