Hierarchical Document Clustering Based on Tolerance Rough Set Model

Kawasaki, Saori; Binh, Ngoc; Bao, Tu

doi:10.1007/3-540-45372-5_51

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1910)

Included in the following conference series:

European Conference on Principles of Data Mining and Knowledge Discovery

Abstract

Clustering is a powerful tool for knowledge discovery in text collections. The quality of document clustering depends not only on the clustering algorithm but also on the document representation model. We develop a hierarchical document clustering algorithm based on a tolerance rough set model (TRSM) for representing documents, which offers a way of considering semantic relatedness between documents. The results of validation and evaluation of this method suggest that this clustering algorithm can be well adapted to text mining.
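
The abstract does not spell out how TRSM relates documents, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes that two terms are "tolerant" of each other when they co-occur in at least theta documents, and that a document is represented by its upper approximation, that is, every term whose tolerance class overlaps the document's own terms. The function names, the toy corpus, the threshold theta=2, and the use of Jaccard similarity are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to itself plus the terms it co-occurs with in >= theta docs.

    Assumption: co-occurrence counting is the tolerance relation; the paper
    may define it differently.
    """
    cooc = defaultdict(int)
    vocab = set()
    for doc in docs:
        terms = set(doc)
        vocab |= terms
        # Count each unordered term pair once per document.
        for a, b in combinations(sorted(terms), 2):
            cooc[(a, b)] += 1
    classes = {t: {t} for t in vocab}
    for (a, b), count in cooc.items():
        if count >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Return every term whose tolerance class shares a term with the document."""
    terms = set(doc)
    return {t for t, cls in classes.items() if cls & terms}


def jaccard(a, b):
    """Set-overlap similarity, used here as a simple stand-in measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy corpus: each document is a list of index terms.
docs = [
    ["rough", "set", "clustering"],
    ["rough", "set", "approximation"],
    ["rough", "clustering"],
    ["set", "approximation"],
]
classes = tolerance_classes(docs, theta=2)

raw_sim = jaccard(set(docs[2]), set(docs[3]))
upper_sim = jaccard(
    upper_approximation(docs[2], classes),
    upper_approximation(docs[3], classes),
)
print(raw_sim, upper_sim)
```

In this toy corpus the last two documents share no terms, so their raw overlap is zero, yet their upper approximations overlap because "rough" and "set" frequently co-occur. A hierarchical clustering step would then merge documents or clusters using similarities computed on such enriched representations.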

Author information

Authors and Affiliations

Japan Advanced Institute of Science and Technology, 923-1292, Tatsunokuchi, Ishikawa, JAPAN
Saori Kawasaki, Ngoc Binh & Tu Bao

Editor information

Editors and Affiliations

Department of Computer and Information Science, Norwegian University of Science and Technology, O.S. Bragstads plass 2E, 7491, Trondheim, Norway
Jan Komorowski
Department of Computer Science, University of North Carolina, Charlotte, NC 28223, USA
Jan Żytkow
Laboratoire ERIC, Université Lyon 2, 5 avenue Pierre Mendès-France, 69676, Bron, France
Djamel A. Zighed

Cite this paper

Kawasaki, S., Binh, N., Bao, T. (2000). Hierarchical Document Clustering Based on Tolerance Rough Set Model. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science, vol 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_51

DOI: https://doi.org/10.1007/3-540-45372-5_51
Published: 18 July 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7
eBook Packages: Springer Book Archive
