Autonomous clustering using rough set theory



This paper proposes a clustering technique that minimizes the need for subjective human intervention and is based on elements of rough set theory (RST). The proposed algorithm is unified in its approach to clustering and uses both local and global data properties to obtain clustering solutions. It handles both single-type and mixed-attribute data sets. Results from three data sets of single and mixed attribute types are used to illustrate the technique and establish its efficiency.
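As background for the RST elements mentioned above, the following is a minimal sketch of two standard rough set notions: indiscernibility classes and the lower and upper approximations of a set. It is not the algorithm proposed in this paper. The function names, the toy table, and the attribute names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(table, attributes):
    """Group object ids whose values agree on every attribute in `attributes`."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)
    return list(classes.values())


def approximations(classes, target):
    """Return the (lower, upper) approximations of `target` under `classes`."""
    lower, upper = set(), set()
    for c in classes:
        if c <= target:   # class lies entirely inside the target set
            lower |= c
        if c & target:    # class overlaps the target set
            upper |= c
    return lower, upper


# Hypothetical toy table with one nominal and one numeric attribute.
table = {
    "x1": {"colour": "red", "size": 1},
    "x2": {"colour": "red", "size": 1},
    "x3": {"colour": "blue", "size": 2},
    "x4": {"colour": "blue", "size": 2},
    "x5": {"colour": "red", "size": 2},
}

classes = indiscernibility_classes(table, ["colour", "size"])
target = {"x1", "x3", "x5"}
lower, upper = approximations(classes, target)
```

Here objects x1 and x2 are indiscernible, as are x3 and x4, so the target set can only be described roughly: the lower approximation holds the objects that certainly belong to it, and the upper approximation holds those that possibly belong.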


Rough set theory (RST), data clustering, knowledge-oriented clustering, autonomous



Copyright information

© Institute of Automation, Chinese Academy of Sciences 2008

Authors and Affiliations

  1. Warwick Medical School, Gibbet Hill Campus, University of Warwick, Coventry, UK
  2. Department of Computer Science, University of Hull, Hull, UK
