Similarity measures in formal concept analysis

  • Faris Alqadah
  • Raj Bhatnagar
Article

Abstract

Formal concept analysis (FCA) has been applied successfully in diverse fields such as data mining, conceptual modeling, social networks, software engineering, and the semantic web. One shortcoming of FCA, however, is the large number of concepts that typically arise in dense datasets, which hinders common tasks such as rule generation and visualization. To overcome this shortcoming, it is important to develop formalisms and methods to segment, categorize, and cluster formal concepts. The first step toward these aims is to define suitable similarity and dissimilarity measures for formal concepts. In this paper we propose three similarity measures based on existing set-based measures, and we develop a completely novel zeros-induced measure. Moreover, we formally prove that all the proposed measures are indeed similarity measures and investigate the computational complexity of computing them. Finally, we present an extensive empirical evaluation on real-world data in which the utility and character of each similarity measure are tested.

Keywords

Cluster similarity · Formal concept analysis

Mathematics Subject Classifications (2010)

62H30 · 68T10

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. University of Cincinnati, Cincinnati, USA