Cognitive Computation, Volume 8, Issue 3, pp 442–461

Granular Computing Techniques for Classification and Semantic Characterization of Structured Data

  • Filippo Maria Bianchi
  • Simone Scardapane
  • Antonello Rizzi
  • Aurelio Uncini
  • Alireza Sadeghian

Abstract

We propose a system that automatically synthesizes a classification model together with a set of interpretable decision rules. The rules are defined over a set of symbols that correspond to frequent substructures of the input dataset. Given a preprocessing procedure that maps every input element to a fully labeled graph, the system solves the classification problem in the graph domain. The extracted rules can then semantically characterize the classes of the problem at hand. The structured data considered in this paper are images from classification datasets, which provide an effective proving ground for studying the system's ability to extract interpretable classification rules. For this input domain, the preprocessing procedure is based on a flexible segmentation algorithm whose behavior is controlled by a set of parameters. The core inference engine uses a parametric graph edit dissimilarity measure. A genetic algorithm selects suitable values for these parameters in order to synthesize a classification model based on interpretable rules that maximize the generalization capability of the model. Decision rules are defined over a set of information granules in the graph domain, identified by a frequent substructures miner. We compare the system with two state-of-the-art graph classifiers, highlighting both its main strengths and its limitations.
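The idea of defining rules over symbols that correspond to frequent substructures can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, purely for illustration, that a substructure is a single labeled edge, that two substructures match only when their labels are identical (standing in for the parametric graph edit dissimilarity), and that each graph is then described by a vector of symbol counts. The function names, the `min_support` parameter, and the toy region labels are all hypothetical.

```python
from collections import Counter

# Assumed representation: a graph is a pair (labels, edges), where `labels`
# maps node ids to string labels and `edges` lists undirected node-id pairs.


def edge_symbols(graph):
    """Return the multiset of one-edge substructures as sorted label pairs."""
    labels, edges = graph
    return Counter(tuple(sorted((labels[u], labels[v]))) for u, v in edges)


def mine_frequent_symbols(graphs, min_support):
    """Keep symbols occurring in at least a `min_support` fraction of graphs."""
    support = Counter()
    for g in graphs:
        support.update(set(edge_symbols(g)))  # count each graph at most once
    threshold = min_support * len(graphs)
    return sorted(s for s, n in support.items() if n >= threshold)


def symbol_counts(graph, alphabet):
    """Describe a graph by how often each symbol of the alphabet occurs."""
    counts = edge_symbols(graph)
    return [counts[s] for s in alphabet]


# Toy graphs with hypothetical region labels, as a segmentation step might yield.
graphs = [
    ({0: "sky", 1: "sea", 2: "sand"}, [(0, 1), (1, 2)]),
    ({0: "sky", 1: "sea"}, [(0, 1)]),
    ({0: "sky", 1: "grass", 2: "tree"}, [(0, 1), (1, 2), (0, 2)]),
]
alphabet = mine_frequent_symbols(graphs, min_support=0.5)
embeddings = [symbol_counts(g, alphabet) for g in graphs]
```

A readable rule can then be phrased directly over the mined symbols, for example "the graph contains a sea region adjacent to a sky region". In the system described above, the miner would work on larger substructures with inexact matching, and a genetic algorithm would tune the parameters.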

Keywords

Granular computing; Automatic semantic interpretation; Frequent substructures miner; Graph matching; Graph classification; Evolutionary optimization; Watershed segmentation

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Information Engineering, Electronics, and Telecommunications, SAPIENZA University of Rome, Rome, Italy
  2. Department of Computer Science, Ryerson University, Toronto, Canada
