# Granular Computing Techniques for Classification and Semantic Characterization of Structured Data

## Abstract

We propose a system that automatically synthesizes a classification model and a set of interpretable decision rules defined over a set of symbols, which correspond to frequent substructures of the input dataset. Given a preprocessing procedure that maps every input element into a fully labeled graph, the system solves the classification problem in the graph domain. The extracted rules can then semantically characterize the classes of the problem at hand. The structured data considered in this paper are images from classification datasets: they are an effective proving ground for studying the ability of the system to extract interpretable classification rules. For this particular input domain, the preprocessing procedure is based on a flexible segmentation algorithm whose behavior is defined by a set of parameters. The core inference engine uses a parametric graph edit dissimilarity measure. A genetic algorithm selects suitable values for the parameters, in order to synthesize a classification model based on interpretable rules that maximize the generalization capability of the model. Decision rules are defined over a set of information granules in the graph domain, identified by a frequent substructures miner. We compare the system with two other state-of-the-art graph classifiers, highlighting both its main strengths and its limitations.
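As a rough illustration of how symbols drawn from frequent substructures can turn graphs into vectors that rules are defined over, the following minimal sketch may help. It is not the system described above: it assumes a drastically simplified setting in which a graph is a list of labeled edges and the only substructures considered are single undirected labeled edges. The function names (`edge_symbols`, `mine_frequent_symbols`, `embed`), the `min_support` threshold, and the region labels are all hypothetical and introduced only for this example.

```python
from collections import Counter

# Assumption: a graph is modeled as a list of labeled edges (u_label, v_label).
# This is a simplified stand-in for a fully labeled graph, not the paper's model.


def edge_symbols(graph):
    """Return the set of undirected labeled edges occurring in one graph."""
    return {tuple(sorted(edge)) for edge in graph}


def mine_frequent_symbols(graphs, min_support):
    """Keep the labeled edges that occur in at least min_support graphs."""
    support = Counter()
    for g in graphs:
        # Each graph contributes at most once to the support of a symbol.
        support.update(edge_symbols(g))
    return sorted(s for s, n in support.items() if n >= min_support)


def embed(graph, symbols):
    """Map a graph to a vector of occurrence counts over the symbol alphabet."""
    counts = Counter(tuple(sorted(edge)) for edge in graph)
    return [counts[s] for s in symbols]


# Hypothetical toy data: region labels connected by adjacency.
graphs = [
    [("sky", "sea"), ("sea", "sand")],
    [("sky", "sea"), ("sky", "cloud")],
    [("road", "car"), ("sky", "road")],
]

symbols = mine_frequent_symbols(graphs, min_support=2)
vectors = [embed(g, symbols) for g in graphs]
```

In this toy setting, a rule such as "the symbol `("sea", "sky")` occurs at least once" is readable directly in terms of the mined substructure, which is the kind of interpretability the abstract refers to. A real miner would search for larger subgraphs and would match them inexactly through a graph dissimilarity measure.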

## Keywords

- Granular computing
- Automatic semantic interpretation
- Frequent substructures miner
- Graph matching
- Graph classification
- Evolutionary optimization
- Watershed segmentation

## Notes

## Compliance with Ethical Standards

## Conflict of Interest

Filippo Maria Bianchi, Simone Scardapane, Antonello Rizzi, Aurelio Uncini, and Alireza Sadeghian declare that they have no conflict of interest.

## Informed Consent

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008 (5). Additional informed consent was obtained from all patients for whom identifying information is included in this article.

## Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.
