Integration of Multiple Graph Datasets and Their Linguistic Summaries: An Application to Linked Data

  • Lukasz Strobin
  • Adam Niewiadomski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9692)


This paper presents a novel method of generating and evaluating linguistic summaries of content stored in distributed graph datasets, such as Linked Data. Linguistic summarization is a well-known data mining technique that aims to discover patterns in data and present them in natural language. So far, this method has been researched only for relational databases. In our recent paper we presented how to adapt it to graph datasets. There we solved the problems of subject definition (further extended in this paper), retrieval of the attributes for summarization, and generalization of summarizers and qualifiers. In this paper we extend that research by adapting the proposed method to distributed, interlinked graph datasets, which yields new summaries and therefore new knowledge. We discuss how to follow the different types of equivalence links that may exist between graph datasets. To measure characteristics specific to summaries of distributed graph data, we propose new truth values (degree of subject appropriateness, degree of summarizer order and degree of linkage) and adapt existing ones (degree of covering). We run several experiments on Linked Data and discuss the results.
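For readers unfamiliar with linguistic summaries, the following minimal sketch shows the classic degree of truth, in the sense of Yager, for a summary of the form "Q objects being W are S" (for example, "most cities are large"). It is not the method proposed in this paper, and it does not implement the new truth values named above, whose definitions are not given in the abstract. The membership functions, the `cities` records and all function names are hypothetical and chosen only for illustration.

```python
from typing import Callable, Optional, Sequence

Membership = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership function with support (a, d) and core [b, c]."""
    def mu(x: float) -> float:
        if b <= x <= c:
            return 1.0
        if a < x < b:
            return (x - a) / (b - a)
        if c < x < d:
            return (d - x) / (d - c)
        return 0.0
    return mu


def truth_degree(
    records: Sequence[dict],
    summarizer: Callable[[dict], float],
    quantifier: Membership,
    qualifier: Optional[Callable[[dict], float]] = None,
) -> float:
    """Degree of truth of "Q records (being W) are S" for a relative quantifier Q.

    Computes mu_Q(sum(min(mu_S, mu_W)) / sum(mu_W)); without a qualifier,
    mu_W is 1 for every record, so the ratio is the mean of mu_S.
    """
    s = [summarizer(r) for r in records]
    w = [1.0 if qualifier is None else qualifier(r) for r in records]
    denom = sum(w)
    if denom == 0:
        return 0.0  # no record satisfies the qualifier at all
    ratio = sum(min(si, wi) for si, wi in zip(s, w)) / denom
    return quantifier(ratio)


# Hypothetical data: populations of four cities (not taken from the paper).
cities = [
    {"population": 50_000},
    {"population": 200_000},
    {"population": 600_000},
    {"population": 1_200_000},
]

# Assumed fuzzy sets: "large" for population, "most" as a relative quantifier.
large = trapezoid(100_000, 500_000, float("inf"), float("inf"))
most = trapezoid(0.3, 0.8, 1.0, 1.0)

t = truth_degree(cities, lambda r: large(r["population"]), most)
print(f'T("most cities are large") = {t:.3f}')
```

In a graph setting, each record would be assembled from the properties of a resource, possibly gathered from several interlinked datasets, rather than from a table row; the aggregation step itself stays the same under the assumptions above.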



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Institute of Information Technology, Lodz University of Technology, Lodz, Poland
