Klink-2: Integrating Multiple Web Sources to Generate Semantic Topic Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9366)


The amount of scholarly data available on the web is steadily increasing, enabling different types of analytics that can provide important insights into research activity. To make sense of and explore this large-scale body of knowledge, we need an accurate, comprehensive and up-to-date ontology of research topics. Unfortunately, human-crafted classifications do not satisfy these criteria, as they evolve too slowly and tend to be too coarse-grained. Current automated methods for generating ontologies of research areas also have a number of limitations: i) they do not consider the rich set of indirect statistical and semantic relationships that can help to characterise the relation between two topics (e.g., the fact that two research areas are associated with a similar set of venues or technologies); ii) they do not distinguish between different kinds of hierarchical relationships; and iii) they cannot effectively handle ambiguous topics characterized by a noisy set of relationships. In this paper we present Klink-2, a novel approach which improves on our earlier work on the automatic generation of semantic topic networks and addresses the aforementioned limitations by taking advantage of a variety of knowledge sources available on the web. In particular, Klink-2 analyses networks of research entities (including papers, authors, venues, and technologies) to infer three kinds of semantic relationships between topics. It also identifies ambiguous keywords (e.g., “ontology”) and separates them into the appropriate distinct topics (e.g., “ontology/philosophy” vs. “ontology/semantic web”). Our experimental evaluation shows that the ability of Klink-2 to integrate a large number of data sources and to generate topics with accurate contextual meaning yields significant improvements over other algorithms in terms of both precision and recall.
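The abstract does not say how the similarity between two topics' venue sets is measured. As an illustration of the kind of indirect signal mentioned in limitation i), the sketch below compares topics by the cosine similarity of their venue-count vectors. This is an assumed stand-in, not the metric Klink-2 uses; the function name `cosine_similarity`, the `venues_by_topic` mapping, and all paper counts are hypothetical toy values.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    # Only keys present in both vectors contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data: number of papers per venue for each topic.
venues_by_topic = {
    "semantic web": Counter({"ISWC": 40, "ESWC": 30, "WWW": 10}),
    "linked data": Counter({"ISWC": 25, "ESWC": 20, "LDOW": 15}),
    "metaphysics": Counter({"Mind": 12, "Synthese": 8}),
}

# Topics that publish in overlapping venues score high; disjoint venues score 0.
sim_close = cosine_similarity(
    venues_by_topic["semantic web"], venues_by_topic["linked data"]
)
sim_far = cosine_similarity(
    venues_by_topic["semantic web"], venues_by_topic["metaphysics"]
)
```

In this toy data, "semantic web" and "linked data" share two venues and score close to 1, while "semantic web" and "metaphysics" share none and score 0. The same vector comparison could in principle be applied to other entity types named in the abstract, such as authors or technologies, although how Klink-2 combines such signals is not described here.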


Scholarly data, Ontology learning, Bibliographic data, Scholarly ontologies, Data mining



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Knowledge Media Institute, The Open University, Milton Keynes, UK
