Automatic Classification of Springer Nature Proceedings with Smart Topic Miner

  • Francesco Osborne
  • Angelo Salatino
  • Aliaksandr Birukou
  • Enrico Motta
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9982)

Abstract

Classifying scholarly outputs is crucial to ensure timely access to knowledge. However, this process is typically carried out manually by expert editors, leading to high costs and slow throughput. In this paper we present Smart Topic Miner (STM), a novel solution which uses semantic web technologies to classify scholarly publications on the basis of a very large, automatically generated ontology of research areas. STM was developed to support the Springer Nature Computer Science editorial team in classifying proceedings in the LNCS family. It analyses in real time a set of publications provided by an editor and produces a structured set of topics and a number of Springer Nature Classification tags that best characterise the given input. We describe the architecture of the system and report on an evaluation study conducted with a team of Springer Nature editors. The evaluation showed that STM classifies publications with a high degree of accuracy. These results are very encouraging, and we are currently discussing the next steps required to ensure large-scale deployment within the company.
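
As a rough illustration of the ontology-driven idea described in the abstract, the following Python sketch maps the keywords of a set of publications onto a tiny topic hierarchy and keeps the topics supported by several papers. It is not the STM implementation, whose details are not given on this page: the toy ontology `BROADER`, the function names `ancestors` and `classify`, and the `min_papers` threshold are all hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical toy ontology (an assumption for illustration only):
# each topic maps to its broader (parent) topics.
BROADER = {
    "linked data": ["semantic web"],
    "ontology learning": ["semantic web"],
    "semantic web": ["computer science"],
    "topic models": ["machine learning"],
    "machine learning": ["computer science"],
}


def ancestors(topic):
    """Return every broader topic reachable from `topic`."""
    seen = set()
    stack = [topic]
    while stack:
        for parent in BROADER.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classify(publications, min_papers=2):
    """Count how many publications support each topic.

    A publication supports a topic if one of its keywords matches the
    topic directly or matches one of its narrower topics. Only topics
    supported by at least `min_papers` publications are returned.
    """
    known = set(BROADER) | {p for parents in BROADER.values() for p in parents}
    counts = Counter()
    for keywords in publications:
        topics = set()
        for keyword in keywords:
            keyword = keyword.lower()
            if keyword in known:
                topics.add(keyword)
                topics |= ancestors(keyword)
        counts.update(topics)  # each paper counts once per topic
    return {topic: n for topic, n in counts.items() if n >= min_papers}


if __name__ == "__main__":
    papers = [
        ["Linked Data", "user study"],
        ["ontology learning"],
        ["topic models"],
    ]
    print(classify(papers))
```

In this sketch, propagating each match to its broader topics is what lets a general area such as "semantic web" emerge even when no single paper names it explicitly. A real system would need a far larger ontology and a more careful selection step.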

Keywords

Scholarly data, Ontology learning, Bibliographic data, Scholarly ontologies, Data mining, Conference proceedings, Metadata

Notes

Acknowledgements

We would like to thank the Springer Nature editors for assisting us in the evaluation of STM.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Francesco Osborne (1)
  • Angelo Salatino (1)
  • Aliaksandr Birukou (2)
  • Enrico Motta (1)

  1. Knowledge Media Institute, The Open University, Milton Keynes, UK
  2. Springer-Verlag GmbH, Heidelberg, Germany
