Ontology Forecasting in Scientific Literature: Semantic Concepts Prediction Based on Innovation-Adoption Priors

  • Amparo Elizabeth Cano-Basave
  • Francesco Osborne
  • Angelo Antonio Salatino
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10024)


The ontology engineering research community has focused for many years on supporting the creation, development and evolution of ontologies. Ontology forecasting, which aims at predicting semantic changes in an ontology, is by contrast a new challenge. In this paper, we contribute to this novel endeavour by focusing on the task of forecasting semantic concepts in the research domain. Ontologies representing scientific disciplines contain only research topics that are already popular enough to be selected by human experts or automatic algorithms. They are thus unfit to support tasks that require describing and exploring the forefront of research, such as trend detection and horizon scanning. We address this issue by introducing the Semantic Innovation Forecast (SIF) model, which predicts new concepts of an ontology at time \(t+1\), using only data available at time \(t\). Our approach relies on lexical innovation and adoption information extracted from historical data. We evaluated the SIF model on a very large dataset consisting of over one million scientific papers belonging to the Computer Science domain: the outcomes show that the proposed approach offers a competitive boost in mean average precision-at-ten compared to the baselines when forecasting over 5 years.
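The abstract reports results as mean average precision-at-ten but does not define the metric. The sketch below shows one common formulation, assuming each forecast is a ranked list of candidate concepts without duplicates and that the gold standard is the set of concepts that actually appeared at time \(t+1\). The function names, the normalisation by `min(len(relevant), k)`, and the topic labels are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """AP@k: sum of precision@i at each rank i <= k holding a relevant item,
    divided by the largest number of hits achievable within the top k."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for i, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / i  # precision at rank i
    # Assumed normalisation; other variants divide by the number of hits found.
    return precision_sum / min(len(relevant), k)


def mean_average_precision_at_k(
    runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 10
) -> float:
    """MAP@k: arithmetic mean of AP@k over all (ranking, gold set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


# Hypothetical forecasts for two evaluation years (labels are made up).
runs = [
    (["topic_a", "topic_b", "topic_c", "topic_d"], {"topic_a", "topic_c"}),
    (["topic_x", "topic_y"], {"topic_y"}),
]
map_at_10 = mean_average_precision_at_k(runs, k=10)
```

In this formulation a forecast is rewarded both for including concepts that later emerge and for ranking them near the top, which suits a setting where only the first few predictions are inspected.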


Topic evolution, Ontology forecasting, Ontology evolution, Latent semantics, LDA, Innovation priors, Adoption priors, Scholarly data



We would like to thank Elsevier BV and Springer DE for providing us with access to their large repositories of scholarly data.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Amparo Elizabeth Cano-Basave (1)
  • Francesco Osborne (2), corresponding author
  • Angelo Antonio Salatino (2)
  1. Aston Business School, Aston University, Birmingham, UK
  2. Knowledge Media Institute, Open University, Milton Keynes, UK
