Towards Research Infrastructures that Curate Scientific Information: A Use Case in Life Sciences

  • Markus Stocker
  • Manuel Prinz
  • Fatemeh Rostami
  • Tibor Kempf
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11371)


Scientific information communicated in scholarly literature remains largely inaccessible to machines. The global scientific knowledge base is little more than a collection of (digital) documents. The main reason is that the document is the principal form of communication and, since underlying data, software and other materials mostly remain unpublished, the scholarly article is essentially the only form used to communicate scientific information. Based on a use case in life sciences, we argue that virtual research environments and semantic technologies are transforming the capability of research infrastructures to systematically acquire and curate machine-readable scientific information communicated in scholarly literature.


Scientific information, Scholarly communication, Knowledge representation, Virtual research environments, Research infrastructures, Knowledge infrastructures



We thank the TIB Leibniz Information Centre for Science and Technology for supporting this project and our colleagues and the participants of the project’s workshop series for their contributions.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. TIB Leibniz Information Centre for Science and Technology, Hannover, Germany
  2. PANGAEA Data Publisher for Earth & Environmental Science, MARUM Center for Marine Environmental Sciences, Bremen, Germany
  3. Division of Molecular and Translational Cardiology, Department of Cardiology and Angiology, Hannover Medical School, Hannover, Germany
