One Year of the OpenCitations Corpus

Releasing RDF-Based Scholarly Citation Data into the Public Domain
  • Silvio Peroni
  • David Shotton
  • Fabio Vitali
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10588)


Reference lists from academic articles are core elements of scholarly communication that permit the attribution of credit and integrate our independent research endeavours. Hitherto, however, they have not been freely available in an appropriate machine-readable format such as RDF and in aggregate for use by scholars. To address this issue, one year ago we started ingesting citation data from the Open Access literature into the OpenCitations Corpus (OCC), creating an RDF dataset of scholarly citation data that is open to all. In this paper we introduce the OCC and we discuss its outcomes and uses after the first year of life.


Keywords: OCC, Open citation data, OpenCitations, OpenCitations corpus, Public domain, SPAR ontologies



We would like to thank all the reviewers for their useful comments and for the desiderata they suggested for inclusion in the OpenCitations Corpus. Their suggestions have already been added as issues in the GitHub repository (see issues 11–19) and will be taken into consideration as future developments of the resource.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. DASPLab, DISI, University of Bologna, Bologna, Italy
  2. Oxford e-Research Centre, University of Oxford, Oxford, UK
