Semantic Annotation of Scholarly Documents and Citations

  • Paolo Ciancarini
  • Angelo Di Iorio
  • Andrea Giovanni Nuzzolese
  • Silvio Peroni
  • Fabio Vitali
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8249)


Scholarly publishing is in the middle of a revolution based on the use of Web-related technologies as a medium of communication. In this paper we describe our ongoing study of semantic publishing and automatic annotation of scholarly documents, presenting several models and tools for the automatic annotation of structural and semantic components of documents. In particular, we focus on citations and their automatic classification obtained by CiTalO, a framework that combines ontology learning techniques with NLP techniques.


CiTO, PDF jailbreaking, Semantic Web, citation networks, citation patterns, semantic annotations, semantic publishing





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Paolo Ciancarini (1, 2)
  • Angelo Di Iorio (1)
  • Andrea Giovanni Nuzzolese (1, 2)
  • Silvio Peroni (1, 2)
  • Fabio Vitali (1)

  1. Department of Computer Science and Engineering, University of Bologna, Italy
  2. STLab-ISTC, Consiglio Nazionale delle Ricerche, Italy
