Abstract
Scholarly publishing has seen an ever-increasing interest in Linked Open Data (LOD). However, most existing datasets are designed as flat translations of legacy data sources into RDF. Although that is a crucial step, a lot of useful information is still not expressed in RDF, and humans are still required to infer relevant knowledge by reading and making sense of texts. Examples are the reasons why authors cite other papers, the rhetorical structure of scientific discourse, bibliometric measures, provenance information, and so on. In this paper we introduce the Semantic Lancet Project, whose goal is to make available a LOD dataset that includes the formalisation of some useful knowledge hidden within the textual content of papers. We have developed a toolchain for reengineering and enhancing data extracted from some publishers' legacy repositories. Finally, we show how these data are immediately useful in helping humans address relevant tasks, such as data browsing, expert finding, related work finding, and identification of data inconsistencies.
Notes
- 5.
vn.role:Theme and vn.role:Agent identify the theme and the agent of an event, respectively.
- 6.
The Scholarlydata dataset is the reference linked dataset of the Semantic Web community about papers, people, organisations and events. It is available at http://www.scholarlydata.org/.
- 7.
An in-text reference pointer is the entity in the body of a citing work that denotes a bibliographic reference in the reference list, e.g. “[3]” and “(Handler et al. 2012)”.
- 8.
Available at http://two.eelst.cs.unibo.it/data and http://two.eelst.cs.unibo.it/prov, respectively.
Acknowledgements
This paper was supported by the MIUR PRIN 2016 GAUSS Project. We would like to thank Elsevier for granting access to the Scopus and ScienceDirect APIs.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Iorio, A.D., Nuzzolese, A.G., Peroni, S., Poggi, F., Vitali, F., Ciancarini, P. (2017). Analysing and Discovering Semantic Relations in Scholarly Data. In: Grana, C., Baraldi, L. (eds) Digital Libraries and Archives. IRCDL 2017. Communications in Computer and Information Science, vol 733. Springer, Cham. https://doi.org/10.1007/978-3-319-68130-6_1
DOI: https://doi.org/10.1007/978-3-319-68130-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68129-0
Online ISBN: 978-3-319-68130-6
eBook Packages: Computer Science, Computer Science (R0)