Analysing and Discovering Semantic Relations in Scholarly Data

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 733)

Abstract

Scholarly publishing has seen an ever-increasing interest in Linked Open Data (LOD). However, most existing datasets are designed as flat translations of legacy data sources into RDF. Although that is a crucial first step, a lot of useful information is still not expressed in RDF, and humans are still required to infer relevant knowledge by reading and making sense of texts. Examples include the reasons why authors cite other papers, the rhetorical structure of scientific discourse, bibliometric measures, and provenance information. In this paper we introduce the Semantic Lancet Project, whose goal is to make available a LOD dataset that formalises some of the useful knowledge hidden within the textual content of papers. We have developed a toolchain for reengineering and enhancing data extracted from some publisher’s legacy repositories. Finally, we show how these data are immediately useful in helping humans address relevant tasks, such as data browsing, expert finding, related work finding, and identification of data inconsistencies.

Notes

  1. http://www.semanticlancet.eu.

  2. http://www.developers.elsevier.com/devcms/content-apis.

  3. http://www.elsevier.com/about/policies/content-mining-policies.

  4. FRED: http://wit.istc.cnr.it/stlab-tools/fred.

  5. vn.role:Theme and vn.role:Agent identify the theme and the agent of an event, respectively.

  6. The Scholarlydata dataset is the reference linked dataset of the Semantic Web community about papers, people, organisations and events. It is available at http://www.scholarlydata.org/.

  7. An in-text reference pointer is the entity in the body of a citing work that denotes a bibliographic reference in the reference list, e.g. “[3]” and “(Handler et al. 2012)”.

  8. Available at http://two.eelst.cs.unibo.it/data and http://two.eelst.cs.unibo.it/prov, respectively.

  9. http://eelst.cs.unibo.it:8089/.

  10. http://www.semanticlancet.eu/citationexplorer.

  11. http://www.semanticlancet.eu/abstractfinder.

  12. http://www.semanticlancet.eu/reporter.

  13. http://data.nature.com.

  14. http://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/.

  15. http://dblp.l3s.de/dblp++.php.

  16. http://ontoware.org/swrc/.

  17. http://semantic-web-journal.com:3030/.

Acknowledgements

This paper was supported by the MIUR PRIN 2016 GAUSS Project. We would like to thank Elsevier for granting access to the Scopus and ScienceDirect APIs.

Author information

Corresponding author

Correspondence to Francesco Poggi.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Iorio, A.D., Nuzzolese, A.G., Peroni, S., Poggi, F., Vitali, F., Ciancarini, P. (2017). Analysing and Discovering Semantic Relations in Scholarly Data. In: Grana, C., Baraldi, L. (eds) Digital Libraries and Archives. IRCDL 2017. Communications in Computer and Information Science, vol 733. Springer, Cham. https://doi.org/10.1007/978-3-319-68130-6_1

  • DOI: https://doi.org/10.1007/978-3-319-68130-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68129-0

  • Online ISBN: 978-3-319-68130-6

  • eBook Packages: Computer Science, Computer Science (R0)
