Linked Data-Based NLP Workflows

  • Philipp Cimiano
  • Christian Chiarcos
  • John P. McCrae
  • Jorge Gracia
Chapter

Abstract

In this chapter we describe principles and architectures that support the development of NLP workflows and pipelines based on linked data technology. The benefit of such workflows is that they build on an open set of data models and Web technologies and can be implemented with standard functionality. They require no additional frameworks and thus avoid the lock-in or dependence that comes with frameworks such as UIMA or GATE. We first describe how NLP workflows can be implemented by relying on the Natural Language Processing Interchange Format (NIF), and give examples of how a POS tagger and a dependency parser can be implemented as NIF-based web services. We then describe Teanga, a recent platform for NLP integration that exploits Docker containers to implement NLP workflows. Finally, we describe LAPPS Grid, an open-source platform for NLP tools that builds on JSON-LD.
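
To make the idea of a NIF-based POS tagger more concrete, the following is a minimal sketch and not the chapter's own implementation. It assumes the NIF 2.0 core vocabulary (`nif:Context`, `nif:Word`, `nif:isString`, `nif:beginIndex`, `nif:endIndex`, `nif:anchorOf`, `nif:referenceContext`, `nif:posTag`) and RFC 5147 style `#char=begin,end` fragment URIs. The function `annotate`, the lexicon `TOY_LEXICON` and the base URI `http://example.org/doc1` are hypothetical names introduced for illustration; the lookup table merely stands in for a real tagger.

```python
import re

NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

# Hypothetical toy lexicon standing in for a real POS tagger (assumption).
TOY_LEXICON = {"linked": "JJ", "data": "NN", "helps": "VBZ"}


def escape(text):
    """Escape a string so it can be written as an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def annotate(text, base_uri):
    """Return N-Triples lines describing `text` and its tokens with NIF terms."""

    def uri(begin, end):
        # Offset-based URI for the substring text[begin:end].
        return f"<{base_uri}#char={begin},{end}>"

    def literal(value, datatype=None):
        suffix = f"^^<{XSD}{datatype}>" if datatype else ""
        return f'"{escape(str(value))}"{suffix}'

    # The context resource represents the whole input string.
    context = uri(0, len(text))
    triples = [
        (context, RDF_TYPE, f"<{NIF}Context>"),
        (context, f"<{NIF}isString>", literal(text)),
    ]
    # Naive tokenisation on word characters; a real service would use a tokenizer.
    for match in re.finditer(r"\w+", text):
        begin, end = match.span()
        word = uri(begin, end)
        tag = TOY_LEXICON.get(match.group().lower(), "UNK")
        triples += [
            (word, RDF_TYPE, f"<{NIF}Word>"),
            (word, f"<{NIF}referenceContext>", context),
            (word, f"<{NIF}beginIndex>", literal(begin, "nonNegativeInteger")),
            (word, f"<{NIF}endIndex>", literal(end, "nonNegativeInteger")),
            (word, f"<{NIF}anchorOf>", literal(match.group())),
            (word, f"<{NIF}posTag>", literal(tag)),
        ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    for line in annotate("Linked data helps", "http://example.org/doc1"):
        print(line)
```

Because the output is plain RDF, a downstream component such as a dependency parser could in principle read these triples and add its own annotations to the same token URIs without any shared framework. Exposing `annotate` behind an HTTP endpoint is left out here, since the chapter's actual service interface is not shown in this abstract.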

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Semantic Computing Group, Bielefeld University, Bielefeld, Germany
  2. Angewandte Computerlinguistik, Goethe-University, Frankfurt am Main, Germany
  3. Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland
  4. Aragon Institute of Engineering Research (I3A), University of Zaragoza, Zaragoza, Spain
