Linked-Data Aware URI Schemes for Referencing Text Fragments
The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources, and annotations. The motivation behind NIF is to allow NLP tools to exchange annotations about text documents in RDF. The main prerequisite is therefore that parts of the documents (i.e., strings) are referenceable by URIs, so that they can be used as subjects in RDF statements. In this paper, we present two NIF URI schemes for different use cases and evaluate them experimentally by benchmarking their stability in a Web annotation scenario. Additionally, we compare them with other available schemes for addressing text with URIs. The String Ontology, which is the basis for NIF, fixes the referent of each URI (i.e., a string in a given text) unambiguously for machines and thus enables the creation of heterogeneous, distributed, and loosely coupled NLP applications that use the Web as an integration platform.
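To make the prerequisite concrete, the following is a minimal sketch of addressing a substring with a URI and using that URI as the subject of an RDF statement. It does not reproduce the two NIF URI schemes, which are not spelled out here; instead it assumes an offset-based fragment in the style of the `char=` scheme of RFC 5147. The document URI, the predicate `http://example.org/vocab#refersTo`, and the helper names `char_range_uri` and `resolve` are hypothetical and chosen only for illustration.

```python
def char_range_uri(document_uri: str, text: str, start: int, end: int) -> str:
    """Build a URI for text[start:end] using an RFC 5147-style '#char=start,end' fragment.

    Positions are counted between characters, starting at 0, so they line up
    with Python slice indices.
    """
    if not 0 <= start <= end <= len(text):
        raise ValueError("offsets are outside the text")
    return f"{document_uri}#char={start},{end}"


def resolve(text: str, uri: str) -> str:
    """Recover the referenced string from the text and a '#char=start,end' URI."""
    fragment = uri.rsplit("#char=", 1)[1]
    start, end = (int(part) for part in fragment.split(","))
    return text[start:end]


text = "Berlin is the capital of Germany."
doc = "http://example.org/doc.txt"  # hypothetical document URI

uri = char_range_uri(doc, text, 0, 6)

# The string URI can now act as the subject of an RDF statement
# (written as one N-Triples line; the predicate is a made-up example).
triple = f"<{uri}> <http://example.org/vocab#refersTo> <http://dbpedia.org/resource/Berlin> ."

# Offsets are fragile: prepending text changes what the same URI refers to.
edited = "Today, " + text
```

With an offset-based fragment like this, `resolve(text, uri)` returns the annotated string, but the same URI resolved against `edited` points at different characters. This sensitivity to document changes is the kind of property a stability benchmark would measure.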