Semi-automatic Knowledge Acquisition through CODA

  • Manuel Fiorelli
  • Riccardo Gambella
  • Maria Teresa Pazienza
  • Armando Stellato
  • Andrea Turbati
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8482)

Abstract

In this paper, we illustrate the benefits of adopting CODA (Computer-aided Ontology Development Architecture) for the semi-automatic acquisition of knowledge from unstructured information. Based on UIMA for the orchestration of analytics, CODA promotes the reuse of independently developed information extractors, while providing dedicated capabilities for projecting their output as RDF triples conforming to a user-provided vocabulary. CODA introduces a clear workflow for coordinating concurrently working teams through the incremental definition of a limited number of shared interfaces. In the proposed semi-automatic knowledge acquisition process, humans can validate the automatically produced triples, or refine them to increase their relevance to a specific domain model. An experimental user interface aims to make human involvement more efficient and effective. For instance, candidate refinements are suggested based on metadata about the triples to be refined and on the knowledge already assessed in the target semantic repository.
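
The projection and validation steps described above can be pictured with a minimal Python sketch. It does not use CODA, UIMA, or PEARL and does not reflect their APIs; the `Annotation` class, the `project` and `validate` functions, the `example.org` namespaces, and the `label` property are all hypothetical names introduced only for illustration. The sketch assumes an extractor emits typed text spans, a mapping assigns those types to classes of a user-provided vocabulary, and a reviewer callback decides which candidate triples are kept.

```python
from dataclasses import dataclass

# Hypothetical namespaces, chosen only for illustration.
VOCAB = "http://example.org/vocab#"
DATA = "http://example.org/data/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Annotation:
    """Stand-in for the output of an information extractor."""
    kind: str  # annotation type assigned by the extractor, e.g. "Person"
    text: str  # text span covered by the annotation


def project(annotation, type_map):
    """Turn one annotation into (subject, predicate, object) triples.

    type_map maps extractor annotation types to classes of the
    user-provided vocabulary; unmapped annotations yield no triples.
    """
    if annotation.kind not in type_map:
        return []
    label = annotation.text.strip()
    subject = DATA + label.replace(" ", "_")
    return [
        (subject, RDF_TYPE, VOCAB + type_map[annotation.kind]),
        (subject, VOCAB + "label", label),  # object is a plain literal
    ]


def validate(triples, accept):
    """Keep only the triples a human reviewer (the accept callback) approves."""
    return [t for t in triples if accept(t)]


annotations = [Annotation("Person", "Ada Lovelace"), Annotation("Date", "1843")]
candidates = [t for a in annotations for t in project(a, {"Person": "Agent"})]
accepted = validate(candidates, lambda t: True)
```

In this sketch the "Date" annotation produces no triples because the mapping does not cover it, and the reviewer callback accepts everything; a real reviewer would reject or refine individual candidates before they reach the target repository.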

Keywords

Human-Computer Interaction; Ontology Engineering; Ontology Population; Text Analytics; UIMA

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Manuel Fiorelli (1)
  • Riccardo Gambella (1)
  • Maria Teresa Pazienza (1)
  • Armando Stellato (1)
  • Andrea Turbati (1)

  1. ART Research Group, Dept. of Enterprise Engineering (DII), University of Rome, Rome, Italy
