An Architecture for Data and Knowledge Acquisition for the Semantic Web: The AGROVOC Use Case

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7567))

Abstract

We are surrounded by ever-growing volumes of unstructured and weakly structured information, and for a human being, domain expert or not, it is nearly impossible to read, understand and categorize such information in a reasonable amount of time. Moreover, different categories of users have different expectations: end users need easy-to-use tools and services for specific tasks; knowledge engineers require robust tools for knowledge acquisition, knowledge categorization and the development of semantic resources; and developers of semantic applications demand flexible frameworks for fast, easy and standardized development of complex applications. This work is an experience report on the use of the CODA framework for rapid prototyping and deployment of knowledge acquisition systems for RDF. The system integrates independent NLP tools and custom libraries complying with UIMA standards. In our experiment, a document set was processed to populate the AGROVOC thesaurus with two new relationships.
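As a rough illustration of the general idea of turning relations found in text into RDF statements, the following Python sketch serializes relation annotations as N-Triples. It is not the CODA or UIMA API, and it does not reproduce the paper's method: the class `RelationAnnotation`, the functions `to_ntriple` and `project`, the `example.org` namespaces and the relation name `exampleRelation` are all hypothetical placeholders, since the abstract does not name the two relationships that were added.

```python
from dataclasses import dataclass

# Hypothetical namespaces, used only for illustration.
# They are NOT the real AGROVOC namespaces.
CONCEPT_NS = "http://example.org/concept/"
RELATION_NS = "http://example.org/relation/"


@dataclass(frozen=True)
class RelationAnnotation:
    """A relation found in text: two concept identifiers and a relation name."""
    subject_id: str
    relation: str
    object_id: str


def to_ntriple(ann: RelationAnnotation) -> str:
    """Serialize one annotation as a single N-Triples statement."""
    return (
        f"<{CONCEPT_NS}{ann.subject_id}> "
        f"<{RELATION_NS}{ann.relation}> "
        f"<{CONCEPT_NS}{ann.object_id}> ."
    )


def project(annotations):
    """Turn annotations into triples, dropping duplicates but keeping order."""
    seen = set()
    triples = []
    for ann in annotations:
        if ann not in seen:
            seen.add(ann)
            triples.append(to_ntriple(ann))
    return triples


if __name__ == "__main__":
    # Placeholder annotations; a real pipeline would obtain these from NLP tools.
    found = [
        RelationAnnotation("c_1", "exampleRelation", "c_2"),
        RelationAnnotation("c_1", "exampleRelation", "c_2"),  # repeated mention
        RelationAnnotation("c_3", "exampleRelation", "c_1"),
    ]
    for line in project(found):
        print(line)
```

Deduplicating before serialization reflects the assumption that the same relation may be mentioned in several documents but should be asserted only once in the target thesaurus.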

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pazienza, M.T., Stellato, A., Tudorache, A.G., Turbati, A., Vagnoni, F. (2012). An Architecture for Data and Knowledge Acquisition for the Semantic Web: The AGROVOC Use Case. In: Herrero, P., Panetto, H., Meersman, R., Dillon, T. (eds) On the Move to Meaningful Internet Systems: OTM 2012 Workshops. OTM 2012. Lecture Notes in Computer Science, vol 7567. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33618-8_58

  • DOI: https://doi.org/10.1007/978-3-642-33618-8_58

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33617-1

  • Online ISBN: 978-3-642-33618-8

  • eBook Packages: Computer Science (R0)
