An Architecture for Data and Knowledge Acquisition for the Semantic Web: The AGROVOC Use Case

Pazienza, Maria Teresa; Stellato, Armando; Tudorache, Alexandra Gabriela; Turbati, Andrea; Vagnoni, Flaminia

doi:10.1007/978-3-642-33618-8_58

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7567)

Abstract

We are surrounded by ever-growing volumes of unstructured and weakly structured information, and for a human being, domain expert or not, it is nearly impossible to read, understand and categorize such information in a reasonable amount of time. Moreover, different user categories have different expectations: end users need easy-to-use tools and services for specific tasks; knowledge engineers require robust tools for knowledge acquisition, knowledge categorization and semantic resource development; and semantic application developers demand flexible frameworks for fast, easy and standardized development of complex applications. This work is an experience report on the use of the CODA framework for rapid prototyping and deployment of knowledge acquisition systems for RDF. The system integrates independent NLP tools and custom libraries complying with UIMA standards. For our experiment, a document set was processed to populate the AGROVOC thesaurus with two new relationships.
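
As a rough illustration of the kind of step the abstract describes (turning annotations produced by NLP components into RDF statements), the following minimal Python sketch may help. It is not the CODA or UIMA API, and it is not the authors' method: the `RelationAnnotation` class, the `project` function, the `http://example.org/agro#` namespace and the `relatedTo` property are all hypothetical names introduced only for this example. The abstract does not name the two relationships added to AGROVOC, so none is assumed here.

```python
from dataclasses import dataclass

# Hypothetical namespace; real AGROVOC URIs are not assumed here.
EX = "http://example.org/agro#"


@dataclass(frozen=True)
class RelationAnnotation:
    """Stand-in for a relation annotation emitted by an upstream NLP step."""
    subject_label: str
    relation: str  # hypothetical property name, e.g. "relatedTo"
    object_label: str


def to_uri(label: str) -> str:
    """Normalize a text label into a URI under the hypothetical namespace."""
    return EX + label.strip().lower().replace(" ", "_")


def project(annotations):
    """Map annotations to a de-duplicated, sorted list of N-Triples lines."""
    triples = {
        f"<{to_uri(a.subject_label)}> <{EX}{a.relation}> <{to_uri(a.object_label)}> ."
        for a in annotations
    }
    return sorted(triples)


if __name__ == "__main__":
    found = [
        RelationAnnotation("Term A", "relatedTo", "Term B"),
        RelationAnnotation("term a ", "relatedTo", "term b"),  # same pair, different surface form
    ]
    for line in project(found):
        print(line)
```

Normalizing labels before building URIs means that the same pair of terms found in different surface forms yields a single triple, which is one simple way to avoid duplicate statements when populating a thesaurus from many documents.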

Author information

Authors and Affiliations

University of Rome Tor Vergata, Via del Politecnico 1, 00133, Rome, Italy
Maria Teresa Pazienza, Armando Stellato, Alexandra Gabriela Tudorache, Andrea Turbati & Flaminia Vagnoni

Editor information

Editors and Affiliations

Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo S/N, Boadilla del Monte, 28660, Madrid, Spain
Pilar Herrero
Research Centre for Automatic Control, School of Engineering in Information Technology, University of Lorraine, CNRS, Campus scientifique, BP 70239, 54506, Vandoeuvre-les-Nancy, France
Hervé Panetto
Department of Computer Science, Semantic Technology and Application Research Laboratory (STARLab), Vrije Universiteit Brussel, Building G-10, Pleinlaan 2, 1050, Brussels, Belgium
Robert Meersman
La Trobe University, Melbourne, VIC, Australia
Tharam Dillon

About this paper

Cite this paper

Pazienza, M.T., Stellato, A., Tudorache, A.G., Turbati, A., Vagnoni, F. (2012). An Architecture for Data and Knowledge Acquisition for the Semantic Web: The AGROVOC Use Case. In: Herrero, P., Panetto, H., Meersman, R., Dillon, T. (eds) On the Move to Meaningful Internet Systems: OTM 2012 Workshops. OTM 2012. Lecture Notes in Computer Science, vol 7567. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33618-8_58

DOI: https://doi.org/10.1007/978-3-642-33618-8_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33617-1
Online ISBN: 978-3-642-33618-8
eBook Packages: Computer Science, Computer Science (R0)
