Automatic Metadata Generation in an Archaeological Digital Library: Semantic Annotation of Grey Literature

  • Chapter
Computational Linguistics

Part of the book series: Studies in Computational Intelligence (SCI, volume 458)

Abstract

This paper discusses the automatic generation of rich metadata from excavation reports in the Archaeology Data Service library of grey literature (OASIS). The work is part of the STAR project, carried out in collaboration with English Heritage. An extension of the CIDOC CRM ontology for the archaeological domain (CRM-EH) acts as the core ontology. Rich metadata is automatically extracted from grey literature, directed by the CRM, via a three-phase process of semantic enrichment that employs the GATE toolkit augmented with bespoke rules and knowledge resources. The paper demonstrates the potential of combining knowledge-based resources (ontologies and thesauri) in information extraction, and presents techniques for delivering the automatically extracted metadata both as XML annotations coupled with the grey literature reports and as RDF graphs decoupled from content. Examples from two consuming applications are discussed: the Andronikos web portal, which serves the annotated XML files for visual inspection, and the STAR project research demonstrator, which offers unified search across archaeological excavation data and grey literature via the core ontology CRM-EH.
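
The difference between metadata coupled with the report text and metadata decoupled from it can be illustrated with a minimal sketch. It is not the chapter's GATE pipeline: the three-entry gazetteer, the element names (`context`, `find`, `period`), the `mentions` property and all `http://example.org/` URIs are hypothetical stand-ins, and simple term lookup replaces the ontology-directed rules the abstract describes.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical mini-gazetteer: term -> (annotation type, concept URI).
# A real system would draw terms from domain thesauri; these are made up.
GAZETTEER = {
    "ditch": ("context", "http://example.org/concept/ditch"),
    "pottery": ("find", "http://example.org/concept/pottery"),
    "roman": ("period", "http://example.org/concept/roman"),
}

PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, GAZETTEER)) + r")\b", re.IGNORECASE
)


def annotate(text):
    """Return (start, end, type, uri) for every gazetteer term found in text."""
    spans = []
    for match in PATTERN.finditer(text):
        kind, uri = GAZETTEER[match.group(1).lower()]
        spans.append((match.start(), match.end(), kind, uri))
    return spans


def to_inline_xml(text, spans):
    """Coupled delivery: wrap each matched span in an XML element in place."""
    parts, pos = [], 0
    for start, end, kind, uri in spans:
        parts.append(escape(text[pos:start]))
        parts.append(
            '<%s ref="%s">%s</%s>' % (kind, uri, escape(text[start:end]), kind)
        )
        pos = end
    parts.append(escape(text[pos:]))
    return "".join(parts)


def to_ntriples(doc_uri, spans):
    """Decoupled delivery: one N-Triples statement per annotation, no text."""
    predicate = "http://example.org/vocab/mentions"  # hypothetical property
    return ["<%s> <%s> <%s> ." % (doc_uri, predicate, uri)
            for _, _, _, uri in spans]


sample = "A Roman ditch contained pottery."
found = annotate(sample)
print(to_inline_xml(sample, found))
print("\n".join(to_ntriples("http://example.org/report/1", found)))
```

The inline XML keeps every annotation attached to the sentence it came from, which suits visual inspection of a report. The triples carry only the report identifier and the concept references, so they can be loaded into a store and queried alongside other data without the report text.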

Author information

Corresponding author

Correspondence to Andreas Vlachidis.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Vlachidis, A., Binding, C., May, K., Tudhope, D. (2013). Automatic Metadata Generation in an Archaeological Digital Library: Semantic Annotation of Grey Literature. In: Przepiórkowski, A., Piasecki, M., Jassem, K., Fuglewicz, P. (eds) Computational Linguistics. Studies in Computational Intelligence, vol 458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34399-5_10

  • DOI: https://doi.org/10.1007/978-3-642-34399-5_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34398-8

  • Online ISBN: 978-3-642-34399-5

  • eBook Packages: Engineering, Engineering (R0)
