Knowledge-Based Named Entity Recognition of Archaeological Concepts in Dutch

Vlachidis, Andreas; Tudhope, Douglas; Wansleeben, Milco

doi:10.1007/978-3-030-71903-6_6

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1355)

Included in the following conference series:

Research Conference on Metadata and Semantics Research

Abstract

Advances in Natural Language Processing (NLP) allow the process of deriving information from large volumes of text to be automated, making text-based resources more discoverable and useful. This paper turns its attention to one of the most important, but traditionally difficult to access, resources in archaeology: the largely unpublished reports generated by commercial or “rescue” archaeology, commonly known as “grey literature”. It presents the development and evaluation of a Named Entity Recognition system for Dutch archaeological grey literature, targeted at extracting mentions of artefacts, archaeological features, materials, places and time entities. The role of domain vocabulary is discussed in the development of a Knowledge Organization System (KOS)-driven NLP pipeline, which is evaluated against a human-annotated Gold Standard corpus.

Notes

1. Simple Knowledge Organization System (SKOS) is a Semantic Web format and a W3C recommendation designed for the representation of thesauri, classification schemes, taxonomies, subject heading systems, or any other type of structured controlled vocabulary: https://www.w3.org/2004/02/skos/
2. http://openskos.org/api/collections/rce:EGT.html
3. https://opennlp.apache.org/
4. https://snowballstem.org/
5. https://gate.ac.uk/sale/tao/splitch13.html#x18-33400013.8
6. https://cloud.gate.ac.uk/shopfront/displayItem/archaeology-ner-nl

Acknowledgments

This work was supported by the European Commission under the Community’s Seventh Framework Programme, contract no. FP7-INFRASTRUCTURES-2012-1-313193 (the ARIADNE project). Thanks are due to the ARIADNE project partners from Leiden University who helped with the definition of the Gold Standard.

Author information

Authors and Affiliations

Department of Information Studies, UCL, Gower Street, London, WC1E 6BT, UK
Andreas Vlachidis
School of Computing and Mathematics, University of South Wales, Newport, CF37 1DL, UK
Douglas Tudhope
Faculty of Archaeology, Leiden University, Einsteinweg 2, 2333 CC, Leiden, The Netherlands
Milco Wansleeben

Corresponding author

Correspondence to Andreas Vlachidis.

Editor information

Editors and Affiliations

International Hellenic University, Thessaloniki, Greece
Emmanouel Garoufallou
Complutense University of Madrid, Madrid, Spain
María-Antonia Ovalle-Perandones

About this paper

Cite this paper

Vlachidis, A., Tudhope, D., Wansleeben, M. (2021). Knowledge-Based Named Entity Recognition of Archaeological Concepts in Dutch. In: Garoufallou, E., Ovalle-Perandones, MA. (eds) Metadata and Semantic Research. MTSR 2020. Communications in Computer and Information Science, vol 1355. Springer, Cham. https://doi.org/10.1007/978-3-030-71903-6_6

DOI: https://doi.org/10.1007/978-3-030-71903-6_6
Published: 18 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71902-9
Online ISBN: 978-3-030-71903-6
eBook Packages: Computer Science, Computer Science (R0)
