Text Mining, pp 201-219

Multi-perspective Event Detection in Texts Documenting the 1944 Battle of Arnhem

  • Marten Düring
  • Antal van den Bosch
Part of the Theory and Applications of Natural Language Processing book series (NLP)


We present a pilot project that combines the respective strengths of research practices in history, memory studies, and computational linguistics. We propose a proof-of-concept workflow for the semi-automatic detection and linking of narratives that refer to the same event, based on their references to location names. The workflow relies on the interaction between human evaluation, entity extraction, mapping, and network visualization techniques. We work with 83 narratives and reports surrounding the Battle of Arnhem in 1944. The liberation of the Netherlands led to frequent encounters between civilians and soldiers in the war zones. We seek multi-perspective descriptions of these interactions, which were marked by a high degree of uncertainty, differing anticipations, and sometimes violence. Our proof-of-concept study shows that standard named-entity recognition is not sufficient: to capture the scenes that connect multi-perspective narratives, we need to develop fine-grained detection of street names.
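The linking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the suffix list, the regular expression, and the function names `extract_streets` and `link_narratives` are assumptions made for illustration. The sketch finds street-like names with a simple pattern and links every pair of narratives that mention the same street.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical list of common Dutch street-name suffixes (an assumption,
# not taken from the chapter).
STREET_SUFFIXES = ("straat", "weg", "laan", "plein", "singel", "kade", "dijk")

# A capitalised word that ends in one of the suffixes, e.g. "Utrechtseweg".
# This is a rough heuristic and will also match some non-street words.
STREET_RE = re.compile(
    r"\b[A-Z][A-Za-z]*(?:" + "|".join(STREET_SUFFIXES) + r")\b"
)


def extract_streets(text):
    """Return the set of street-like names found in a text."""
    return set(STREET_RE.findall(text))


def link_narratives(narratives):
    """Map each pair of narrative ids to the street names both mention."""
    by_street = defaultdict(set)
    for doc_id, text in narratives.items():
        for street in extract_streets(text):
            by_street[street].add(doc_id)

    links = defaultdict(set)
    for street, doc_ids in by_street.items():
        # Every pair of narratives sharing a street becomes a candidate link.
        for a, b in combinations(sorted(doc_ids), 2):
            links[(a, b)].add(street)
    return dict(links)


# Invented example sentences, used only to show the data flow.
EXAMPLE_NARRATIVES = {
    "civilian_diary": "We hid in the cellar while tanks rolled down the Utrechtseweg.",
    "soldier_report": "Our platoon advanced along the Utrechtseweg towards the bridge.",
    "police_report": "Shots were heard near the Weverstraat in the evening.",
}

if __name__ == "__main__":
    for pair, streets in link_narratives(EXAMPLE_NARRATIVES).items():
        print(pair, sorted(streets))
```

Candidate links produced this way would still need human evaluation, since a shared street name does not by itself mean that two narratives describe the same event.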


Regular Expression; Historical Research; Historical Source; Entity Recognition; Police Report
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



The authors wish to thank the anonymous reviewers for their comments and suggestions. We are grateful to Berry de Reus and Jan Hovers of the Airborne Museum ‘Hartenstein’, Oosterbeek, for their willingness to give us access to their museum’s digital resources.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. University of North Carolina, Chapel Hill, USA
  2. Centre for Language Studies, Radboud University, Nijmegen, The Netherlands