Integrating Wiki Systems, Natural Language Processing, and Semantic Technologies for Cultural Heritage Data Management

  • René Witte
  • Thomas Kappler
  • Ralf Krestel
  • Peter C. Lockemann
Conference paper
Part of the Theory and Applications of Natural Language Processing book series (NLP)


Modern documents can easily be structured and augmented to have the characteristics of a semantic knowledge base. Many older documents also hold a trove of knowledge that deserves to be organized as such a knowledge base. In this chapter, we show that modern semantic technologies offer the means to make these heritage documents accessible by transforming them into a semantic knowledge base. Using techniques from natural language processing and semantic computing, we automatically populate an ontology. Additionally, all content is made accessible through a user-friendly wiki interface that combines the original text with NLP-derived metadata and adds annotation capabilities for collaborative use. All these functions are combined into a single, cohesive system architecture that addresses the different requirements of end users, software engineering aspects, and knowledge discovery paradigms. The ideas were implemented and tested with a volume from the historic Encyclopedia of Architecture and a number of different user groups.


semantic wikis, ontology population, ontology queries, automatic summarization, index generation, web services, Handbuch der Architektur





Praharshana Perera contributed to the automatic index generation and the Durm lemmatizer. Qiangqiang Li contributed to the ontology population pipeline. Thomas Gitzinger contributed to the index generation.



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • René Witte (1, corresponding author)
  • Thomas Kappler (2)
  • Ralf Krestel (3)
  • Peter C. Lockemann (4)

  1. Concordia University, Montréal, Canada
  2. Swiss Institute of Bioinformatics, Geneva, Switzerland
  3. L3S Research Center, Hannover, Germany
  4. Karlsruhe Institute of Technology, Karlsruhe, Germany
