  • Book
  • Open Access
  • © 2013

Essential Speech and Language Technology for Dutch

Results by the STEVIN-programme

  • Bundles contributions of almost all the HLT research groups in Flanders and the Netherlands

  • An overview of more than a decade of joint R&D effort in the Low Countries on HLT for Dutch

  • An example of how to protect the interests of a medium-sized (or smaller) language in the modern information and communication society

Buying options

Softcover Book USD 59.95
Price excludes VAT (USA)
Hardcover Book USD 59.99
Price excludes VAT (USA)

Table of contents (22 chapters)

  1. Front Matter

    Pages i-xvii
  2. Introduction

    • Peter Spyns
    Pages 1-17, Open Access
  3. How It Started

    1. Front Matter

      Pages 19-19
    2. The STEVIN Programme: Result of 5 Years Cross-border HLT for Dutch Policy Preparation

      • Peter Spyns, Elisabeth D’Halleweyn
      Pages 21-39, Open Access
  4. HLT Resource-Project Related Papers

    1. Front Matter

      Pages 41-41
    2. The JASMIN Speech Corpus: Recordings of Children, Non-natives and Elderly People

      • Catia Cucchiarini, Hugo Van hamme
      Pages 43-59, Open Access
    3. Resources Developed in the Autonomata Projects

      • Henk van den Heuvel, Jean-Pierre Martens, Gerrit Bloothooft, Marijn Schraagen, Nanneke Konings, Kristof D’hanens et al.
      Pages 61-78, Open Access
    4. STEVIN Can Praat

      • David Weenink
      Pages 79-94, Open Access
    5. SPRAAK: Speech Processing, Recognition and Automatic Annotation Kit

      • Patrick Wambacq, Kris Demuynck, Dirk Van Compernolle
      Pages 95-113, Open Access
    6. COREA: Coreference Resolution for Extracting Answers for Dutch

      • Iris Hendrickx, Gosse Bouma, Walter Daelemans, Véronique Hoste
      Pages 115-128, Open Access
    7. Automatic Tree Matching for Analysing Semantic Similarity in Comparable Text

      • Erwin Marsi, Emiel Krahmer
      Pages 129-145, Open Access
    8. Large Scale Syntactic Annotation of Written Dutch: Lassy

      • Gertjan van Noord, Gosse Bouma, Frank Van Eynde, Daniël de Kok, Jelmer van der Linde, Ineke Schuurman et al.
      Pages 147-164, Open Access
    9. Cornetto: A Combinatorial Lexical Semantic Database for Dutch

      • Piek Vossen, Isa Maks, Roxane Segers, Hennie van der Vliet, Marie-Francine Moens, Katja Hofmann et al.
      Pages 165-184, Open Access
    10. Dutch Parallel Corpus: A Balanced Parallel Corpus for Dutch-English and Dutch-French

      • Hans Paulussen, Lieve Macken, Willy Vandeweghe, Piet Desmet
      Pages 185-199, Open Access
    11. The Construction of a 500-Million-Word Reference Corpus of Contemporary Written Dutch

      • Nelleke Oostdijk, Martin Reynaert, Véronique Hoste, Ineke Schuurman
      Pages 219-247, Open Access
  5. HLT-Technology Related Papers

    1. Front Matter

      Pages 249-249
    2. Lexical Modeling for Proper name Recognition in Autonomata Too

      • Bert Réveil, Jean-Pierre Martens, Henk van den Heuvel, Gerrit Bloothooft, Marijn Schraagen
      Pages 251-270, Open Access
    3. Missing Data Solutions for Robust Speech Recognition

      • Yujun Wang, Jort F. Gemmeke, Kris Demuynck, Hugo Van hamme
      Pages 289-304, Open Access

About this book

The book provides an overview of more than a decade of joint R&D efforts in the Low Countries on HLT for Dutch. It not only presents the state of the art of HLT for Dutch in the areas covered but, even more importantly, also describes the resources (data and tools) for Dutch that have been created and are now available to both academia and industry worldwide.

The contributions cover many areas of human language technology (for Dutch): corpus collection (including IPR issues) and corpus building (in particular one corpus aiming at a collection of 500M word tokens), lexicology, anaphora resolution, a semantic network, parsing technology, speech recognition, machine translation, text (summary) generation, web mining, information extraction, and text-to-speech, to name the most important ones.

The book also shows how a medium-sized language community (spanning two territories) can create a digital language infrastructure (resources, tools, etc.) as a basis for subsequent R&D. At the same time, it bundles contributions from almost all the HLT research groups in Flanders and the Netherlands, and hence offers a view of their recent research activities.

The targeted readers are mainly researchers in human language technology, in particular those focusing on Dutch. These include researchers active in larger networks such as CLARIN, META-NET and FLaReNet, and those participating in conferences such as ACL, EACL, NAACL, COLING, RANLP, CICLing, LREC, CLIN and DIR (both in the Low Countries), InterSpeech, ASRU, ICASSP, ISCA, EUSIPCO, CLEF, TREC, etc. In addition, some chapters are of interest to human language technology policy makers and even to science policy makers in general.


  • 68T50, 68T10, 91F20, 68T30, 03B65
  • R&D programme for Dutch
  • digital linguistic infrastructure
  • human language technology
  • language policy for a medium sized language
  • machine learning


From the book reviews:

“The book offers an impressive documentation of transnational joint research and development in language and speech technology from the past decade, that has been characterized by international reviewers as a front-running, exemplary effort. … it is a must-read for prospective BLARK developers, HLT resource centers, and language unions everywhere … .” (Antal van den Bosch, Machine Translation, Vol. 28, 2014)

Editors and Affiliations

  • Nederlandse Taalunie, The Hague, The Netherlands

    Peter Spyns

  • UiL-OTS University of Utrecht, Utrecht, The Netherlands

    Jan Odijk

About the editors

Dr. Peter Spyns works for the Flemish Department of Economy, Science and Innovation on innovation policy preparation, including human language technology and digital heritage. He is seconded part-time to the Dutch Language Union, where he coordinated the STEVIN programme, supervises the HLT Agency, and participates in CLARIN-ERIC committees. Previously, he worked in academia and industry on medical language processing, ontology modelling and dialogue systems.

Prof. Dr. Jan Odijk is professor of Language and Speech Technology at Utrecht University. Previously, he worked for more than 23 years in the language and speech technology industry, both in the Netherlands and in Flanders. His current research focuses on infrastructures for humanities researchers that include language and speech technology services.

Bibliographic Information

  • Book Title: Essential Speech and Language Technology for Dutch

  • Book Subtitle: Results by the STEVIN-programme

  • Editors: Peter Spyns, Jan Odijk

  • Series Title: Theory and Applications of Natural Language Processing

  • DOI:

  • Publisher: Springer Berlin, Heidelberg

  • eBook Packages: Computer Science, Computer Science (R0)

  • Copyright Information: The Editor(s) (if applicable) and the Author(s) 2013

  • License: CC BY-NC

  • Hardcover ISBN: 978-3-642-30909-0, Published: 05 March 2013

  • Softcover ISBN: 978-3-642-42992-7, Published: 07 March 2015

  • eBook ISBN: 978-3-642-30910-6, Published: 26 February 2013

  • Series ISSN: 2192-032X

  • Series E-ISSN: 2192-0338

  • Edition Number: 1

  • Number of Pages: XVII, 413

  • Number of Illustrations: 50 b/w illustrations, 29 illustrations in colour

  • Topics: Computational Linguistics, Germanic Languages, Artificial Intelligence
