Creating a Handwriting Recognition Corpus for Bushman Languages

  • Kyle Williams
  • Hussein Suleman
Conference paper

DOI: 10.1007/978-3-642-24826-9_28

Part of the Lecture Notes in Computer Science book series (LNCS, volume 7008)
Cite this paper as:
Williams K., Suleman H. (2011) Creating a Handwriting Recognition Corpus for Bushman Languages. In: Xing C., Crestani F., Rauber A. (eds) Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. ICADL 2011. Lecture Notes in Computer Science, vol 7008. Springer, Berlin, Heidelberg

Abstract

Handwriting recognition systems rely on the existence of a corpus for training recognition models and evaluating accuracy. Creating a handwriting recognition corpus for the Bushman languages of southern Africa is difficult due to the complexities of the script used to represent them and the fact that this script cannot be represented using Unicode. To solve this problem, a semi-automatic Web-based tool was developed to segment, capture and encode the Bushman text. A case study demonstrated how the tool could be used to create a Bushman handwriting corpus with few errors.

Keywords

Corpus creation, transcription, digital libraries

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Kyle Williams (1)
  • Hussein Suleman (1)
  1. Department of Computer Science, University of Cape Town, Rondebosch, South Africa
