Creating a Handwriting Recognition Corpus for Bushman Languages
Handwriting recognition systems rely on the existence of a corpus for training recognition models and evaluating accuracy. Creating a handwriting recognition corpus for the Bushman languages of southern Africa is difficult due to the complexities of the script used to represent them and the fact that this script cannot be represented using Unicode. To solve this problem, a semi-automatic Web-based tool was developed to segment, capture and encode the Bushman text. A case study demonstrated how the tool could be used to create a Bushman handwriting corpus with few errors.
Keywords: Corpus creation, transcription, digital libraries
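The encoding problem described in the abstract can be illustrated with a minimal sketch. When a symbol carries stacked diacritics that have no single Unicode representation, one option is to store it as a base character plus ordered lists of marks and to serialise it as nested, macro-like plain text. The `Glyph` class, the `\above` and `\below` macro names, and the mark names used here are hypothetical and chosen only for illustration; they are not taken from the tool described above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Glyph:
    """One written symbol: a base character plus stacked diacritics.

    Hypothetical structure for illustration only.
    """
    base: str
    above: List[str] = field(default_factory=list)  # marks above, innermost first
    below: List[str] = field(default_factory=list)  # marks below, innermost first

    def encode(self) -> str:
        """Serialise the glyph as nested macro-style plain text."""
        text = self.base
        # Wrap marks below the base first, then marks above it.
        for mark in self.below:
            text = f"\\below{{{mark}}}{{{text}}}"
        for mark in self.above:
            text = f"\\above{{{mark}}}{{{text}}}"
        return text


def encode_word(glyphs: List[Glyph]) -> str:
    """Concatenate the encodings of the glyphs that make up one word."""
    return "".join(g.encode() for g in glyphs)


if __name__ == "__main__":
    word = [Glyph("k"), Glyph("a", above=["tilde"], below=["dot"])]
    print(encode_word(word))
```

Because the output is plain ASCII, such transcriptions could be stored and compared like any other string while still preserving the order in which diacritics are stacked.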