Techniques for the automated testing of document analysis algorithms

  • J. Sauvola
  • D. Doermann
  • H. Kauniskangas
  • M. Pietikäinen
Oral Presentations B. Document Processing and Retrieval
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1339)


This paper proposes a new approach to automating and managing the testing process in the development of document analysis and understanding algorithms. A distributed test environment is proposed to ensure visibility, repeatability, scalability and consistency during and between testing sessions. A variety of views are used to deal with multi-level operations in algorithm development. Tests are realized with dedicated test scenarios and events at different stages of the development cycle. Our main objective is to provide collaborating researchers with a flexible means of consistently validating algorithm behaviour in a target environment. This is accomplished with a simulated environment and underlying resources for each test scenario. A set of techniques for designing test events for test scenarios (e.g. module, integration, regression) is proposed, aimed at promoting a blackboard-style research approach. To demonstrate the functionality of this approach, we have implemented a prototype and traced an example algorithm through the development process.
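
To make the idea of test scenarios built from test events more concrete, the following is a minimal sketch and not the authors' system. The names `Event`, `Scenario` and `threshold` are hypothetical, and a toy global threshold stands in for the algorithm under test. It shows one way a regression-style scenario could check that an algorithm gives the same verdicts in two separate sessions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Assumption: a grey-level image is represented as nested lists of ints.
Image = List[List[int]]


@dataclass
class Event:
    """One test event: an input image and the output expected for it (hypothetical)."""
    name: str
    image: Image
    expected: Image


@dataclass
class Scenario:
    """A named group of events, e.g. 'module', 'integration' or 'regression' (hypothetical)."""
    kind: str
    events: Sequence[Event]

    def run(self, algorithm: Callable[[Image], Image]) -> Dict[str, bool]:
        # Apply the algorithm to every event and record pass/fail per event.
        return {e.name: algorithm(e.image) == e.expected for e in self.events}


def threshold(image: Image, level: int = 128) -> Image:
    """Toy global binarization standing in for the algorithm under test."""
    return [[1 if px >= level else 0 for px in row] for row in image]


scenario = Scenario(
    kind="regression",
    events=[
        Event("dark_left", [[10, 200], [20, 220]], [[0, 1], [0, 1]]),
        Event("uniform_bright", [[250, 250]], [[1, 1]]),
    ],
)

# Two sessions over the same scenario; identical results indicate repeatability.
first_session = scenario.run(threshold)
second_session = scenario.run(threshold)
print(first_session == second_session)
```

Keeping the events as plain data, separate from the algorithm being tested, is one simple way to let the same scenario be replayed against successive versions of an algorithm.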


Keywords: Document analysis, algorithm testing, distributed test management





Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • J. Sauvola
    • 1
  • D. Doermann
    • 2
  • H. Kauniskangas
    • 1
  • M. Pietikäinen
    • 1
  1. Media Processing Team, Machine Vision and Media Processing Group, Infotech Oulu, University of Oulu, Oulu, Finland
  2. Language and Media Processing Lab, Center for Automation Research, University of Maryland, College Park
