The UvA color document dataset

  • Leon Todoran
  • Marcel Worring
  • Arnold W. M. Smeulders

Abstract.

Publications on color document image analysis present results on small, nonpublicly available datasets. In this paper we propose a well-defined and groundtruthed color dataset consisting of over 1000 pages, with associated tools for evaluation. As we focus on aspects specific to color documents, we leave out the document textual content in the ground truth. The color data groundtruthing and evaluation tools are based on a well-defined document model, complexity measures to assess the inherent difficulty of analyzing a page, and well-founded evaluation measures. Together they form a suitable basis for evaluating diverse applications in color document analysis. Both the dataset and the tools are available through our Web site at http://www.science.uva.nl/UvA-CDD.

Copyright information

© Springer-Verlag Berlin/Heidelberg 2005

Authors and Affiliations

  • Leon Todoran (1)
  • Marcel Worring (1)
  • Arnold W. M. Smeulders (1)

  1. Intelligent Sensory Information Systems, University of Amsterdam, Amsterdam, The Netherlands
