Issues in Ground-Truthing Graphic Documents

  • Daniel Lopresti
  • George Nagy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2390)

Abstract

We examine the nature of ground-truth: whether it is always well-defined for a given task, or only relative and approximate. In the conventional scenario, reference data is produced by recording the interpretation of each test document using a chosen data-entry platform. Looking more closely at this process, we study its constituents and their interrelations. We provide examples from the literature and from our own experiments where non-trivial problems with each of the components appear to preclude the possibility of real progress in evaluating automated graphics recognition systems, and we propose possible solutions. More specifically, for documents with complex structure, we recommend multi-valued, layered, weighted, functional ground-truth supported by model-guided reference data-entry systems and protocols. Mostly, however, we raise far more questions than we can currently answer.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Daniel Lopresti (1)
  • George Nagy (2)
  1. Bell Labs, Lucent Technologies Inc., Murray Hill, USA
  2. Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, USA
