Generation of synthetic documents for performance evaluation of symbol recognition & spotting systems

  • Mathieu Delalandre
  • Ernest Valveny
  • Tony Pridmore
  • Dimosthenis Karatzas
Original Paper

Abstract

This paper addresses the performance evaluation of symbol recognition and spotting systems. We propose a new approach to generating synthetic graphics documents that contain non-isolated symbols in a real context. The approach relies on a set of constraints that govern how symbols are placed on a pre-defined background according to the properties of a particular domain (architecture, electronics, engineering, etc.). In this way, a large number of images resembling real documents can be obtained simply by defining the set of constraints and providing a few pre-defined backgrounds. Because the documents are synthetically generated, the groundtruth (the location and label of every symbol) is available automatically. We have applied this approach to generate a large database of architectural drawings and electronic diagrams, which shows the flexibility of the system. Performance evaluation experiments with a symbol localization system show that the approach can generate documents with different features, and that these differences are reflected in variations in the localization results.
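As an illustration only, the idea of constrained placement with automatic groundtruth can be sketched in a few lines of Python. This is not the authors' system: the `Box` type, the `generate_layout` function, the zone and label names, and the single "no overlap, stay inside an allowed zone" constraint are all assumptions made for the example, and no image is actually rendered.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle: top-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap unless one lies entirely to the side of the other.
        return not (
            self.x + self.w <= other.x
            or other.x + other.w <= self.x
            or self.y + self.h <= other.y
            or other.y + other.h <= self.y
        )

    def contains(self, other: "Box") -> bool:
        return (
            self.x <= other.x
            and self.y <= other.y
            and other.x + other.w <= self.x + self.w
            and other.y + other.h <= self.y + self.h
        )


def generate_layout(zones, symbol_sizes, n_symbols, seed=0, max_tries=1000):
    """Place up to n_symbols symbols and return the groundtruth.

    zones        -- list of (Box, allowed_labels): hypothetical background
                    regions and the symbol labels permitted in each
    symbol_sizes -- dict mapping label -> (width, height)
    Returns a list of (label, Box) pairs: the label and location of
    every placed symbol, known by construction.
    """
    rng = random.Random(seed)
    groundtruth = []
    tries = 0
    while len(groundtruth) < n_symbols and tries < max_tries:
        tries += 1
        zone, allowed = rng.choice(zones)
        label = rng.choice(allowed)
        w, h = symbol_sizes[label]
        if w > zone.w or h > zone.h:
            continue  # symbol cannot fit in this zone
        x = rng.randint(zone.x, zone.x + zone.w - w)
        y = rng.randint(zone.y, zone.y + zone.h - h)
        candidate = Box(x, y, w, h)
        # Constraint (assumed for this sketch): symbols must not overlap.
        if any(candidate.overlaps(placed) for _, placed in groundtruth):
            continue
        groundtruth.append((label, candidate))
    return groundtruth


# Hypothetical "architectural" background with two zones.
zones = [
    (Box(0, 0, 200, 100), ["sofa", "table"]),
    (Box(200, 0, 100, 100), ["sink"]),
]
sizes = {"sofa": (40, 20), "table": (30, 30), "sink": (20, 20)}

groundtruth = generate_layout(zones, sizes, n_symbols=5, seed=42)
for label, box in groundtruth:
    print(label, box)
```

Changing the zones, the allowed labels, or the seed yields a different document from the same background, and the returned list serves directly as the groundtruth against which a localization system could be scored.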

Keywords

Document Image, Background Image, Symbol Model, Symbol Recognition, Layout Analysis

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • Mathieu Delalandre (1)
  • Ernest Valveny (1)
  • Tony Pridmore (2)
  • Dimosthenis Karatzas (1)
  1. CVC, Barcelona, Spain
  2. SCSIT, Nottingham, England
