
Treatment of Diagrams in Document Image Analysis

  • Dorothea Blostein
  • Edward Lank
  • Richard Zanibbi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1889)

Abstract

Document image analysis is the study of converting documents from paper form to an electronic form that captures the information content of the document. Necessary processing includes recognition of document layout (to determine reading order, and to distinguish text from diagrams), recognition of text (called Optical Character Recognition, OCR), and processing of diagrams and photographs. The processing of diagrams has been an active research area for several decades. A selection of existing diagram recognition techniques is presented in this paper. Challenging problems in diagram recognition include (1) the great diversity of diagram types, (2) the difficulty of adequately describing the syntax and semantics of diagram notations, and (3) the need to handle imaging noise. The recognition techniques discussed include blackboard systems, stochastic grammars, Hidden Markov Models, and graph grammars.



Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Dorothea Blostein (1)
  • Edward Lank (1)
  • Richard Zanibbi (1)
  1. Computing and Information Science, Queen’s University, Kingston, Ontario, Canada
