Online flowchart understanding by combining max-margin Markov random field with grammatical analysis

  • Chengcheng Wang
  • Harold Mouchère
  • Aurélie Lemaitre
  • Christian Viard-Gaudin
Original Paper

Abstract

Flowcharts are considered in this work as a specific 2D handwritten language in which the basic strokes are the terminal symbols of a graphical language governed by a 2D grammar. They can therefore be regarded as structured objects, and we propose to model them with a Markov random field (MRF) that assigns a label to each stroke. We use a structured SVM as the learning algorithm, maximizing the margin between true labels and incorrect labels. The model automatically learns the implicit grammatical information encoded among strokes, which greatly improves the stroke labeling accuracy compared with previous research that incorporated human prior knowledge of flowchart structure. We then complete the recognition with a grammatical analysis, which brings coherence to the whole flowchart recognition by labeling the relations between the detected objects.
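
The stroke-labeling idea in the abstract can be illustrated with a minimal sketch. It assumes a pairwise MRF whose score is a sum of per-stroke and per-edge terms, and it evaluates the margin-rescaled structured hinge loss that a structured SVM is commonly trained on. The label set, the scores, the stroke graph, and the function names are hypothetical and are not taken from the paper. Exhaustive search stands in for a real inference algorithm and is only practical for a handful of strokes.

```python
from itertools import product

# Hypothetical label set; the paper's actual stroke classes may differ.
LABELS = ("text", "arrow", "box")


def score(unary, pairwise, edges, labels):
    """Score of one labeling: unary terms plus pairwise terms (higher is better)."""
    total = sum(unary[i][y] for i, y in enumerate(labels))
    total += sum(pairwise.get((labels[i], labels[j]), 0.0) for i, j in edges)
    return total


def hamming(y_true, y_pred):
    """Number of strokes whose label differs from the ground truth."""
    return sum(a != b for a, b in zip(y_true, y_pred))


def map_labeling(unary, pairwise, edges, y_true=None):
    """Exhaustive MAP inference; loss-augmented when y_true is given."""
    best, best_val = None, float("-inf")
    for cand in product(LABELS, repeat=len(unary)):
        val = score(unary, pairwise, edges, cand)
        if y_true is not None:
            val += hamming(y_true, cand)  # margin rescaling
        if val > best_val:
            best, best_val = cand, val
    return best, best_val


def structured_hinge(unary, pairwise, edges, y_true):
    """Margin-rescaled structured hinge loss for one training example."""
    _, augmented = map_labeling(unary, pairwise, edges, y_true)
    return max(0.0, augmented - score(unary, pairwise, edges, y_true))


# Toy example (made-up scores): three strokes connected in a chain.
unary = [
    {"text": 0.5, "arrow": 0.25, "box": 3.0},
    {"text": 1.0, "arrow": 1.5, "box": 0.5},
    {"text": 3.0, "arrow": 0.0, "box": 0.25},
]
pairwise = {("box", "arrow"): 1.0, ("arrow", "text"): 1.0}
edges = [(0, 1), (1, 2)]

prediction, _ = map_labeling(unary, pairwise, edges)
print(prediction)
print(structured_hinge(unary, pairwise, edges, ("box", "arrow", "text")))
print(structured_hinge(unary, pairwise, edges, ("box", "text", "text")))
```

The loss is zero when the true labeling outscores every other labeling by at least its Hamming distance, and positive otherwise. A structured SVM trainer would minimize this loss, summed over the training flowcharts, together with a regularizer on the weights that produce the unary and pairwise scores.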

Keywords

Random Forest, Markov Random Field, Conditional Random Field, Handwriting Recognition, Markov Random Field Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. Microsoft (China) Co. Ltd., Suzhou, China
  2. UBL/University of Nantes/LS2N, Nantes, France
  3. IRISA - Université de Rennes 2, Rennes Cedex, France