Building Digital Ink Recognizers Using Data Mining: Distinguishing between Text and Shapes in Hand Drawn Diagrams

  • Rachel Blagojevic
  • Beryl Plimmer
  • John Grundy
  • Yong Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6096)


The low accuracy rates of text-shape dividers for digital ink diagrams are hindering their use in real-world applications. While handwriting recognition is well advanced and many recognition approaches have been proposed for hand-drawn sketches, less attention has been paid to the division of text and drawing. The choice of features and algorithms is critical to the success of the recognition, yet heuristics currently form the basis of selection. We propose the use of data mining techniques to automate the process of building text-shape recognizers. This systematic approach identifies the algorithms best suited to the specific problem and generates the trained recognizer. We have generated dividers using data mining and training with diagrams from three domains. An evaluation of our new recognizer against two other recognizers, on realistic diagrams from two different domains, shows it to be more successful at dividing shapes and text: 95.2% of strokes were correctly classified, compared with 86.9% and 83.3% for the two others.
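The selection idea described in the abstract can be illustrated with a minimal sketch: several candidate classifiers are scored by cross-validation on labelled stroke features, and the best-scoring one is kept. Everything in the sketch is an assumption made for illustration. The two features (stroke length and total curvature), the toy values, the three simple candidate classifiers and the function names are hypothetical and are not the features, data or algorithms used in the paper.

```python
from collections import Counter

# Toy, made-up stroke features: (stroke length, total curvature).
# Hypothetical values for illustration only, not data from the paper.
STROKES = [
    ((12.0, 9.1), "text"), ((15.0, 8.4), "text"), ((10.0, 7.9), "text"),
    ((14.0, 9.6), "text"), ((11.0, 8.8), "text"), ((13.0, 7.5), "text"),
    ((80.0, 2.1), "shape"), ((95.0, 1.4), "shape"), ((70.0, 3.0), "shape"),
    ((88.0, 2.6), "shape"), ((76.0, 1.9), "shape"), ((90.0, 3.3), "shape"),
]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_majority(train):
    """Baseline: always predict the most common training label."""
    label = Counter(lbl for _, lbl in train).most_common(1)[0][0]
    return lambda features: label


def train_nearest_centroid(train):
    """Predict the label whose mean feature vector is closest."""
    sums, counts = {}, Counter()
    for features, lbl in train:
        acc = sums.setdefault(lbl, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[lbl] += 1
    centroids = {lbl: [s / counts[lbl] for s in acc] for lbl, acc in sums.items()}
    return lambda features: min(
        centroids, key=lambda lbl: sq_dist(features, centroids[lbl])
    )


def train_one_nn(train):
    """Predict the label of the single closest training stroke."""
    return lambda features: min(train, key=lambda ex: sq_dist(features, ex[0]))[1]


# Hypothetical candidate set; each trainer returns a predict(features) function.
CANDIDATES = {
    "majority": train_majority,
    "nearest_centroid": train_nearest_centroid,
    "one_nn": train_one_nn,
}


def cross_val_accuracy(trainer, data, folds=4):
    """Fraction of strokes classified correctly over interleaved folds."""
    correct = 0
    for k in range(folds):
        test = data[k::folds]
        train = [ex for i, ex in enumerate(data) if i % folds != k]
        predict = trainer(train)
        correct += sum(predict(features) == lbl for features, lbl in test)
    return correct / len(data)


def select_best(data, folds=4):
    """Score every candidate and return (best name, all scores)."""
    scores = {
        name: cross_val_accuracy(trainer, data, folds)
        for name, trainer in CANDIDATES.items()
    }
    return max(scores, key=scores.get), scores


best, scores = select_best(STROKES)
print(best, scores)
```

Scoring each candidate on held-out folds rather than on its own training strokes is what makes the comparison meaningful; a real pipeline would use far more strokes, features and algorithms than this toy set.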


Keywords: sketch tools; recognition algorithms; sketch recognition; pen-based interfaces





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Rachel Blagojevic (1)
  • Beryl Plimmer (1)
  • John Grundy (2)
  • Yong Wang (1)
  1. University of Auckland, Auckland, New Zealand
  2. Swinburne University of Technology, Hawthorn, Australia
