HCRNN: A Novel Architecture for Fast Online Handwritten Stroke Classification

  • Conference paper
Document Analysis and Recognition – ICDAR 2021 (ICDAR 2021)

Abstract

Stroke classification is an essential task for applications with free-form handwriting input. Implementing such applications on mobile devices places stringent requirements on embedded machine learning models, which forces a trade-off between model performance and model complexity. In this work, a novel hierarchical deep neural network (HDNN) architecture with high computational efficiency is proposed. It is applied to handwritten document processing, and in particular to multi-class stroke classification. The architecture uses a stack of 1D convolutional neural networks (CNNs) on the lower (point) hierarchical level and a stack of recurrent neural networks (RNNs) on the upper (stroke) level. Novel fragment pooling techniques for feature transition between the hierarchical levels are presented. An on-device implementation of the proposed architecture establishes new state-of-the-art results in multi-class handwritten document processing, with a classification accuracy of 97.58% on the IAMonDo dataset. The method is also more efficient in both processing time and memory consumption than the previous state-of-the-art RNN-based stroke classifier.
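
The two-level idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single convolution layer in place of the CNN stack, a plain tanh RNN in place of the RNN stack, and per-stroke max pooling as a stand-in for the paper's fragment pooling techniques, whose exact form is not given here. All function names, layer sizes, the two-feature point encoding, and the five output classes are hypothetical, and the weights are random and untrained.

```python
import math
import random

def conv1d_relu(points, kernel, bias):
    """Point level: 1D convolution with zero padding, then ReLU.
    points: list of T feature vectors; kernel[k][i][o]; bias[o]."""
    k, pad = len(kernel), len(kernel) // 2
    c_in, c_out = len(kernel[0]), len(bias)
    zero = [0.0] * c_in
    padded = [zero] * pad + points + [zero] * pad
    out = []
    for t in range(len(points)):
        row = []
        for o in range(c_out):
            acc = bias[o]
            for j in range(k):
                for i in range(c_in):
                    acc += padded[t + j][i] * kernel[j][i][o]
            row.append(max(acc, 0.0))
        out.append(row)
    return out

def fragment_max_pool(features, stroke_ids):
    """Level transition (assumed variant): max-pool point features per stroke."""
    pooled = {}
    for feat, sid in zip(features, stroke_ids):
        prev = pooled.get(sid)
        pooled[sid] = feat if prev is None else [max(a, b) for a, b in zip(prev, feat)]
    return [pooled[sid] for sid in sorted(pooled)]

def matvec(v, m):
    """Row vector v (length n) times matrix m (n x p)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def tanh_rnn(seq, w_in, w_rec, bias):
    """Stroke level: plain tanh RNN over the sequence of stroke vectors."""
    h = [0.0] * len(bias)
    out = []
    for x in seq:
        a, r = matvec(x, w_in), matvec(h, w_rec)
        h = [math.tanh(a[j] + r[j] + bias[j]) for j in range(len(bias))]
        out.append(h)
    return out

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def classify_strokes(points, stroke_ids, p):
    """Return one class-probability vector per stroke."""
    feats = conv1d_relu(points, p["conv_w"], p["conv_b"])
    strokes = fragment_max_pool(feats, stroke_ids)
    hidden = tanh_rnn(strokes, p["rnn_in"], p["rnn_rec"], p["rnn_b"])
    return [softmax([a + b for a, b in zip(matvec(h, p["out_w"]), p["out_b"])])
            for h in hidden]

# Demo with random, untrained weights; every size below is arbitrary.
rng = random.Random(0)

def rand(*shape):
    if len(shape) == 1:
        return [rng.uniform(-0.5, 0.5) for _ in range(shape[0])]
    return [rand(*shape[1:]) for _ in range(shape[0])]

params = {
    "conv_w": rand(3, 2, 8), "conv_b": rand(8),
    "rnn_in": rand(8, 6), "rnn_rec": rand(6, 6), "rnn_b": rand(6),
    "out_w": rand(6, 5), "out_b": rand(5),
}
points = rand(10, 2)                          # 10 points, 2 features each
stroke_ids = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]   # points grouped into 3 strokes
probs = classify_strokes(points, stroke_ids, params)
```

The point of the hierarchy is visible in the shapes: the convolution runs once per point, but the recurrent part runs only once per stroke, so the sequential portion of the computation is much shorter than in a purely point-level RNN.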

Corresponding author

Correspondence to Andrii Grygoriev.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Grygoriev, A. et al. (2021). HCRNN: A Novel Architecture for Fast Online Handwritten Stroke Classification. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12822. Springer, Cham. https://doi.org/10.1007/978-3-030-86331-9_13

  • DOI: https://doi.org/10.1007/978-3-030-86331-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86330-2

  • Online ISBN: 978-3-030-86331-9

  • eBook Packages: Computer Science (R0)
