A CNN-LSTM based ensemble framework for in-air handwritten Assamese character recognition

  • 1166: Advances of machine learning in data analytics and visual information processing
Multimedia Tools and Applications

Abstract

In-air handwriting is a contemporary human-computer interaction (HCI) technique that lets users write and communicate in free space in a simple and intuitive manner. Air-written characters vary widely with users' writing styles and speed of articulation, which makes effective recognition of linguistic characters challenging. In this paper, we propose an ensemble model for in-air handwriting recognition based on a convolutional neural network (CNN) and a long short-term memory neural network (LSTM-NN). The method combines modeling of the overall character trajectory appearance with modeling of temporal trajectory features to recognize varied types of air-written characters efficiently. In contrast to two-dimensional handwriting, in-air handwriting generally involves characters interlinked by a continuous stroke, which makes separating intended writing activity from insignificant connecting motions an intricate task. A two-stage statistical framework is therefore incorporated in the system for automatic detection and extraction of relevant writing segments from air-written characters. Writing events are identified in a continuous stream of air-written data by formulating a Markov Random Field (MRF) model, and each writing event is then segmented into meaningful handwriting segments and redundant parts by a Mahalanobis distance (MD) classifier. The proposed approach is assessed on an air-written character dataset comprising Assamese vowels, consonants and numerals. The experimental results indicate that our hybrid network can assimilate more information from the air-writing patterns and hence offers better recognition performance than the state-of-the-art approaches.
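To make the second stage of the segmentation framework more concrete, the following is a minimal sketch of a generic Mahalanobis distance classifier that labels a trajectory segment as either a handwriting segment or a redundant part. It is not the authors' implementation: the two features (mean speed and curvature), the class labels, the synthetic training data, and the small regularization term added to each covariance matrix are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


class MahalanobisClassifier:
    """Assign a feature vector to the class whose mean is nearest
    in Mahalanobis distance (one mean and covariance per class)."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.means_ = {}
        self.inv_covs_ = {}
        for c in self.classes_:
            Xc = X[y == c]
            self.means_[c] = Xc.mean(axis=0)
            # Small ridge term (an assumption) keeps the covariance invertible.
            cov = np.atleast_2d(np.cov(Xc, rowvar=False))
            cov = cov + 1e-6 * np.eye(X.shape[1])
            self.inv_covs_[c] = np.linalg.inv(cov)
        return self

    def distances(self, x):
        """Mahalanobis distance from x to every class mean, in class order."""
        x = np.asarray(x, dtype=float)
        out = []
        for c in self.classes_:
            d = x - self.means_[c]
            out.append(float(np.sqrt(d @ self.inv_covs_[c] @ d)))
        return np.array(out)

    def predict(self, x):
        return self.classes_[np.argmin(self.distances(x))]


# Hypothetical training data: each row is [mean speed, mean curvature]
# of one trajectory segment. The values are synthetic, not from the paper.
rng = np.random.default_rng(0)
writing = rng.normal([1.0, 0.2], [0.2, 0.05], size=(50, 2))
redundant = rng.normal([3.0, 0.8], [0.5, 0.1], size=(50, 2))
X = np.vstack([writing, redundant])
y = np.array(["writing"] * 50 + ["redundant"] * 50)

clf = MahalanobisClassifier().fit(X, y)
label_slow = clf.predict([1.0, 0.2])   # segment resembling the writing class
label_fast = clf.predict([3.0, 0.8])   # segment resembling the redundant class
```

Unlike a plain Euclidean nearest-mean rule, the per-class inverse covariance rescales each feature by its spread and accounts for correlation between features, so a feature with a large natural range does not dominate the decision.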

Author information

Corresponding author

Correspondence to Ananya Choudhury.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Choudhury, A., Sarma, K.K. A CNN-LSTM based ensemble framework for in-air handwritten Assamese character recognition. Multimed Tools Appl 80, 35649–35684 (2021). https://doi.org/10.1007/s11042-020-10470-y

  • DOI: https://doi.org/10.1007/s11042-020-10470-y
