A CNN-LSTM based ensemble framework for in-air handwritten Assamese character recognition

Choudhury, Ananya; Sarma, Kandarpa Kumar

doi:10.1007/s11042-020-10470-y

1166: Advances of machine learning in data analytics and visual information processing
Published: 31 March 2021

Volume 80, pages 35649–35684, (2021)
Multimedia Tools and Applications

619 Accesses
10 Citations

Abstract

In-air handwriting is a contemporary human-computer interaction (HCI) technique that enables users to write and communicate in free space in a simple and intuitive manner. Air-written characters vary widely with the writing style of the user and the speed of articulation, which makes effective recognition of linguistic characters challenging. In this paper, we propose an ensemble model for in-air handwriting recognition based on a convolutional neural network (CNN) and a long short-term memory neural network (LSTM-NN). The method combines modeling of the overall appearance of the character trajectory with modeling of temporal trajectory features for efficient recognition of varied types of air-written characters. In contrast to two-dimensional handwriting, in-air handwriting generally involves writing characters interlinked by a continuous stroke, which makes separating intended writing activity from insignificant connecting motions an intricate task. A two-stage statistical framework is therefore incorporated in the system for automatic detection and extraction of relevant writing segments from air-written characters. Writing events are identified in a continuous stream of air-written data by formulating a Markov Random Field (MRF) model, while the segmentation of writing events into meaningful handwriting segments and redundant parts is performed by a Mahalanobis distance (MD) classifier. The proposed approach is assessed on an air-written character dataset comprising Assamese vowels, consonants and numerals. The experimental results indicate that our hybrid network can assimilate more information from the air-writing patterns and hence offers better recognition performance than the state-of-the-art approaches.
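
To make the segmentation stage more concrete, the following is a minimal sketch of a Mahalanobis distance classifier that labels a trajectory segment as either a handwriting segment or a redundant part. It is not the authors' implementation: the two features (here called speed and curvature), the toy training values, and the function names `fit_class`, `mahalanobis_sq` and `classify` are all assumptions made for illustration, since the abstract does not specify the feature set. The sketch is restricted to two-dimensional feature vectors so that the covariance matrix can be inverted in closed form with the standard library alone.

```python
from statistics import mean


def fit_class(samples):
    """Estimate the mean and inverse covariance of 2-D feature vectors."""
    n = len(samples)
    mx = mean(s[0] for s in samples)
    my = mean(s[1] for s in samples)
    # Unbiased sample covariance entries.
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a symmetric 2x2 matrix.
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv


def mahalanobis_sq(x, model):
    """Squared Mahalanobis distance from x to the class described by model."""
    (mx, my), inv = model
    dx, dy = x[0] - mx, x[1] - my
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def classify(x, models):
    """Return the label of the class with the smallest Mahalanobis distance."""
    return min(models, key=lambda label: mahalanobis_sq(x, models[label]))


# Hypothetical (speed, curvature) features for labeled training segments.
training = {
    "writing": [(1.0, 0.3), (1.2, 0.2), (0.9, 0.2), (1.1, 0.3)],
    "redundant": [(3.0, 1.0), (3.4, 1.2), (2.8, 1.2), (3.2, 1.0)],
}
models = {label: fit_class(samples) for label, samples in training.items()}

for segment in [(1.0, 0.25), (3.1, 1.1)]:
    print(segment, "->", classify(segment, models))
```

Unlike a plain Euclidean nearest-mean rule, the distance here is scaled by each class's covariance, so a feature with large natural spread within a class contributes less to the decision than a tightly clustered one.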

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, Gauhati University, Guwahati, Assam, India
Ananya Choudhury & Kandarpa Kumar Sarma

Corresponding author

Correspondence to Ananya Choudhury.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Choudhury, A., Sarma, K.K. A CNN-LSTM based ensemble framework for in-air handwritten Assamese character recognition. Multimed Tools Appl 80, 35649–35684 (2021). https://doi.org/10.1007/s11042-020-10470-y

Received: 30 April 2020
Revised: 13 August 2020
Accepted: 29 December 2020
Published: 31 March 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s11042-020-10470-y
