A Corpus of Word-Level Offline Handwritten Numeral Images from Official Indic Scripts

Obaidullah, Sk Md; Halder, Chayan; Das, Nibaran; Roy, Kaushik

doi:10.1007/978-81-322-2517-1_67

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 379)

Abstract

Dataset development is one of the most important tasks in document image processing research. The problem becomes more challenging when it comes to a Numeral Image Database (NIdb) for official Indic scripts. A few efforts have been made so far, but each was restricted to a single script, typically the local script of the researchers who prepared the database. In this paper, a technique for developing a handwritten NIdb of four popular Indic scripts, namely Bangla, Devanagari, Roman and Urdu, is proposed. Data were initially collected in an unconstrained manner at word level from different writers of varying age, sex and educational qualification. All images are stored as grey-level images in .jpg format so that the data can be used in various ways as needed. A benchmark result on the present dataset is reported using a novel hybrid approach to the Handwritten Numeral Script Identification (HNSI) problem.

Acknowledgement

The authors are very thankful to Mr. Tousif Jaman and Mr. Sahaniaj Dhukra, students of Aliah University, for their immense help during the data collection process.

Author information

Authors and Affiliations

Department of Computer Science & Engineering, Aliah University, Kolkata, W.B, India
Sk Md Obaidullah
Department of Computer Science, West Bengal State University, Kolkata, W.B, India
Chayan Halder & Kaushik Roy
Department of Computer Science & Engineering, Jadavpur University, Kolkata, W.B, India
Nibaran Das

Corresponding author

Correspondence to Sk Md Obaidullah.

Editor information

Editors and Affiliations

Dept. of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam, India
Suresh Chandra Satapathy
Department of CSE, CMR Technical Campus, Hyderabad, India
K. Srujan Raju
Computer Science & Engineering, Kalyani University, Nadia, West Bengal, India
Jyotsna Kumar Mandal
Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges, Lucknow, Uttar Pradesh, India
Vikrant Bhateja

About this paper

Cite this paper

Obaidullah, S.M., Halder, C., Das, N., Roy, K. (2016). A Corpus of Word-Level Offline Handwritten Numeral Images from Official Indic Scripts. In: Satapathy, S., Raju, K., Mandal, J., Bhateja, V. (eds) Proceedings of the Second International Conference on Computer and Communication Technologies. Advances in Intelligent Systems and Computing, vol 379. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2517-1_67

DOI: https://doi.org/10.1007/978-81-322-2517-1_67
Published: 05 September 2015
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2516-4
Online ISBN: 978-81-322-2517-1
eBook Packages: Engineering, Engineering (R0)
