Abstract
Dataset development is one of the most important tasks in document image processing research. The problem is more challenging for a Numeral Image Database (NIdb) of official Indic scripts. The few efforts made so far were each restricted to a single script, typically the local script of the researchers who prepared the database. This paper proposes a technique for developing a handwritten NIdb covering four popular Indic scripts, namely Bangla, Devanagari, Roman and Urdu. The data were initially collected in an unconstrained manner at word level from writers of varying age, sex and educational qualification. All images are stored as grey-level .jpg files so that the data can be used in various ways as needed. A benchmark result on the present dataset is reported for the Handwritten Numeral Script Identification (HNSI) problem using a novel hybrid approach.
Acknowledgement
The authors are grateful to Mr. Tousif Jaman and Mr. Sahaniaj Dhukra, students of Aliah University, for their immense help during the data collection process.
Copyright information
© 2016 Springer India
About this paper
Cite this paper
Obaidullah, S.M., Halder, C., Das, N., Roy, K. (2016). A Corpus of Word-Level Offline Handwritten Numeral Images from Official Indic Scripts. In: Satapathy, S., Raju, K., Mandal, J., Bhateja, V. (eds) Proceedings of the Second International Conference on Computer and Communication Technologies. Advances in Intelligent Systems and Computing, vol 379. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2517-1_67
DOI: https://doi.org/10.1007/978-81-322-2517-1_67
Published:
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2516-4
Online ISBN: 978-81-322-2517-1
eBook Packages: Engineering, Engineering (R0)