Handwritten Indic Script Identification from Document Images—A Statistical Comparison of Different Attribute Selection Techniques in Multi-classifier Environment

  • Sk Md Obaidullah
  • Chayan Halder
  • Nibaran Das
  • Kaushik Roy
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 381)

Abstract

Script identification from document images is an essential step before selecting a script-specific OCR engine in a multilingual, multi-script country such as India. The problem becomes more complex when the document images are handwritten. Several techniques have been developed for the HSI (Handwritten Script Identification) problem, and work is still in progress. However, dimensionality reduction of the feature set for script identification has not yet been addressed in the literature. This paper presents a statistical performance analysis of different attribute selection techniques in a multi-classifier environment for the HSI problem on Indic scripts. A GAS (Greedy Attribute Selection) technique for the HSI problem is also proposed. The outcomes are encouraging given the complexities of handwritten Indic scripts.
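The abstract does not describe how the proposed GAS technique works, so the following is only a generic illustration of greedy (forward) attribute selection, not the authors' method. It starts from an empty attribute set and repeatedly adds the single attribute that most improves a scoring function, stopping when no attribute helps. The toy data, the nearest-centroid scorer, and all function names are assumptions made for this sketch.

```python
from typing import Callable, List, Tuple

# Hypothetical toy data: 3 attributes per sample, two script classes "A" and "B".
# Attribute 0 separates the classes; attributes 1 and 2 are mostly noise.
SAMPLES = [
    ([0.1, 5.0, 0.2], "A"),
    ([0.2, 1.0, 0.9], "A"),
    ([0.0, 3.0, 0.1], "A"),
    ([0.9, 4.0, 0.8], "B"),
    ([1.0, 2.0, 0.3], "B"),
    ([0.8, 0.5, 0.9], "B"),
]


def centroid_accuracy(subset: List[int]) -> float:
    """Resubstitution accuracy of a nearest-centroid classifier (illustrative scorer)."""
    if not subset:
        return 0.0
    sums, counts = {}, {}
    for x, y in SAMPLES:
        vec = [x[a] for a in subset]
        if y not in sums:
            sums[y] = [0.0] * len(subset)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}
    correct = 0
    for x, y in SAMPLES:
        vec = [x[a] for a in subset]
        # Predict the class whose centroid is closest in squared Euclidean distance.
        pred = min(
            centroids,
            key=lambda c: sum((v - m) ** 2 for v, m in zip(vec, centroids[c])),
        )
        correct += pred == y
    return correct / len(SAMPLES)


def greedy_forward_selection(
    n_attributes: int,
    score: Callable[[List[int]], float],
    min_gain: float = 1e-9,
) -> Tuple[List[int], float]:
    """Add one attribute at a time, keeping the one with the largest score gain."""
    selected: List[int] = []
    best = score(selected)
    remaining = set(range(n_attributes))
    while remaining:
        # Score every candidate extension of the current subset.
        trials = [(score(selected + [a]), a) for a in sorted(remaining)]
        # Highest score wins; ties go to the lowest attribute index.
        cand_score, cand = max(trials, key=lambda t: (t[0], -t[1]))
        if cand_score - best <= min_gain:
            break  # no candidate improves the score, so stop
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected, best


selected, best = greedy_forward_selection(3, centroid_accuracy)
print(selected, best)
```

In a multi-classifier comparison, the scorer could be swapped for the cross-validated accuracy of each classifier under study; the greedy loop itself would not change.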

Keywords

Handwritten script identification; Greedy attribute selection; Performance analysis; Average accuracy rate; Model building time

Copyright information

© Springer India 2016

Authors and Affiliations

  • Sk Md Obaidullah (1)
  • Chayan Halder (2)
  • Nibaran Das (3)
  • Kaushik Roy (2)

  1. Department of Computer Science & Engineering, Aliah University, Kolkata, India
  2. Department of Computer Science, West Bengal State University, Barasat, India
  3. Department of Computer Science and Engineering, Jadavpur University, Kolkata, India