Soft Computing, Volume 23, Issue 24, pp 13603–13614

Devanagari ancient character recognition using DCT features with adaptive boosting and bootstrap aggregating

  • Sonika Rani Narang
  • M. K. Jindal
  • Munish Kumar
Methodologies and Application

Abstract

Recognition of Devanagari ancient manuscripts is attracting considerable attention from researchers. These manuscripts are rare and delicate documents, and they are being digitized so that the information they contain can be used. Optical character recognition (OCR) is used to recognize the digitized documents. This paper presents a system that improves recognition of Devanagari ancient manuscripts using the AdaBoost and Bagging methodologies. Discrete cosine transform (DCT) zigzag features are used for feature extraction. Decision tree, Naïve Bayes and support vector machine (SVM) classifiers are used to recognize basic characters segmented from the manuscripts. A dataset of 5484 pre-segmented characters from Devanagari ancient documents is used for the experiments. A maximum recognition accuracy of 90.70% is achieved using DCT zigzag features with an RBF-SVM classifier. The AdaBoost and Bagging ensemble methods are then applied to the base classifiers to improve accuracy, and a maximum accuracy of 91.70% is achieved for adaptive boosting (AdaBoost) with RBF-SVM. Precision, recall, F-measure, false acceptance rate, false rejection rate and RMSE are used to assess the quality of the ensemble methods.
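To make the feature-extraction step concrete, the following is a minimal sketch of DCT zigzag features in plain Python. It is an illustration, not the authors' implementation: the abstract does not state the image size, the normalization, or how many coefficients are kept, so the orthonormal DCT-II, the JPEG-style zigzag order, and the names `dct_2d`, `zigzag_indices` and `dct_zigzag_features` are all assumptions made here.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence (assumed normalization)."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(image):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in image]
    cols = [dct_1d(col) for col in zip(*rows)]
    # cols[j] holds the transformed column j, so transpose back to rows.
    return [list(r) for r in zip(*cols)]


def zigzag_indices(height, width):
    """(row, col) pairs in JPEG-style zigzag order, low frequencies first."""
    order = []
    for s in range(height + width - 1):
        diagonal = [(r, s - r) for r in range(height) if 0 <= s - r < width]
        # Even diagonals run bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def dct_zigzag_features(image, num_features):
    """Keep the first num_features DCT coefficients in zigzag order."""
    coeffs = dct_2d(image)
    order = zigzag_indices(len(coeffs), len(coeffs[0]))
    return [coeffs[r][c] for r, c in order[:num_features]]


# Hypothetical 4x4 character image; real inputs would be larger bitmaps.
toy_image = [[1.0] * 4 for _ in range(4)]
features = dct_zigzag_features(toy_image, 6)
```

The zigzag scan places low-frequency coefficients first, so truncating the vector keeps the coarse shape of the character and discards fine detail. The resulting vectors could then be passed to a classifier or an ensemble; the classifier settings used in the paper are not given in the abstract.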

Keywords

Ancient manuscripts; Devanagari historical documents; Off-line character recognition; Feature extraction; Classification

Notes

Compliance with ethical standards

During our research, we were hampered by the lack of a public dataset, so we have no benchmark against which to compare our algorithm with others. A public dataset may help other researchers working on similar projects, so we have decided to share the raw data used in our experimental work.

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Sonika Rani Narang (1)
  • M. K. Jindal (2)
  • Munish Kumar (3), corresponding author
  1. Department of Computer Science, DAV College, Abohar, India
  2. Department of Computer Science and Applications, Panjab University Regional Centre, Muktsar, India
  3. Department of Computational Sciences, Maharaja Ranjit Singh Punjab Technical University, Bathinda, India
