Abstract
This paper presents an architecture that enables a recognizer to learn incrementally and thereby adapt to document image collections, improving its performance. We argue that the recognition scheme for a book can differ considerably from one designed for isolated pages. We employ learning procedures to capture the relevant information available online and feed it back to update the knowledge of the system. Experimental results show the effectiveness of our design in improving performance on the fly.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meshesha, M., Jawahar, C.V. (2007). Self Adaptable Recognizer for Document Image Collections. In: Ghosh, A., De, R.K., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2007. Lecture Notes in Computer Science, vol 4815. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77046-6_69
DOI: https://doi.org/10.1007/978-3-540-77046-6_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77045-9
Online ISBN: 978-3-540-77046-6
eBook Packages: Computer Science, Computer Science (R0)