Text and Non-text Separation in Handwritten Document Images Using Local Binary Pattern Operator

  • Conference paper
Proceedings of the First International Conference on Intelligent Computing and Communication

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 458)

Abstract

The development of automated systems for handwritten document analysis has been an important research topic for the last few decades. Digitized documents, whether handwritten or printed, contain a mixture of text and non-text elements, which must be separated before a document layout analyzer or an Optical Character Recognizer can be designed. This paper describes a technique for separating text objects from non-text objects in a handwritten document image. A Rotation Invariant Local Binary Pattern (RILBP) based texture feature represents these components in the feature space. Classification is then carried out using an Artificial Neural Network based classifier, the Multi-layer Perceptron (MLP). The system yields impressive results on a database of 100 handwritten document images.
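To make the RILBP feature concrete, the following is a minimal sketch of a rotation-invariant LBP histogram over an 8-neighbour, radius-1 window. The abstract does not state the neighbourhood size, radius, or histogram layout, so those choices are assumptions here. The names `NEIGHBOUR_OFFSETS`, `rotate_min`, `RI_LABELS` and `rilbp_histogram` are hypothetical and are not taken from the paper.

```python
# Offsets (row, col) of the 8 neighbours at radius 1, in clockwise order.
# Assumption: an 8-neighbour, radius-1 window; the abstract does not specify one.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def rotate_min(code, bits=8):
    """Return the smallest value among all circular bit rotations of `code`."""
    mask = (1 << bits) - 1
    best = code
    for _ in range(bits - 1):
        # Circular right shift by one bit within `bits` bits.
        code = ((code >> 1) | (code << (bits - 1))) & mask
        best = min(best, code)
    return best


# Distinct rotation-invariant labels for 8 neighbours (there are 36 of them).
RI_LABELS = sorted({rotate_min(c) for c in range(256)})


def rilbp_histogram(gray):
    """Normalised histogram of rotation-invariant LBP codes.

    `gray` is a 2-D list of intensities; border pixels are skipped
    because they lack a full 3x3 neighbourhood.
    """
    rows, cols = len(gray), len(gray[0])
    counts = dict.fromkeys(RI_LABELS, 0)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
                # Threshold each neighbour against the centre pixel.
                if gray[r + dr][c + dc] >= gray[r][c]:
                    code |= 1 << bit
            counts[rotate_min(code)] += 1
    total = sum(counts.values()) or 1
    return [counts[label] / total for label in RI_LABELS]
```

Under these assumptions, each component cropped from the page would yield one 36-bin feature vector, which could then be passed to an MLP classifier. Rotating a component by 90 degrees leaves its histogram unchanged, which is the property the "rotation invariant" label refers to.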

Author information

Corresponding author

Correspondence to Showmik Bhowmik.

Copyright information

© 2017 Springer Science+Business Media Singapore

About this paper

Cite this paper

Showmik Bhowmik, Ram Sarkar, Mita Nasipuri (2017). Text and Non-text Separation in Handwritten Document Images Using Local Binary Pattern Operator. In: Mandal, J., Satapathy, S., Sanyal, M., Bhateja, V. (eds) Proceedings of the First International Conference on Intelligent Computing and Communication. Advances in Intelligent Systems and Computing, vol 458. Springer, Singapore. https://doi.org/10.1007/978-981-10-2035-3_52

  • DOI: https://doi.org/10.1007/978-981-10-2035-3_52

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-2034-6

  • Online ISBN: 978-981-10-2035-3

  • eBook Packages: Engineering (R0)
