Novel Features for Character Extraction of Historical Chinese Seal Images

  • Original Paper
  • Published:
Sensing and Imaging

Abstract

The characters in historical seals are valuable for the study of the corresponding documents. Character extraction from these seals is challenging because of their special characteristics, such as the different carving types (rilievi or diaglyph) and variant border forms. Thus, most existing character extraction methods do not work well on images of historical seals. In this paper, a new character extraction method is proposed based on novel features of the scanned seal images. First, the border width feature and the structure feature of each scanned seal image are extracted by analyzing the probability density distributions of the border width and of the brightness of each column. Meanwhile, the optimal border width for rilievi, or a starting position of the border for diaglyph, is generated for subsequent border removal and character extraction. The tri-border feature is used to differentiate between diaglyph and tri-border diaglyph. Then a decision tree is built to classify the seal images into rilievi and diaglyph. The classification result is used to separate background, border, and characters. After the border is removed with the optimal border width estimated from the border width feature, the characters are finally extracted from the seal image. Experimental results on a real data set show that the proposed method classifies the scanned seal images accurately and extracts the characters effectively.
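The abstract outlines a pipeline of per-column brightness analysis, border-width estimation from a run-length distribution, decision-tree classification into rilievi and diaglyph, and border removal before character extraction. The minimal Python sketch below illustrates that flow on a binarized seal image (0 = ink, 1 = paper); the function names, the run-length heuristic, and the decision-tree settings are assumptions made for illustration, not the authors' implementation, which additionally uses the tri-border feature and a starting border position for diaglyph seals.

```python
# Illustrative sketch only: assumes a binarized seal image with 0 = ink and 1 = paper.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def border_width_feature(binary):
    """Length of the dark run at the top edge of every column; the mode of
    this run-length distribution serves as the estimated border width."""
    runs = []
    for col in binary.T:                      # one column at a time
        bright = np.flatnonzero(col)          # indices of bright (paper) pixels
        runs.append(int(bright[0]) if bright.size else col.size)
    runs = np.asarray(runs, dtype=int)
    return int(np.bincount(runs).argmax()), runs


def structure_feature(binary):
    """Column-wise brightness statistics; relief (rilievi) and intaglio
    (diaglyph) seals yield visibly different brightness distributions."""
    profile = binary.mean(axis=0)
    return np.array([profile.mean(), profile.var()])


def fit_carving_type_classifier(images, labels):
    """Train a decision tree on (border width, brightness statistics) per image."""
    feats = [np.concatenate([[border_width_feature(img)[0]], structure_feature(img)])
             for img in images]
    return DecisionTreeClassifier(max_depth=3).fit(np.asarray(feats), labels)


def extract_characters(binary, carving_type, border_width):
    """Strip an estimated border band, then keep dark pixels (rilievi) or
    invert first (diaglyph) so that characters become the foreground."""
    w = max(border_width, 1)
    core = binary[w:-w, w:-w]
    return core if carving_type == "rilievi" else 1 - core
```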

Funding

Funding was provided by Natural Science Foundation of Hunan Province (No. 2018JJ3071).

Author information

Corresponding author

Correspondence to Bin Sun.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dai, T., Sun, B. Novel Features for Character Extraction of Historical Chinese Seal Images. Sens Imaging 20, 32 (2019). https://doi.org/10.1007/s11220-019-0253-z
