Character-Independent Font Identification

  • Conference paper
Document Analysis Systems (DAS 2020)

Abstract

Countless fonts exist, spanning a wide range of shapes and styles, and many of them differ only in subtle features. This makes font identification a difficult task. In this paper, we propose a method for determining whether any two characters come from the same font. The task is hard because the difference between fonts is typically smaller than the difference between alphabet classes. The proposed method also applies to fonts regardless of whether they appear in the training set. To accomplish this, we use a Convolutional Neural Network (CNN) trained on image pairs drawn from various fonts. We then evaluate the model on a separate set of fonts unseen by the network during training, where it achieves an accuracy of 92.27%. Moreover, we analyze the relationship between character classes and font identification accuracy.
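The pair-based setup described in the abstract can be illustrated with a small sampling sketch. This is not the authors' code: the function names `split_fonts` and `make_pairs`, the 80/20 split, and the strict alternation between positive and negative pairs are all assumptions made for illustration. It only shows the two constraints stated in the paper: test fonts are disjoint from training fonts, and the two characters in a pair come from different character classes.

```python
import random


def split_fonts(fonts, test_fraction=0.2, seed=0):
    """Split font names into disjoint train and test lists (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(fonts)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def make_pairs(fonts, classes, n_pairs, seed=0):
    """Sample (glyph_a, glyph_b, label) triples; label 1 means same font.

    Each glyph is a (font, character) tuple. The two characters always
    differ, matching the paper's assumption that pairs come from
    different character classes. Requires at least two fonts and two classes.
    """
    rng = random.Random(seed)
    pairs = []
    for i in range(n_pairs):
        char_a, char_b = rng.sample(classes, 2)  # two distinct classes
        if i % 2 == 0:
            # Positive pair: same font, different characters.
            font = rng.choice(fonts)
            pairs.append(((font, char_a), (font, char_b), 1))
        else:
            # Negative pair: different fonts, different characters.
            font_a, font_b = rng.sample(fonts, 2)
            pairs.append(((font_a, char_a), (font_b, char_b), 0))
    return pairs
```

In a full pipeline, each `(font, character)` tuple would be rendered to a glyph image before being fed to the network; that step is omitted here.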

Notes

  1. Throughout this paper, we assume that the pairs come from different character classes. This is simply because font identification becomes a trivial task for pairs of the same character class (e.g., ‘A’): if the two images are exactly the same, they are the same font; otherwise, they are different.

  2. http://www.ultimatefontdownload.com/.

  3. https://www.adobe.com/jp/products/fontfolio.html.

References

  1. Type Identifier for Beginners. Seibundo Shinkosha Publishing (2013)

  2. Abe, K., Iwana, B.K., Holmér, V.G., Uchida, S.: Font creation using class discriminative deep convolutional generative adversarial networks. In: Asian Conference on Pattern Recognition, pp. 232–237 (2017)

  3. Avilés-Cruz, C., Villegas, J., Arechiga-Martínez, R., Escarela-Perez, R.: Unsupervised font clustering using stochastic version of the EM algorithm and global texture analysis. In: Sanfeliu, A., Martínez Trinidad, J.F., Carrasco Ochoa, J.A. (eds.) CIARP 2004. LNCS, vol. 3287, pp. 275–286. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30463-0_34

  4. Bagoriya, Y., Sharma, N.: Font type identification of Hindi printed document. Int. J. Res. Eng. Technol. 03(03), 513–516 (2014)

  5. Chaudhuri, B., Garain, U.: Automatic detection of italic, bold and all-capital words in document images. In: International Conference on Pattern Recognition (1998)

  6. Chen, G., et al.: Large-scale visual font recognition. In: Conference on Computer Vision and Pattern Recognition, pp. 3598–3605 (2014)

  7. Elgammal, A.M., Ismail, M.A.: Techniques for language identification for hybrid Arabic-English document images. In: International Conference on Document Analysis and Recognition (2001)

  8. Ghosh, D., Dube, T., Shivaprasad, A.: Script recognition-a review. IEEE Trans. Pattern Anal. Mach. Intell. 32(12), 2142–2161 (2010)

  9. Gupta, A., Gutierrez-Osuna, R., Christy, M., Furuta, R., Mandell, L.: Font identification in historical documents using active learning. arXiv preprint arXiv:1601.07252 (2016)

  10. Hafemann, L.G., Sabourin, R., Oliveira, L.S.: Learning features for offline handwritten signature verification using deep convolutional neural networks. Pattern Recogn. 70, 163–176 (2017)

  11. Hayashi, H., Abe, K., Uchida, S.: GlyphGAN: style-consistent font generation based on generative adversarial networks. Knowl.-Based Syst. 186, 104927 (2019)

  12. Jeong, C.B., Kwag, H.K., Kim, S., Kim, J.S., Park, S.C.: Identification of font styles and typefaces in printed Korean documents. In: International Conference on Asian Digital Libraries, pp. 666–669 (2003)

  13. Jung, M.C., Shin, Y.C., Srihari, S.: Multifont classification using typographical attributes. In: International Conference on Document Analysis and Recognition (1999)

  14. Khosravi, H., Kabir, E.: Farsi font recognition based on sobel-roberts features. Pattern Recogn. Lett. 31(1), 75–82 (2010)

  15. Koch, G., Zemel, R., Salakhutdinov, R.: Siamese neural networks for one-shot image recognition. In: ICML Deep Learning Workshop (2015)

  16. Li, Q., Li, J.P., Chen, L.: A bezier curve-based font generation algorithm for character fonts. In: International Conference on High Performance Computing and Communications, pp. 1156–1159 (2018)

  17. Liu, A.H., Liu, Y.C., Yeh, Y.Y., Wang, Y.C.F.: A unified feature disentangler for multi-domain image translation and manipulation. In: Advances in Neural Information Processing Systems, pp. 2590–2599 (2018)

  18. Liu, Y., Wei, F., Shao, J., Sheng, L., Yan, J., Wang, X.: Exploring disentangled feature representation beyond face identification. In: Conference on Computer Vision and Pattern Recognition, pp. 2080–2089 (2018)

  19. Amer, I.M., ElSayed, S., Mostafa, M.G.: Deep Arabic font family and font size recognition. Int. J. Comput. Appl. 176(4), 1–6 (2017)

  20. Ma, H., Doermann, D.: Font identification using the grating cell texture operator. In: Document Recognition and Retrieval XII, vol. 5676, pp. 148–156. International Society for Optics and Photonics (2005)

  21. Miyazaki, T., et al.: Automatic generation of typographic font from small font subset. IEEE Comput. Graph. Appl. 40, 99–111 (2019)

  22. Moussa, S.B., Zahour, A., Benabdelhafid, A., Alimi, A.M.: New features using fractal multi-dimensions for generalized arabic font recognition. Pattern Recogn. Lett. 31(5), 361–371 (2010)

  23. Nguyen, H.T., Nguyen, C.T., Ino, T., Indurkhya, B., Nakagawa, M.: Text-independent writer identification using convolutional neural network. Pattern Recogn. Lett. 121, 104–112 (2019)

  24. Öztürk, S.: Font clustering and cluster identification in document images. J. Electron. Imaging 10(2), 418 (2001)

  25. Pal, U., Chaudhuri, B.B.: Identification of different script lines from multi-script documents. Image Vis. Comput. 20(13–14), 945–954 (2002)

  26. Pan, W., Suen, C., Bui, T.D.: Script identification using steerable Gabor filters. In: International Conference on Document Analysis and Recognition (2005)

  27. Press, O., Galanti, T., Benaim, S., Wolf, L.: Emerging disentanglement in auto-encoder based unsupervised image content transfer (2018)

  28. Ruiz, V., Linares, I., Sanchez, A., Velez, J.F.: Off-line handwritten signature verification using compositional synthetic generation of signatures and siamese neural networks. Neurocomputing 374, 30–41 (2020)

  29. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: Visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. (2019)

  30. Shinahara, Y., Karamatsu, T., Harada, D., Yamaguchi, K., Uchida, S.: Serif or sans: visual font analytics on book covers and online advertisements. In: International Conference on Document Analysis and Recognition, pp. 1041–1046 (2019)

  31. Suveeranont, R., Igarashi, T.: Example-based automatic font generation. In: International Symposium on Smart Graphics, pp. 127–138 (2010)

  32. Tan, T.: Rotation invariant texture features and their use in automatic script identification. IEEE Trans. Pattern Anal. Mach. Intell. 20(7), 751–756 (1998)

  33. Uchida, S., Ide, S., Iwana, B.K., Zhu, A.: A further step to perfect accuracy by training CNN with larger data. In: International Conference on Frontiers in Handwriting Recognition, pp. 405–410 (2016)

  34. Wang, Z., et al.: DeepFont: identify your font from an image. In: ACM International Conference on Multimedia, pp. 451–459 (2015)

  35. Xing, Z.J., Wu, Y.C., Liu, C.L., Yin, F.: Offline signature verification using convolution siamese network. In: Yu, H., Dong, J. (eds.) International Conference on Graphic and Image Processing (2018)

  36. Yang, Z., Yang, L., Qi, D., Suen, C.Y.: An EMD-based recognition method for chinese fonts and styles. Pattern Recogn. Lett. 27(14), 1692–1701 (2006)

  37. Zheng, Y., Ohyama, W., Iwana, B.K., Uchida, S.: Capturing micro deformations from pooling layers for offline signature verification. In: International Conference on Document Analysis and Recognition, pp. 1111–1116 (2019)

Acknowledgment

This work was supported by JSPS KAKENHI Grant Number JP17H06100.

Author information

Corresponding author

Correspondence to Daichi Haraguchi.

Appendix A: Font Identification Using a Dataset with Less Fancy Fonts

The dataset used in the above experiment contains many fancy fonts, so our evaluation may overestimate font identification performance: fancy fonts are sometimes easy to identify by their distinctive appearance. We therefore use another font dataset, the Adobe Font Folio 11.1 (see Note 3). From this font set, we selected 1,132 fonts, comprising 511 Serif, 314 Sans Serif, 151 Serif-Sans Hybrid, 74 Script, 61 Historical Script, and (only) 21 Fancy fonts. Note that this font type classification for the 1,132 fonts is given by [1]. We used the same neural network trained on the dataset of Sect. 4.1; that is, the network was trained with the fancy font dataset and tested on the Adobe dataset. For the evaluation, 367,900 positive pairs and 367,900 negative pairs were prepared from the 1,132 fonts. With the Adobe fonts as the test set, the identification accuracy was 88.33 ± 0.89%, lower than the 92.27% obtained on the original dataset. However, considering that formal fonts are often very similar to each other, we can still say that character-independent font identification is possible even for formal fonts.
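As a rough illustration of how an accuracy figure such as 88.33 ± 0.89% could be computed on balanced positive and negative pairs, the sketch below thresholds a same-font score and then summarizes several runs. The appendix does not say how the ± value was obtained, so treating it as a sample standard deviation over repeated runs is an assumption, as are the 0.5 threshold and the function names `pair_accuracy` and `summarize`.

```python
from statistics import mean, stdev


def pair_accuracy(labels, scores, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the label.

    labels: 1 for a same-font (positive) pair, 0 for a different-font pair.
    scores: model outputs in [0, 1]; the 0.5 threshold is an assumption.
    """
    if not labels or len(labels) != len(scores):
        raise ValueError("labels and scores must be non-empty and equal in length")
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def summarize(accuracies):
    """Return (mean, sample standard deviation) in percent over several runs."""
    return 100 * mean(accuracies), 100 * stdev(accuracies)
```

Because the evaluation set holds equal numbers of positive and negative pairs, a classifier that always answers "same font" would score 50% under this measure, which gives a baseline for reading the reported accuracies.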

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Haraguchi, D., Harada, S., Iwana, B.K., Shinahara, Y., Uchida, S. (2020). Character-Independent Font Identification. In: Bai, X., Karatzas, D., Lopresti, D. (eds) Document Analysis Systems. DAS 2020. Lecture Notes in Computer Science, vol 12116. Springer, Cham. https://doi.org/10.1007/978-3-030-57058-3_35

  • DOI: https://doi.org/10.1007/978-3-030-57058-3_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-57057-6

  • Online ISBN: 978-3-030-57058-3

  • eBook Packages: Computer Science, Computer Science (R0)
