Character-Independent Font Identification

Haraguchi, Daichi; Harada, Shota; Iwana, Brian Kenji; Shinahara, Yuto; Uchida, Seiichi

doi:10.1007/978-3-030-57058-3_35

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12116)

Included in the following conference series:

International Workshop on Document Analysis Systems

Abstract

Countless fonts exist, spanning a wide variety of shapes and styles, and many of them differ only subtly in their features. This makes font identification a difficult task. In this paper, we propose a method for determining whether any two characters are from the same font. The task is difficult because the difference between fonts is typically smaller than the difference between alphabet classes. Additionally, the proposed method can be applied to fonts regardless of whether they appear in the training data. To accomplish this, we use a Convolutional Neural Network (CNN) trained on image pairs drawn from various fonts. In the experiment, we evaluate the trained model on a different set of fonts that are unseen by the network, where it achieves an accuracy of 92.27%. Moreover, we analyze the relationship between character classes and font identification accuracy.
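The pair-based setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' data pipeline: the function names `split_fonts` and `sample_pairs`, the placeholder font names, the 20% test fraction, and the use of uppercase Latin letters are all assumptions made for illustration. The sketch only shows two ideas stated in the paper: test fonts are disjoint from training fonts, and each pair is built from two different character classes (see Note 1).

```python
import random
import string


def split_fonts(fonts, test_fraction=0.2, seed=0):
    """Split font names into disjoint train and test lists (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(fonts)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def sample_pairs(fonts, chars, n_pairs, seed=0):
    """Sample n_pairs positive and n_pairs negative (font, char) pairs."""
    rng = random.Random(seed)
    positives, negatives = [], []
    for _ in range(n_pairs):
        # Positive pair: one font, two different character classes.
        font = rng.choice(fonts)
        c1, c2 = rng.sample(chars, 2)
        positives.append(((font, c1), (font, c2)))
        # Negative pair: two different fonts, two different character classes.
        f1, f2 = rng.sample(fonts, 2)
        c3, c4 = rng.sample(chars, 2)
        negatives.append(((f1, c3), (f2, c4)))
    return positives, negatives


fonts = [f"font_{i:03d}" for i in range(10)]  # placeholder font names (assumption)
chars = list(string.ascii_uppercase)          # assumed character set
train_fonts, test_fonts = split_fonts(fonts)
test_pos, test_neg = sample_pairs(test_fonts, chars, n_pairs=5)
```

In a real experiment each `(font, char)` key would be replaced by the rendered glyph image, and the pairs would be fed to the network as same-font or different-font examples.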

Notes

1. Throughout this paper, we assume that the pairs come from different character classes. This is simply because our font identification becomes a trivial task for the pairs of the same character class (e.g., ‘A’); if the two images are exactly the same, they are the same font; otherwise, they are different.
2. http://www.ultimatefontdownload.com/
3. https://www.adobe.com/jp/products/fontfolio.html

References

Type Identifier for Beginners. Seibundo Shinkosha Publishing (2013)
Abe, K., Iwana, B.K., Holmér, V.G., Uchida, S.: Font creation using class discriminative deep convolutional generative adversarial networks. In: Asian Conference on Pattern Recognition, pp. 232–237 (2017)
Avilés-Cruz, C., Villegas, J., Arechiga-Martínez, R., Escarela-Perez, R.: Unsupervised font clustering using stochastic versio of the EM algorithm and global texture analysis. In: Sanfeliu, A., Martínez Trinidad, J.F., Carrasco Ochoa, J.A. (eds.) CIARP 2004. LNCS, vol. 3287, pp. 275–286. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30463-0_34
Bagoriya, Y., Sharma, N.: Font type identification of Hindi printed document. Int. J. Res. Eng. Technol. 03(03), 513–516 (2014)
Chaudhuri, B., Garain, U.: Automatic detection of italic, bold and all-capital words in document images. In: International Conference on Pattern Recognition (1998)
Chen, G., et al.: Large-scale visual font recognition. In: Conference on Computer Vision and Pattern Recognition, pp. 3598–3605 (2014)
Elgammal, A.M., Ismail, M.A.: Techniques for language identification for hybrid Arabic-English document images. In: International Conference on Document Analysis and Recognition (2001)
Ghosh, D., Dube, T., Shivaprasad, A.: Script recognition-a review. IEEE Trans. Pattern Anal. Mach. Intell. 32(12), 2142–2161 (2010)
Gupta, A., Gutierrez-Osuna, R., Christy, M., Furuta, R., Mandell, L.: Font identification in historical documents using active learning. arXiv preprint arXiv:1601.07252 (2016)
Hafemann, L.G., Sabourin, R., Oliveira, L.S.: Learning features for offline handwritten signature verification using deep convolutional neural networks. Pattern Recogn. 70, 163–176 (2017)
Hayashi, H., Abe, K., Uchida, S.: GlyphGAN: style-consistent font generation based on generative adversarial networks. Knowl.-Based Syst. 186, 104927 (2019)
Jeong, C.B., Kwag, H.K., Kim, S., Kim, J.S., Park, S.C.: Identification of font styles and typefaces in printed Korean documents. In: International Conference on Asian Digital Libraries, pp. 666–669 (2003)
Jung, M.C., Shin, Y.C., Srihari, S.: Multifont classification using typographical attributes. In: International Conference on Document Analysis and Recognition (1999)
Khosravi, H., Kabir, E.: Farsi font recognition based on Sobel-Roberts features. Pattern Recogn. Lett. 31(1), 75–82 (2010)
Koch, G., Zemel, R., Salakhutdinov, R.: Siamese neural networks for one-shot image recognition. In: ICML Deep Learning Workshop (2015)
Li, Q., Li, J.P., Chen, L.: A bezier curve-based font generation algorithm for character fonts. In: International Conference on High Performance Computing and Communications, pp. 1156–1159 (2018)
Liu, A.H., Liu, Y.C., Yeh, Y.Y., Wang, Y.C.F.: A unified feature disentangler for multi-domain image translation and manipulation. In: Advances in Neural Information Processing Systems, pp. 2590–2599 (2018)
Liu, Y., Wei, F., Shao, J., Sheng, L., Yan, J., Wang, X.: Exploring disentangled feature representation beyond face identification. In: Conference on Computer Vision and Pattern Recognition, pp. 2080–2089 (2018)
Amer, I.M., ElSayed, S., Mostafa, M.G.: Deep Arabic font family and font size recognition. Int. J. Comput. Appl. 176(4), 1–6 (2017)
Ma, H., Doermann, D.: Font identification using the grating cell texture operator. In: Document Recognition and Retrieval XII, vol. 5676, pp. 148–156. International Society for Optics and Photonics (2005)
Miyazaki, T., et al.: Automatic generation of typographic font from small font subset. IEEE Comput. Graph. Appl. 40, 99–111 (2019)
Moussa, S.B., Zahour, A., Benabdelhafid, A., Alimi, A.M.: New features using fractal multi-dimensions for generalized Arabic font recognition. Pattern Recogn. Lett. 31(5), 361–371 (2010)
Nguyen, H.T., Nguyen, C.T., Ino, T., Indurkhya, B., Nakagawa, M.: Text-independent writer identification using convolutional neural network. Pattern Recogn. Lett. 121, 104–112 (2019)
Öztürk, S.: Font clustering and cluster identification in document images. J. Electron. Imaging 10(2), 418 (2001)
Pal, U., Chaudhuri, B.B.: Identification of different script lines from multi-script documents. Image Vis. Comput. 20(13–14), 945–954 (2002)
Pan, W., Suen, C., Bui, T.D.: Script identification using steerable Gabor filters. In: International Conference on Document Analysis and Recognition (2005)
Press, O., Galanti, T., Benaim, S., Wolf, L.: Emerging disentanglement in auto-encoder based unsupervised image content transfer (2018)
Ruiz, V., Linares, I., Sanchez, A., Velez, J.F.: Off-line handwritten signature verification using compositional synthetic generation of signatures and siamese neural networks. Neurocomputing 374, 30–41 (2020)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: Visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. (2019)
Shinahara, Y., Karamatsu, T., Harada, D., Yamaguchi, K., Uchida, S.: Serif or sans: visual font analytics on book covers and online advertisements. In: International Conference on Document Analysis and Recognition, pp. 1041–1046 (2019)
Suveeranont, R., Igarashi, T.: Example-based automatic font generation. In: International Symposium on Smart Graphics, pp. 127–138 (2010)
Tan, T.: Rotation invariant texture features and their use in automatic script identification. IEEE Trans. Pattern Anal. Mach. Intell. 20(7), 751–756 (1998)
Uchida, S., Ide, S., Iwana, B.K., Zhu, A.: A further step to perfect accuracy by training CNN with larger data. In: International Conference on Frontiers in Handwriting Recognition, pp. 405–410 (2016)
Wang, Z., et al.: DeepFont: identify your font from an image. In: ACM International Conference on Multimedia, pp. 451–459 (2015)
Xing, Z.J., Wu, Y.C., Liu, C.L., Yin, F.: Offline signature verification using convolution siamese network. In: Yu, H., Dong, J. (eds.) International Conference on Graphic and Image Processing (2018)
Yang, Z., Yang, L., Qi, D., Suen, C.Y.: An EMD-based recognition method for Chinese fonts and styles. Pattern Recogn. Lett. 27(14), 1692–1701 (2006)
Zheng, Y., Ohyama, W., Iwana, B.K., Uchida, S.: Capturing micro deformations from pooling layers for offline signature verification. In: International Conference on Document Analysis and Recognition, pp. 1111–1116 (2019)

Acknowledgment

This work was supported by JSPS KAKENHI Grant Number JP17H06100.

Author information

Authors and Affiliations

Kyushu University, Fukuoka, Japan
Daichi Haraguchi, Shota Harada, Brian Kenji Iwana, Yuto Shinahara & Seiichi Uchida

Corresponding author

Correspondence to Daichi Haraguchi.

Editor information

Editors and Affiliations

Huazhong University of Science and Technology, Wuhan, China
Xiang Bai
Autonomous University of Barcelona, Barcelona, Spain
Dimosthenis Karatzas
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti

Appendix A: Font Identification Using a Dataset with Less Fancy Fonts

The dataset used in the above experiment contains many fancy fonts, so our evaluation may have overestimated font identification performance: fancy fonts are sometimes easy to identify by their distinctive appearance. We therefore use another font dataset, the Adobe Font Folio 11.1 (see Note 3). From this font set, we selected 1,132 fonts, comprising 511 Serif, 314 Sans Serif, 151 Serif-Sans Hybrid, 74 Script, 61 Historical Script, and (only) 21 Fancy fonts. This font type classification for the 1,132 fonts is given by [1]. We used the same neural network trained on the dataset of Sect. 4.1; that is, the network was trained on the fancy font dataset and tested on the Adobe dataset. For the evaluation, 367,900 positive pairs and 367,900 negative pairs were prepared from the 1,132 fonts. With the Adobe fonts as the test set, the identification accuracy was 88.33 ± 0.89%, lower than the 92.27% obtained on the original dataset. However, considering that formal fonts are often very similar to each other, we can still say that character-independent font identification is possible even for formal fonts.
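The bookkeeping behind these numbers can be sketched as follows. This is an illustrative sketch, not the authors' evaluation code: `pair_accuracy` and `summarize_runs` are hypothetical helpers, and treating the reported "±" as a sample standard deviation over repeated runs is an assumption, since the appendix does not say how it was computed. The font-type counts are taken from the text; the toy labels and predictions are made up.

```python
from statistics import mean, stdev

# Font-type counts reported for the Adobe Font Folio subset.
FONT_TYPE_COUNTS = {
    "Serif": 511,
    "Sans Serif": 314,
    "Serif-Sans Hybrid": 151,
    "Script": 74,
    "Historical Script": 61,
    "Fancy": 21,
}


def pair_accuracy(labels, predictions):
    """Fraction of pairs whose same-font (1) / different-font (0) label is predicted correctly."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def summarize_runs(accuracies):
    """Mean and sample standard deviation over runs (assumed meaning of the reported '±')."""
    return mean(accuracies), stdev(accuracies)


# The six categories account for all 1,132 selected fonts.
total_fonts = sum(FONT_TYPE_COUNTS.values())

# Toy balanced evaluation: 4 positive and 4 negative pairs, one positive misclassified.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0]
acc = pair_accuracy(labels, predictions)
```

Because the positive and negative sets have equal size (367,900 each in the appendix), a classifier that always answers "same font" or always answers "different font" would score 50%, which gives the baseline against which 88.33% should be read.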

Cite this paper

Haraguchi, D., Harada, S., Iwana, B.K., Shinahara, Y., Uchida, S. (2020). Character-Independent Font Identification. In: Bai, X., Karatzas, D., Lopresti, D. (eds) Document Analysis Systems. DAS 2020. Lecture Notes in Computer Science, vol 12116. Springer, Cham. https://doi.org/10.1007/978-3-030-57058-3_35

DOI: https://doi.org/10.1007/978-3-030-57058-3_35
Published: 14 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57057-6
Online ISBN: 978-3-030-57058-3
eBook Packages: Computer Science, Computer Science (R0)
