A Comparison of Approaches for Automated Text Extraction from Scholarly Figures

  • Falk Böschen
  • Ansgar Scherp
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10132)


To date, there has been no comparative evaluation of different approaches for text extraction from scholarly figures. To fill this gap, we define a generic text-extraction pipeline that abstracts from the existing approaches documented in the literature. In this paper, we use this generic pipeline to systematically evaluate and compare 32 configurations for text extraction over four datasets of scholarly figures of different origin and characteristics. In total, the experiments cover more than 400 manually labeled figures. The results show that, using the Ocropy OCR engine, the approach BS-4OS achieves the best F-measure for Text Location Detection (0.67) and the best average Levenshtein Distance between the recognized text and the gold standard (4.71) across all four datasets.
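The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the textbook definitions: unit-cost Levenshtein distance, and the F-measure as the harmonic mean of precision and recall. The abstract does not state how detected text locations are matched against the gold standard or whether the distance is normalized, so the function names and the count-based interface below are hypothetical and not the authors' implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, computed from match counts.

    How a detected text region counts as a match is an assumption left to
    the caller; the abstract does not specify the matching criterion.
    """
    detected = true_positives + false_positives
    expected = true_positives + false_negatives
    if true_positives == 0 or detected == 0 or expected == 0:
        return 0.0
    precision = true_positives / detected
    recall = true_positives / expected
    return 2 * precision * recall / (precision + recall)
```

For example, `levenshtein(recognized, gold)` would be computed per figure or text element and then averaged, while `f_measure` would be fed the counts of correctly detected, spurious, and missed text locations.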


Keywords: Scholarly figures, Text extraction, Comparison



This research was co-financed by the EU H2020 project MOVING under contract no. 693092.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Kiel University, Kiel, Germany
  2. ZBW - Leibniz Information Centre for Economics, Kiel, Germany
