Compound Figure Separation Combining Edge and Band Separator Detection

  • Conference paper
MultiMedia Modeling (MMM 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9516)

Abstract

We propose an image processing algorithm to automatically separate compound figures appearing in scientific articles. We classify compound images into two classes and apply different algorithms for detecting vertical and horizontal separators to each class: the edge-based algorithm aims at detecting visible edges between subfigures, whereas the band-based algorithm tries to detect whitespace separating subfigures (separator bands). The proposed algorithm has been evaluated on two datasets for compound figure separation (CFS) in the biomedical domain and compares well to semi-automatic or more comprehensive state-of-the-art approaches. Additional experiments investigate CFS effectiveness and classification accuracy of various classifier implementations.
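The band-based idea mentioned above (detecting whitespace that separates subfigures) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `find_separator_bands`, the parameters `white_level` and `min_width`, and their default values are assumptions made purely for illustration, and the sketch assumes an 8-bit grayscale image with a light background.

```python
import numpy as np


def find_separator_bands(gray, axis=0, white_level=240, min_width=3):
    """Find interior whitespace bands in a grayscale image (illustrative only).

    axis=0 scans rows and yields horizontal bands;
    axis=1 scans columns and yields vertical bands.
    Returns a list of (start, end) index pairs, with end exclusive.
    """
    gray = np.asarray(gray)
    # A row (or column) counts as blank if every pixel in it is near white.
    blank = (gray >= white_level).all(axis=1 - axis)

    # Collect maximal runs of consecutive blank rows (or columns).
    runs = []
    start = None
    for i, is_blank in enumerate(blank):
        if is_blank and start is None:
            start = i
        elif not is_blank and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(blank)))

    # Keep runs that are wide enough and do not touch the image border,
    # since border whitespace is a margin rather than a separator.
    return [
        (s, e)
        for s, e in runs
        if e - s >= min_width and s > 0 and e < len(blank)
    ]


# Synthetic compound figure: two dark panels side by side on a white canvas.
img = np.full((20, 30), 255, dtype=np.uint8)
img[2:18, 2:12] = 0
img[2:18, 17:27] = 0

vertical_bands = find_separator_bands(img, axis=1)
horizontal_bands = find_separator_bands(img, axis=0)
```

For the synthetic image, the only interior blank columns lie between the two panels, so a single vertical band is found and no horizontal band. Real figures would need tolerance for noise and non-white backgrounds, which this sketch does not attempt.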

Notes

  1. In recently published datasets drawn from open access biomedical literature, between 40 % and 60 % of figures occurring in articles are compound figures [1, 3, 4].

  2. We therefore call NLM’s approach [7] semi-automatic, although an automatic classifier could be easily integrated.

  3. Section 3 describes 17 internal parameters. A table with initial and optimized parameter values could not be included due to space constraints, but will be provided by authors upon request.

  4. The dataset reported in [1] contains 400 images with 1764 ground-truth subfigures, so reported recall may be up to 0.4 % higher if evaluated on the 398 images of the dataset available to us.

References

  1. Apostolova, E., You, D., Xue, Z., Antani, S., Demner-Fushman, D., Thoma, G.R.: Image retrieval from scientific publications: text and image content processing to separate multipanel figures. J. Assoc. Inf. Sci. Technol. 64(5), 893–908 (2013)

  2. Chatzichristofis, S.A., Boutalis, Y.S.: CEDD: color and edge directivity descriptor: a compact descriptor for image indexing and retrieval. In: Gasteratos, A., Vincze, M., Tsotsos, J.K. (eds.) ICVS 2008. LNCS, vol. 5008, pp. 312–322. Springer, Heidelberg (2008)

  3. Chhatkuli, A., Foncubierta-Rodríguez, A., Markonis, D., Meriaudeau, F., Müller, H.: Separating compound figures in journal articles to allow for subfigure classification. In: Proceedings of the SPIE, vol. 8674, pp. 86740J–86740J-12 (2013)

  4. García Seco de Herrera, A., Kalpathy-Cramer, J., Demner-Fushman, D., Antani, S., Müller, H.: Overview of the ImageCLEF 2013 medical tasks. CLEF 2013 Working Notes. CEUR Proc., vol. 1179 (2013). http://ceur-ws.org/Vol-1179/

  5. García Seco de Herrera, A., Müller, H., Bromuri, S.: Overview of the ImageCLEF 2015 medical classification task. CLEF 2015 Working Notes. CEUR Proc., vol. 1391 (2015). http://ceur-ws.org/Vol-1391/

  6. Kitanovski, I., Dimitrovski, I., Loskovska, S.: FCSE at medical tasks of ImageCLEF 2013. CLEF 2013 Working Notes. CEUR Proc., vol. 1179 (2013). http://ceur-ws.org/Vol-1179/

  7. Santosh, K., Xue, Z., Antani, S., Thoma, G.: NLM at ImageCLEF 2015: biomedical multipanel figure separation. CLEF 2015 Working Notes. CEUR Proc., vol. 1391 (2015). http://ceur-ws.org/Vol-1391/

  8. Shatkay, H., Chen, N., Blostein, D.: Integrating image data into biomedical text categorization. Bioinformatics 22(14), 446–453 (2006). http://dx.doi.org/10.1093/bioinformatics/btl235

  9. Simpson, M.S., You, D., Rahman, M.M., Xue, Z., Demner-Fushman, D., Antani, S., Thoma, G.: Literature-based biomedical image classification and retrieval. Comput. Med. Imag. Graph. 39, 3–13 (2015)

  10. Taschwer, M., Marques, O.: AAUITEC at ImageCLEF 2015: Compound figure separation. CLEF 2015 Working Notes. CEUR Proc., vol. 1391 (2015). http://ceur-ws.org/Vol-1391/

  11. Yuan, X., Ang, D.: A novel figure panel classification and extraction method for document image understanding. Int. J. Data Min. Bioinform. 9(1), 22–36 (2014). http://dx.doi.org/10.1504/IJDMB.2014.057779

Acknowledgements

We thank Sameer Antani (NLM) and the authors of [1] for providing their compound figure separation dataset for evaluation purposes, and Laszlo Böszörmenyi (ITEC, AAU) for valuable discussions and comments on this work.

Author information

Corresponding author

Correspondence to Mario Taschwer.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Taschwer, M., Marques, O. (2016). Compound Figure Separation Combining Edge and Band Separator Detection. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_14

  • DOI: https://doi.org/10.1007/978-3-319-27671-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27670-0

  • Online ISBN: 978-3-319-27671-7

  • eBook Packages: Computer Science, Computer Science (R0)
