Automatic Compound Figure Separation in Scientific Articles: A Study of Edge Map and Its Role for Stitched Panel Boundary Detection

  • A. Aafaque
  • K. C. Santosh
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 709)


We present a technique that uses an edge map to separate panels from stitched compound figures appearing in biomedical scientific research articles. Since such figures may comprise images from different imaging modalities, separating them is a critical first step for effective biomedical content-based image retrieval (CBIR). We study state-of-the-art edge detection algorithms for detecting gray-level pixel changes. The technique then applies a line vectorization process that connects prominent broken lines along the panel boundaries while eliminating insignificant line segments within the panels. We have validated our fully automatic technique on a subset of stitched multipanel biomedical figures extracted from articles within the Open Access subset of the PubMed Central repository, and have achieved precision and recall of 74.20% and 71.86%, respectively, in less than 0.272 s per image, on average.
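
The idea described above (detect gray-level changes, bridge broken lines along panel boundaries, discard short segments inside panels) can be illustrated with a minimal sketch. This is not the authors' implementation: the simple difference-based edge test stands in for the edge detectors studied in the paper, only vertical boundaries are handled, and the function names and the parameters `thresh`, `max_gap`, and `min_frac` are hypothetical choices made for illustration.

```python
def longest_bridged_run(flags, max_gap):
    """Length of the longest run of True values, treating gaps of at most
    max_gap consecutive False values as part of the run."""
    best = 0
    start = None  # index where the current bridged segment begins
    last = None   # index of the most recent True value
    for i, flag in enumerate(flags):
        if not flag:
            continue
        if last is None or i - last - 1 > max_gap:
            start = i  # gap too wide: begin a new segment
        last = i
        best = max(best, last - start + 1)
    return best


def vertical_boundaries(img, thresh=30, max_gap=3, min_frac=0.8):
    """Return the columns x where a near-full-height vertical edge separates
    column x-1 from column x. img is a list of rows of gray levels (0..255).
    All three parameters are illustrative assumptions, not published values."""
    height, width = len(img), len(img[0])
    boundaries = []
    for x in range(1, width):
        # Edge map for this column: large horizontal gray-level change.
        flags = [abs(img[y][x] - img[y][x - 1]) >= thresh for y in range(height)]
        # Keep the column only if its bridged line spans most of the height.
        if longest_bridged_run(flags, max_gap) >= min_frac * height:
            boundaries.append(x)
    return boundaries


def make_demo_figure():
    """Synthetic 20x40 stitched figure: a dark panel next to a bright panel."""
    img = [[50 if x < 20 else 200 for x in range(40)] for _ in range(20)]
    for y in (5, 6):
        img[y][20] = 50   # break the panel boundary for two rows
    for y in (8, 9, 10):
        img[y][10] = 120  # short, insignificant segment inside the left panel
    return img


print(vertical_boundaries(make_demo_figure()))  # [20]
```

In the synthetic figure, the two-row break in the boundary at column 20 is bridged because it is shorter than `max_gap`, while the three-pixel segment inside the left panel is rejected because it covers far less than `min_frac` of the image height.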


Automation; Edge detection; Stitched multipanel figures; Biomedical publications; Content-based image retrieval



Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. Department of Computer Science, The University of South Dakota, Vermillion, USA
