Bag–of–Colors for Biomedical Document Image Classification

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7723)


The number of biomedical publications has increased noticeably in the last 30 years. Clinicians and medical researchers regularly have unmet information needs, yet finding publications relevant to a clinical situation requires more search time than is usually available. The techniques described in this article classify images from the open access biomedical literature into categories, which can potentially reduce search time. Only the visual information of the images is used for classification, based on the ImageCLEF 2011 benchmark database created for the tasks of image classification and image retrieval. In particular, we evaluate the importance of color in addition to the frequently used texture and grey level features.

Results show that bags–of–colors combined with the Scale Invariant Feature Transform (SIFT) provide an image representation that improves classification quality. Accuracy improved from 69.75% for the best system in ImageCLEF 2011 using only visual information to 72.5% for the system described in this paper, a gain of 2.75 percentage points. The results highlight the importance of color for the classification of biomedical images.
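As a rough illustration of the idea behind a bag–of–colors representation, the sketch below assigns each pixel to its nearest entry in a small color vocabulary and builds a normalized histogram. It is a minimal sketch under stated assumptions, not the method of the paper: the three-color `palette`, the use of raw RGB values, the function names `bag_of_colors` and `fuse`, and the concatenation with a placeholder SIFT histogram are all hypothetical choices made for illustration. In practice the vocabulary would typically be learned from training images, for instance by clustering.

```python
from math import dist


def bag_of_colors(pixels, palette):
    """Return an L1-normalized histogram of nearest-palette-color assignments."""
    counts = [0] * len(palette)
    for pixel in pixels:
        # Index of the palette color closest to this pixel (Euclidean distance).
        nearest = min(range(len(palette)), key=lambda i: dist(pixel, palette[i]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(palette)


def fuse(color_hist, sift_hist):
    """Combine two descriptors by concatenation (an assumed fusion strategy)."""
    return list(color_hist) + list(sift_hist)


# Hypothetical three-color vocabulary and a tiny four-pixel "image" in RGB.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
pixels = [(10, 10, 10), (250, 250, 250), (240, 5, 5), (200, 30, 30)]

color_hist = bag_of_colors(pixels, palette)
# [0.5, 0.5] is a stand-in for a SIFT bag-of-visual-words histogram.
feature = fuse(color_hist, [0.5, 0.5])
```

The fused vector could then be passed to any standard classifier; which classifier and fusion rule work best is an empirical question that this sketch does not address.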


bag–of–colors, SIFT, image categorization, ImageCLEF





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. University of Applied Sciences Western Switzerland (HES–SO), Sierre, Switzerland
