Do We Need Annotation Experts? A Case Study in Celiac Disease Classification

  • Roland Kwitt
  • Sebastian Hegenbart
  • Nikhil Rasiwasia
  • Andreas Vécsei
  • Andreas Uhl
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8674)


Inference of clinically relevant findings from the visual appearance of images has become an essential part of processing pipelines for many problems in medical imaging. Typically, a sufficient amount of labeled training data, provided by domain experts, is assumed to be available. However, acquiring this data is usually time-consuming and expensive. In this work, we ask whether, for certain problems, expert knowledge is actually required. Specifically, we investigate the impact of letting non-expert volunteers annotate a database of endoscopy images, which are then used to assess the absence or presence of celiac disease. Contrary to previous approaches, we are not interested in algorithms that can handle label noise. Instead, we present compelling empirical evidence that label noise can be compensated for by a sufficiently large corpus of training data labeled by non-experts.
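
The central claim, that more noisily labeled training data can offset label noise, can be illustrated with a toy simulation. The sketch below is not the method, features, or data used in the paper: it assumes synthetic two-class Gaussian data, symmetric label flipping as a stand-in for non-expert annotation, and a plain nearest-centroid classifier. All function names and parameter values (`noise_rate`, `shift`, `sizes`, and so on) are hypothetical choices made for illustration.

```python
import random

def sample(n, dim, shift, rng):
    """Draw n points from two Gaussian classes with means -shift and +shift on every axis."""
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        x = [rng.gauss(y * shift, 1.0) for _ in range(dim)]
        data.append((x, y))
    return data

def flip_labels(data, noise_rate, rng):
    """Assumed noise model: flip each label independently with probability noise_rate."""
    return [(x, -y if rng.random() < noise_rate else y) for x, y in data]

def fit_centroids(data, dim):
    """Nearest-centroid classifier: one mean vector per (possibly noisy) label."""
    sums = {-1: [0.0] * dim, 1: [0.0] * dim}
    counts = {-1: 0, 1: 0}
    for x, y in data:
        counts[y] += 1
        for j, v in enumerate(x):
            sums[y][j] += v
    return {y: [s / max(counts[y], 1) for s in sums[y]] for y in sums}

def predict(centroids, x):
    """Return the label whose centroid is closest to x in squared Euclidean distance."""
    dist = {y: sum((a - b) ** 2 for a, b in zip(x, c)) for y, c in centroids.items()}
    return min(dist, key=dist.get)

def accuracy(centroids, data):
    """Fraction of points in data whose predicted label matches the given label."""
    return sum(predict(centroids, x) == y for x, y in data) / len(data)

def run(sizes=(20, 200, 2000), noise_rate=0.3, dim=20, shift=0.335, repeats=5, seed=0):
    """Mean accuracy on a clean test set for each noisy training-set size."""
    rng = random.Random(seed)
    test = sample(2000, dim, shift, rng)  # clean labels, used for evaluation only
    results = {}
    for n in sizes:
        accs = []
        for _ in range(repeats):
            train = flip_labels(sample(n, dim, shift, rng), noise_rate, rng)
            accs.append(accuracy(fit_centroids(train, dim), test))
        results[n] = sum(accs) / repeats
    return results

if __name__ == "__main__":
    for n, acc in run().items():
        print(f"noisy training size {n:5d}: mean clean-test accuracy {acc:.3f}")
```

In this toy setting the flipped labels pull both class centroids toward each other but do not change the direction that separates them, so a larger noisy sample mainly reduces estimation variance. Whether the same holds for real endoscopy images and real annotator errors is exactly the empirical question the paper addresses.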


Keywords: Celiac Disease, Local Binary Pattern, Image Representation, Training Corpus, Fisher Vector



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Roland Kwitt (1)
  • Sebastian Hegenbart (1)
  • Nikhil Rasiwasia (3)
  • Andreas Vécsei (2)
  • Andreas Uhl (1)

  1. Department of Computer Science, University of Salzburg, Austria
  2. St. Anna Children’s Hospital, Medical University Vienna, Austria
  3. Yahoo Labs! Bangalore, India
