Semi-automatic Image Annotation

  • Julia Moehrmann
  • Gunther Heidemann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8048)


High-quality ground truth data is essential for the development of image recognition systems. General-purpose datasets are widely used in research, but they are not suitable as training sets for specialized real-world recognition tasks. Manual annotation of custom ground truth datasets is expensive, but machine learning techniques can be applied to preprocess image data and facilitate annotation. We propose a semi-automatic image annotation process that clusters images according to similarity in a bag-of-features (BoF) approach, so that a whole cluster of images can be annotated in a single step. As the user annotates, the system continuously recalculates the clustering from the partial annotations provided so far, weighting BoF vector elements to increase intra-cluster similarity. Visualization of the top-weighted codebook elements allows users to estimate the quality of the annotations and of the recalculated clustering.
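The re-clustering idea can be illustrated with a small sketch. The abstract does not state how the weights are computed, so the code below assumes a simple stand-in heuristic: each BoF element is weighted by the ratio of between-class to within-class spread over the images labelled so far, and a plain k-means is then run under the weighted distance. The function names, the toy histograms, and the heuristic itself are illustrative assumptions, not the authors' method.

```python
import math
from collections import defaultdict

def weighted_distance(a, b, weights):
    """Euclidean distance with one weight per BoF element (codebook word)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def kmeans(vectors, k, weights, iterations=20):
    """Plain k-means under the weighted distance; the first k vectors seed the centers."""
    centers = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: weighted_distance(v, centers[c], weights))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old center if a cluster runs empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

def learn_weights(vectors, partial_labels, eps=1e-6):
    """Assumed heuristic: between-class spread / within-class spread per element."""
    dim = len(vectors[0])
    by_label = defaultdict(list)
    for idx, label in partial_labels.items():
        by_label[label].append(vectors[idx])
    labelled = [v for group in by_label.values() for v in group]
    overall = [sum(col) / len(labelled) for col in zip(*labelled)]
    weights = []
    for d in range(dim):
        within = between = 0.0
        for group in by_label.values():
            mean = sum(v[d] for v in group) / len(group)
            within += sum((v[d] - mean) ** 2 for v in group)
            between += len(group) * (mean - overall[d]) ** 2
        weights.append(between / (within + eps))
    total = sum(weights) or 1.0
    return [w * dim / total for w in weights]  # rescale so the mean weight is 1

# Toy BoF histograms (hypothetical data): element 0 separates the two classes,
# element 1 is a nuisance feature with large spread, element 2 is constant.
histograms = [
    [0.8, 0.0, 0.5],  # labelled "A"
    [0.9, 1.0, 0.5],  # labelled "A"
    [0.2, 0.0, 0.5],  # labelled "B"
    [0.1, 1.0, 0.5],  # labelled "B"
    [0.7, 1.0, 0.5],  # not yet annotated
    [0.3, 0.0, 0.5],  # not yet annotated
]
partial_labels = {0: "A", 1: "A", 2: "B", 3: "B"}

unweighted = kmeans(histograms, 2, [1.0, 1.0, 1.0])
weights = learn_weights(histograms, partial_labels)
reweighted = kmeans(histograms, 2, weights)
print("unweighted clusters:", unweighted)
print("learned weights:    ", weights)
print("reweighted clusters:", reweighted)
```

On this toy data the unweighted clustering is dominated by the nuisance element, whereas the learned weights concentrate on element 0, so the not-yet-annotated images fall into the cluster of the class they resemble. Inspecting the largest entries of `weights` is a crude analogue of the top-weighted codebook visualization mentioned in the abstract.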


Keywords: image annotation, semi-supervised clustering, pairwise constraints





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Julia Moehrmann (1)
  • Gunther Heidemann (1)
  1. Institute of Cognitive Science, University of Osnabrueck, Germany
