Automatic Annotation and Retrieval of Images

World Wide Web

Abstract

Although a variety of techniques have been developed for content-based image retrieval (CBIR), automatic image retrieval by semantics remains a challenging problem. We propose a novel approach to semantics-based image annotation and retrieval, built on the monotonic tree model. The branches of the monotonic tree of an image, termed structural elements, are classified and clustered according to low-level features such as color, spatial location, coarseness, and shape. Each cluster corresponds to a semantic feature, and the category keywords that name these semantic features are automatically attached to the images as annotations. The semantic features extracted from images then support high-level (semantics-based) querying and browsing. We apply our scheme to the analysis of scenery features. Experiments show that semantic features such as sky, building, trees, water wave, placid water, and ground can be effectively retrieved and located in images.
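The annotate-then-retrieve flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Monotonic tree extraction is omitted, so each structural element is assumed to arrive as a ready-made feature vector. The three-component features (hue, vertical position, coarseness), the prototype values, and the function names `classify`, `annotate`, `build_index`, and `search` are all hypothetical. A simple nearest-prototype rule stands in for the classification and clustering step.

```python
from collections import defaultdict
from math import dist

# Hypothetical prototypes, one per semantic feature:
# (mean hue, vertical position with 0 = top and 1 = bottom, coarseness).
PROTOTYPES = {
    "sky": (0.60, 0.15, 0.10),
    "trees": (0.33, 0.50, 0.80),
    "placid water": (0.58, 0.85, 0.10),
    "ground": (0.10, 0.90, 0.50),
}

def classify(element):
    """Return the keyword whose prototype is nearest to the feature vector."""
    return min(PROTOTYPES, key=lambda kw: dist(PROTOTYPES[kw], element))

def annotate(images):
    """Map each image id to the set of keywords of its structural elements."""
    return {img: {classify(e) for e in elems} for img, elems in images.items()}

def build_index(annotations):
    """Build an inverted index: keyword -> set of image ids."""
    index = defaultdict(set)
    for img, keywords in annotations.items():
        for kw in keywords:
            index[kw].add(img)
    return index

def search(index, *keywords):
    """Return the ids of images annotated with every given keyword."""
    hits = [index.get(kw, set()) for kw in keywords]
    return set.intersection(*hits) if hits else set()

# Made-up feature vectors standing in for extracted structural elements.
images = {
    "beach.jpg": [(0.61, 0.10, 0.12), (0.57, 0.80, 0.08)],
    "forest.jpg": [(0.35, 0.45, 0.75), (0.12, 0.92, 0.55)],
}
annotations = annotate(images)
index = build_index(annotations)
```

With this toy data, `search(index, "sky")` returns only `beach.jpg`, and `search(index, "trees", "ground")` returns only `forest.jpg`. Because the keywords are stored in an inverted index, a semantic query becomes a set intersection instead of a comparison of low-level features at query time.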



Cite this article

Song, Y., Wang, W. & Zhang, A. Automatic Annotation and Retrieval of Images. World Wide Web 6, 209–231 (2003). https://doi.org/10.1023/A:1023674722438
