Text-Based Approaches for the Categorization of Images

  • Carl L. Sable
  • Vasileios Hatzivassiloglou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1696)


The rapid expansion of multimedia digital collections brings to the fore the need for classifying not only text documents but their embedded non-textual parts as well. We propose a model for basing classification of multimedia on broad, non-topical features, and show how information on targeted nearby pieces of text can be used to effectively classify photographs on a first such feature, distinguishing between indoor and outdoor images. We examine several variations to a TF*IDF-based approach for this task, empirically analyze their effects, and evaluate our system on a large collection of images from current news newsgroups. In addition, we investigate alternative classification and evaluation methods, and the effect that a secondary feature can have on indoor/outdoor classification. We obtain a classification accuracy of 82%, a number that clearly outperforms baseline estimates and competing image-based approaches and nears the accuracy of humans who perform the same task with access to comparable information.
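
The general idea of classifying an image from its nearby text with TF*IDF weights can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' system: the whitespace tokenizer, the summed per-category vectors, the dot-product score, the function names, and the four toy captions are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop list.
    return text.lower().split()


def idf_table(docs):
    # Inverse document frequency: log(N / number of documents containing the word).
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {word: math.log(n / count) for word, count in df.items()}


def tfidf(text, idf):
    # Term frequency times IDF; words unseen in training are ignored.
    tf = Counter(tokenize(text))
    return {word: freq * idf[word] for word, freq in tf.items() if word in idf}


def train(labeled):
    # Build one category vector per label by summing the TF*IDF vectors
    # of that label's training texts.
    idf = idf_table([text for text, _ in labeled])
    categories = {}
    for text, label in labeled:
        vector = categories.setdefault(label, Counter())
        for word, weight in tfidf(text, idf).items():
            vector[word] += weight
    return idf, categories


def classify(text, idf, categories):
    # Score each category by the dot product with the text's TF*IDF vector.
    vector = tfidf(text, idf)
    scores = {
        label: sum(w * cat.get(word, 0.0) for word, w in vector.items())
        for label, cat in categories.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy captions, invented for illustration only.
TRAINING = [
    ("the president speaks at a press conference in the office", "indoor"),
    ("delegates meet inside the conference hall", "indoor"),
    ("hikers walk along a mountain trail", "outdoor"),
    ("fans gather in the street outside the stadium", "outdoor"),
]

idf, categories = train(TRAINING)
print(classify("a crowd marches down the street", idf, categories))
print(classify("ministers sit in the conference room", idf, categories))
```

With real data, the training texts would be the captions or other text found near each labeled photograph, and the choices made here (weighting, normalization, which text spans to use) are exactly the kind of variations the abstract says were compared empirically.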


Information Retrieval, Proper Noun, Video Database, Probability Density Estimate, Category Vector
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Carl L. Sable (1)
  • Vasileios Hatzivassiloglou (1)
  1. Department of Computer Science, 450 Computer Science Building, Columbia University, New York, USA
