Abstract

Humans and computers do not speak the same language. This is one of the challenging problems for those working on the interface between humans and computers. This paper is an effort to summarize approaches that bring humans and computers a bit closer in terms of the interpretation of visual information. When we describe the outside world, we do so in language expressions, but what we see is in terms of pictures. A significant portion of human information is gathered through our visual channel, yet we communicate using language (text). However, this is not the case for computational systems. The interpretation of images in a computational process is generally in the form of attribute values, which cannot be directly correlated with words or concepts. Describing visual scenes in terms of phrases is the problem addressed in this paper. Recent research efforts focus on combining text and images for semantic image interpretation. We summarize some of these approaches and propose a conceptual framework for information extraction that combines both image and text.

Copyright information

© 2009 Indian Institute of Information Technology, India

About this paper

Cite this paper

Siddiqui, T.J., Tiwary, U.S. (2009). Words and Pictures: An HCI Perspective. In: Tiwary, U.S., Siddiqui, T.J., Radhakrishna, M., Tiwari, M.D. (eds) Proceedings of the First International Conference on Intelligent Human Computer Interaction. Springer, New Delhi. https://doi.org/10.1007/978-81-8489-203-1_4

  • DOI: https://doi.org/10.1007/978-81-8489-203-1_4

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-8489-404-2

  • Online ISBN: 978-81-8489-203-1

  • eBook Packages: Computer Science, Computer Science (R0)
