International Journal of Computer Vision, Volume 18, Issue 3, pp 233–254

Photobook: Content-based manipulation of image databases

  • A. Pentland
  • R. W. Picard
  • S. Sclaroff


We describe the Photobook system, a set of interactive tools for browsing and searching images and image sequences. These query tools differ from those used in standard image databases in that they make direct use of the image content rather than relying on text annotations. Direct search on image content is made possible by semantics-preserving image compression, which reduces images to a small set of perceptually significant coefficients. We discuss three types of Photobook descriptions in detail: one that allows search based on appearance, one that uses 2-D shape, and a third that allows search based on textural properties. These image content descriptions can be combined with each other and with text-based descriptions to provide a sophisticated browsing and search capability. In this paper we demonstrate Photobook on databases containing images of people, video keyframes, hand tools, fish, texture swatches, and 3-D medical data.
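The idea of searching on a small set of coefficients, and of combining several content descriptions with text, can be pictured with a short sketch. This is an illustration under stated assumptions, not the Photobook implementation: the abstract does not say how coefficients are compared or combined, so the Euclidean distance, the weighted sum, the tag filter, and every name below (`ImageRecord`, `search`, the sample data) are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """One database entry (hypothetical layout): a name, one short
    coefficient vector per description type, and optional text tags."""
    name: str
    coeffs: dict  # e.g. {"shape": [...], "texture": [...]}
    tags: set = field(default_factory=set)


def distance(a, b):
    """Euclidean distance between two equal-length coefficient vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(database, query, weights, required_tags=frozenset()):
    """Rank records by a weighted sum of per-description distances.

    Records missing any required tag are dropped first; this stands in
    for combining content-based search with text annotations.
    """
    candidates = [r for r in database if set(required_tags) <= r.tags]

    def score(record):
        return sum(w * distance(record.coeffs[k], query.coeffs[k])
                   for k, w in weights.items())

    return sorted(candidates, key=score)


# Made-up coefficient vectors, purely for illustration.
database = [
    ImageRecord("fish_01", {"shape": [0.9, 0.1], "texture": [0.2, 0.3]}, {"fish"}),
    ImageRecord("fish_02", {"shape": [0.8, 0.2], "texture": [0.7, 0.9]}, {"fish"}),
    ImageRecord("tool_01", {"shape": [0.1, 0.9], "texture": [0.2, 0.3]}, {"tool"}),
]
query = ImageRecord("query", {"shape": [0.85, 0.15], "texture": [0.2, 0.3]})

ranked = search(database, query, {"shape": 1.0, "texture": 0.5}, {"fish"})
print([r.name for r in ranked])  # ['fish_01', 'fish_02']
```

Because each image is represented by only a few numbers per description, a query of this kind compares short vectors rather than pixels, which is what makes interactive browsing plausible in the first place.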



Copyright information

© Kluwer Academic Publishers 1996
