Journal of Intelligent Information Systems, Volume 4, Issue 1, pp 7–25

Automated analysis and exploration of image databases: Results, progress, and challenges


  • Usama M. Fayyad
    • Jet Propulsion Laboratory, California Institute of Technology
  • Padhraic Smyth
    • Jet Propulsion Laboratory, California Institute of Technology
  • Nicholas Weir
    • Astronomy Department, California Institute of Technology
  • S. Djorgovski
    • Astronomy Department, California Institute of Technology

DOI: 10.1007/BF00962819

Cite this article as:
Fayyad, U.M., Smyth, P., Weir, N. et al. J Intell Inf Syst (1995) 4: 7. doi:10.1007/BF00962819


In areas as diverse as earth remote sensing, astronomy, and medical imaging, image acquisition technology has undergone tremendous improvements in recent years. The vast amounts of scientific data are potential treasure-troves for scientific investigation and analysis. Unfortunately, advances in our ability to deal with this volume of data in an effective manner have not paralleled the hardware gains. While special-purpose tools for particular applications exist, there is a dearth of useful general-purpose software tools and algorithms that can assist a scientist in exploring large scientific image databases. This paper presents our recent progress in developing interactive semi-automated image database exploration tools based on pattern recognition and machine learning technology. We first present a completed and successful application that illustrates the basic approach: the SKICAT system used for the reduction and analysis of a 3-terabyte astronomical data set. SKICAT integrates techniques from image processing, data classification, and database management. It represents a system in which machine learning played a powerful and enabling role, and solved a difficult, scientifically significant problem. We then proceed to discuss the general problem of automated image database exploration, the particular aspects of image databases that distinguish them from other databases, and how these aspects affect the application of off-the-shelf learning algorithms to problems of this nature. A second large image database is used to ground this discussion: Magellan's images of the surface of the planet Venus. The paper concludes with a discussion of current and future challenges.


Machine Learning, Pattern Recognition, Automated Data Analysis, Astronomy, Sky Surveys, Image Processing, Large Image Databases

Copyright information

© Kluwer Academic Publishers 1995