Abstract
In areas as diverse as Earth remote sensing, astronomy, and medical imaging, image acquisition technology has undergone tremendous improvements in recent years. The vast amounts of scientific data that result are potential treasure troves for scientific investigation and analysis. Unfortunately, advances in our ability to deal with this volume of data effectively have not kept pace with the hardware gains. While special-purpose tools for particular applications exist, there is a dearth of useful general-purpose software tools and algorithms that can assist a scientist in exploring large scientific image databases. This paper presents our recent progress in developing interactive, semi-automated image database exploration tools based on pattern recognition and machine learning technology. We first present a completed and successful application that illustrates the basic approach: the SKICAT system, used for the reduction and analysis of a 3-terabyte astronomical data set. SKICAT integrates techniques from image processing, data classification, and database management. It represents a system in which machine learning played a powerful and enabling role and solved a difficult, scientifically significant problem. We then discuss the general problem of automated image database exploration, the particular aspects of image databases that distinguish them from other databases, and how these aspects affect the application of off-the-shelf learning algorithms to problems of this nature. A second large image database is used to ground this discussion: Magellan's images of the surface of the planet Venus. The paper concludes with a discussion of current and future challenges.
Cite this article
Fayyad, U.M., Smyth, P., Weir, N. et al. Automated analysis and exploration of image databases: Results, progress, and challenges. J Intell Inf Syst 4, 7–25 (1995). https://doi.org/10.1007/BF00962819
DOI: https://doi.org/10.1007/BF00962819