Machine Learning, Volume 53, Issue 1–2, pp. 157–191

Improved Rooftop Detection in Aerial Images with Machine Learning

  • M.A. Maloof
  • P. Langley
  • T.O. Binford
  • R. Nevatia
  • S. Sage

Abstract

In this paper, we examine the use of machine learning to improve a rooftop detection process, one step in a vision system that recognizes buildings in overhead imagery. We review the problem of analyzing aerial images and describe an existing system that detects buildings in such images. We briefly review four algorithms that we selected to improve rooftop detection. The data sets were highly skewed and the cost of mistakes differed between the classes, so we used ROC analysis to evaluate the methods under varying error costs. We report three experiments designed to illuminate facets of applying machine learning to the image analysis task. One investigated learning with all available images to determine the best-performing method. Another focused on within-image learning, in which we derived training and testing data from the same image. A final experiment addressed between-image learning, in which training and testing sets came from different images. Results suggest that useful generalization occurred when training and testing on data derived from images differing in location and in aspect. They demonstrate that under most conditions, naive Bayes exceeded the accuracy of other methods and a handcrafted classifier, the solution currently used in the building detection system.

Keywords: supervised learning, learning for computer vision, evaluation of algorithms, applications of learning
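To make the evaluation strategy in the abstract concrete, the sketch below trains a Gaussian naive Bayes classifier on a synthetically skewed two-class data set and uses ROC analysis, rather than plain accuracy, to pick an operating point under unequal error costs. This is a minimal, hypothetical illustration assuming scikit-learn and NumPy; the feature count, class ratio, and cost ratio are invented for the example and are not the paper's rooftop data or code.

```python
# Sketch: ROC analysis of a naive Bayes classifier on a skewed data set.
# All numbers below (features, class ratio, costs) are illustrative only.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)

# Skewed two-class problem: few positive candidates amid many negatives.
n_pos, n_neg = 200, 1800
X = np.vstack([rng.normal(1.0, 1.0, size=(n_pos, 9)),   # "rooftop" candidates
               rng.normal(0.0, 1.0, size=(n_neg, 9))])  # non-rooftop candidates
y = np.concatenate([np.ones(n_pos), np.zeros(n_neg)])

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

clf = GaussianNB().fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]   # posterior for the positive class

fpr, tpr, thresholds = roc_curve(y_te, scores)
print("Area under the ROC curve:", roc_auc_score(y_te, scores))

# With unequal error costs, choose the threshold that minimizes expected cost
# rather than maximizing accuracy. Here a missed detection is assumed to be
# five times as costly as a false alarm (an arbitrary ratio for illustration).
cost_fn, cost_fp = 5.0, 1.0
p_pos = y_te.mean()
expected_cost = cost_fn * (1 - tpr) * p_pos + cost_fp * fpr * (1 - p_pos)
best = np.argmin(expected_cost)
print("Chosen threshold:", thresholds[best],
      "TPR:", tpr[best], "FPR:", fpr[best])
```

Sweeping the cost ratio and re-selecting the threshold traces how each classifier's preferred operating point moves along its ROC curve, which is the kind of comparison the abstract describes for the skewed rooftop data.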

Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • M.A. Maloof (1)
  • P. Langley (2)
  • T.O. Binford (3)
  • R. Nevatia (4)
  • S. Sage (2)

  1. Department of Computer Science, Georgetown University, Washington, USA
  2. Institute for the Study of Learning and Expertise, Palo Alto, USA
  3. Robotics Laboratory, Department of Computer Science, Stanford University, Stanford, USA
  4. Institute for Robotics and Intelligent Systems, School of Engineering, University of Southern California, Los Angeles, USA
