Statistics and Computing, Volume 7, Issue 1, pp 45–56

Applying classification algorithms in practice

  • Carla E. Brodley
  • Padhraic Smyth


In this paper we present a perspective on the overall process of developing classifiers for real-world classification problems. Specifically, we identify, categorize and discuss the various problem-specific factors that influence the development process. Illustrative examples are provided to demonstrate the iterative nature of the process of applying classification algorithms in practice. In addition, we present a case study of a large-scale classification application using the process framework described, providing an end-to-end example of the iterative nature of the application process. The paper concludes that the process of developing classification applications for operational use involves many factors not normally considered in the typical discussion of classification models and algorithms.

Keywords: classification algorithms, diagnostics, model selection





Copyright information

© Chapman and Hall 1997

Authors and Affiliations

  • Carla E. Brodley (1)
  • Padhraic Smyth (2)

  1. School of Electrical and Computer Engineering, Purdue University, West Lafayette, USA
  2. Information and Computer Science, University of California, Irvine, USA
