
Supervised Classification Techniques

Chapter in Remote Sensing Digital Image Analysis

Abstract

Supervised classification is the technique most often used for the quantitative analysis of remote sensing image data. At its core is the concept of segmenting the spectral domain into regions that can be associated with the ground cover classes of interest to a particular application. In practice those regions may sometimes overlap. A variety of algorithms is available for the task, and it is the purpose of this chapter to cover those most commonly encountered. Essentially, the different methods vary in the way they identify and describe the regions in spectral space. Some seek a simple geometric segmentation while others adopt statistical models with which to associate spectral measurements and the classes of interest. Some can handle user-defined classes that overlap each other spatially and are referred to as soft classification methods; others generate firm boundaries between classes and are called hard classification methods, in the sense of establishing boundaries rather than having anything to do with difficulty in their use. Often the data from a set of sensors is available to help in the analysis task. Classification methods suited to multi-sensor or multi-source analysis are the subject of Chap. 12.
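The abstract's distinction between hard and soft classification can be made concrete with a small sketch. This is not from the chapter: the two-band class means and the softmax-over-distances membership model are illustrative assumptions standing in for the statistical class models (e.g. Gaussian) the chapter develops; it shows only how a soft classifier returns a membership weight per class while a hard classifier commits to a single label.

```python
import numpy as np

# Illustrative class means in a two-band spectral space (e.g. red, near-IR).
# These values are invented for the example.
class_means = {
    "water":      np.array([0.05, 0.02]),
    "vegetation": np.array([0.08, 0.50]),
    "soil":       np.array([0.30, 0.35]),
}

def soft_classify(pixel, means, beta=50.0):
    """Soft output: one membership weight per class, summing to 1.
    A softmax over negative squared Euclidean distances stands in here
    for a proper statistical class model."""
    d2 = np.array([np.sum((pixel - m) ** 2) for m in means.values()])
    w = np.exp(-beta * d2)
    return dict(zip(means.keys(), w / w.sum()))

def hard_classify(pixel, means):
    """Hard output: the single label whose region of spectral space
    contains the pixel (here, simply the nearest class mean)."""
    memberships = soft_classify(pixel, means)
    return max(memberships, key=memberships.get)

pixel = np.array([0.10, 0.45])            # a pixel near the vegetation mean
print(soft_classify(pixel, class_means))  # weights for all three classes
print(hard_classify(pixel, class_means))  # a single committed label
```

The hard classifier corresponds to drawing firm boundaries between the class regions; the soft output retains the overlap between regions that the abstract notes can occur in practice.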


Notes

  1. N.J. Nilsson, Learning Machines, McGraw-Hill, N.Y., 1965.

  2. The important distinction between information and spectral classes was first made in P.H. Swain and S.M. Davis, eds., Remote Sensing: The Quantitative Approach, McGraw-Hill, N.Y., 1978.

  3. J.E. Freund, Mathematical Statistics, 5th ed., Prentice Hall, N.J., 1992.

  4. See Appendix D.

  5. See Swain and Davis, loc. cit.

  6. See Swain and Davis, loc. cit., although some authors regard this as a conservatively high number of samples.

  7. See M. Pal and G.F. Foody, Feature selection for classification of hyperspectral data by SVM, IEEE Transactions on Geoscience and Remote Sensing, vol. 48, no. 5, May 2010, pp. 2297–2307, for a demonstration of the Hughes phenomenon with the support vector machine of Sect. 8.14.

  8. Based on results presented in K.S. Fu, D.A. Landgrebe and T.L. Phillips, Information processing of remotely sensed agricultural data, Proc. IEEE, vol. 57, no. 4, April 1969, pp. 639–653.

  9. G.F. Hughes, On the mean accuracy of statistical pattern recognizers, IEEE Transactions on Information Theory, vol. IT-14, no. 1, 1968, pp. 55–63.

  10. C.M. Bishop, Pattern Recognition and Machine Learning, Springer Science+Business Media, N.Y., 2006.

  11. A very good treatment of Gaussian mixture models will be found in C.M. Bishop, loc. cit.

  12. As in Sect. 8.3, using the logarithmic expression simplifies the analysis to follow.

  13. See K.B. Petersen and M.S. Pedersen, The Matrix Cookbook, 14 Nov 2008, http://matrixcookbook.com

  14. In C.M. Bishop, loc. cit., it is called the responsibility.

  15. See Bishop, loc. cit.

  16. See K.B. Petersen and M.S. Pedersen, loc. cit.

  17. For a good treatment of Lagrange multipliers see C.M. Bishop, loc. cit., Appendix E.

  18. It is possible to implement a minimum distance classifier using distance measures other than Euclidean: see A.G. Wacker and D.A. Landgrebe, Minimum distance classification in remote sensing, First Canadian Symposium on Remote Sensing, Ottawa, 1972.

  19. See N.J. Nilsson, Learning Machines, McGraw-Hill, N.Y., 1965.

  20. See B.V. Dasarathy, Nearest Neighbour (NN) Norms: NN Pattern Classification Techniques, IEEE Computer Society Press, Los Alamitos, California, 1991.

  21. F.A. Kruse, A.B. Lefkoff, J.W. Boardman, K.B. Heidebrecht, A.T. Shapiro, P.J. Barloon and A.F.H. Goetz, The spectral image processing system (SIPS)—interactive visualization and analysis of imaging spectrometer data, Remote Sensing of Environment, vol. 44, 1993, pp. 145–163.

  22. See Nilsson, loc. cit.

  23. J.A. Gualtieri and R.F. Cromp, Support vector machines for hyperspectral remote sensing classification, Proc. SPIE, vol. 3584, 1998, pp. 221–232.

  24. See Prob. 8.13.

  25. See C.M. Bishop, loc. cit., Appendix E.

  26. See Bishop, loc. cit.

  27. ibid.

  28. ibid.

  29. C.J.C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery, vol. 2, 1998, pp. 121–166.

  30. ibid.

  31. See Bishop, loc. cit., p. 296.

  32. K. Song, Tackling Uncertainties and Errors in the Satellite Monitoring of Forest Cover Change, Ph.D. Dissertation, The University of Maryland, 2010.

  33. See F. Melgani and L. Bruzzone, Classification of hyperspectral remote sensing images with support vector machines, IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 8, August 2004, pp. 1778–1790.

  34. See Prob. 8.7.

  35. See Melgani and Bruzzone, loc. cit.

  36. See N.J. Nilsson, Learning Machines, McGraw-Hill, N.Y., 1965.

  37. See T. Lee and J.A. Richards, A low cost classifier for multi-temporal applications, Int. J. Remote Sensing, vol. 6, 1985, pp. 1405–1417.

  38. All superscripts in this section are stage (iteration) indices and not powers.

  39. See C.M. Bishop, loc. cit.

  40. Nilsson, loc. cit.

  41. See Lee and Richards, loc. cit., for one approach.

  42. Y.H. Pao, Adaptive Pattern Recognition and Neural Networks, Addison-Wesley, Reading, Mass., 1989.

  43. These will be specified in the labelling of the training data pixels. The actual value taken by \( t_{k} \) will depend on how the set of outputs is used to represent classes. Each individual output could be a specific class indicator, e.g. 1 for class 1 and 0 for class 2, as with the \( y_{i} \) in (8.33); alternatively, some more complex coding of the outputs could be adopted. This is considered in Sect. 8.19.3.

  44. The conjugate gradient method can also be used: see J.A. Benediktsson, P.H. Swain and O.K. Ersoy, Conjugate-gradient neural networks in classification of multisource and very high dimensional remote sensing data, Int. J. Remote Sensing, vol. 14, 1993, pp. 2883–2903.

  45. Y.H. Pao, loc. cit.

  46. This is tantamount to deriving the algorithm with the error calculated over all pixels \( p \) in the training set, viz. \( E = \sum_{p} E_{p} \), where \( E_{p} \) is the error for a single pixel in (8.60).

  47. R.P. Lippmann, An introduction to computing with neural nets, IEEE ASSP Magazine, April 1987, pp. 4–22.

  48. This is known as the point spread function effect.

  49. For statistical context methods see P.H. Swain, S.B. Vardeman and J.C. Tilton, Contextual classification of multispectral image data, Pattern Recognition, vol. 13, 1981, pp. 429–441, and N. Khazenie and M.M. Crawford, A spatial–temporal autocorrelation model for contextual classification, IEEE Transactions on Geoscience and Remote Sensing, vol. 28, no. 4, July 1990, pp. 529–539.

  50. See P. Atkinson, J.L. Cushnie, J.R. Townshend and A. Wilson, Improving thematic map land cover classification using filtered data, Int. J. Remote Sensing, vol. 6, 1985, pp. 955–961.

  51. R.L. Kettig and D.A. Landgrebe, Classification of multispectral image data by extraction and classification of homogeneous objects, IEEE Transactions on Geoscience Electronics, vol. GE-14, no. 1, 1976, pp. 19–26.

  52. http://dynamo.ecn.purdue.edu/~biehl/MultiSpec/

  53. F.E. Townsend, The enhancement of computer classifications by logical smoothing, Photogrammetric Engineering and Remote Sensing, vol. 52, 1986, pp. 213–221.

  54. An alternative way of handling the full neighbourhood is to take the geometric mean of the neighbourhood contributions.

  55. J.A. Richards, D.A. Landgrebe and P.H. Swain, On the accuracy of pixel relaxation labelling, IEEE Transactions on Systems, Man and Cybernetics, vol. SMC-11, 1981, pp. 303–309.

  56. P. Gong and P.J. Howarth, Performance analyses of probabilistic relaxation methods for land-cover classification, Remote Sensing of Environment, vol. 30, 1989, pp. 33–42.

  57. T. Lee, Multisource context classification methods in remote sensing, Ph.D. Thesis, The University of New South Wales, Kensington, Australia, 1984.

  58. T. Lee and J.A. Richards, Pixel relaxation labelling using a diminishing neighbourhood effect, Proc. Int. Geoscience and Remote Sensing Symposium, IGARSS'89, Vancouver, 1989, pp. 634–637.

  59. Other simple examples will be found in Richards, Landgrebe and Swain, loc. cit.

  60. ibid.

  61. Full details of this example will be found in T. Lee and J.A. Richards, loc. cit.

  62. S. Peleg and A. Rosenfeld, A new probabilistic relaxation procedure, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-2, 1980, pp. 362–369.

  63. Although a little complex in view of the level of treatment here, see S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-6, no. 6, 1984, pp. 721–741.

  64. For the simple first order (four neighbour) neighbourhood the concept of cliques is not important, since there are only the four neighbourhood relationships.

  65. See J. Besag, On the statistical analysis of dirty pictures, J. Royal Statistical Society B, vol. 48, no. 3, 1986, pp. 259–302.

Author information

Correspondence to John A. Richards.


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Richards, J.A. (2013). Supervised Classification Techniques. In: Remote Sensing Digital Image Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30062-2_8

  • DOI: https://doi.org/10.1007/978-3-642-30062-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30061-5

  • Online ISBN: 978-3-642-30062-2

  • eBook Packages: Engineering (R0)
