Statistical Approach to the Identification of Separation Surface for Spatial Data

Leung, Yee

doi:10.1007/978-3-642-02664-5_3

Part of the book series: Advances in Spatial Science (ADVSPATIAL)

Abstract

In spatial clustering, spatial objects are grouped into clusters according to their similarities. In terms of learning or pattern recognition, clustering identifies structures or classes through an unsupervised process. In terms of data mining, it is the discovery of intrinsic classes, particularly new classes, in spatial data: it formulates class structures and determines the number of classes. I have examined in Chap. 2 the importance of clustering as a means of unraveling interesting, useful and natural patterns in spatial data. Clustering generally does not address how to separate predetermined classes, how to determine whether classes are significantly different from each other, or how to assign new objects to given classes.

Another fundamental issue of spatial knowledge discovery is spatial classification. It deals essentially with the separation of pre-specified classes and the assignment of new spatial objects to these classes on the basis of measurements of selected features. In terms of learning or pattern recognition, it is a supervised learning process that searches for the decision surface that appropriately separates the various classes. In terms of data mining, it often involves discovering, from the training (learning) data set, classification rules that can separate distinct, genuine classes of spatial objects, and then assigning new spatial objects to these labeled classes. Whether the pre-specified classes are significantly different is usually not the main concern in classification; it can be determined by procedures such as the analysis of variance in statistics.
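To make the supervised setting concrete, the following is a minimal sketch of finding a linear separation surface from labeled training data and then assigning new objects to the pre-specified classes. It uses a two-class Fisher-style linear discriminant with equal priors purely as an illustration; it is not claimed to be the procedure developed in the chapter. The function names `fit_linear_discriminant` and `assign`, the two synthetic feature columns, and the class centers are all assumptions made for this example.

```python
import numpy as np


def fit_linear_discriminant(X, y):
    """Fit a two-class linear discriminant (illustrative, equal priors).

    Returns (w, b) such that w @ x + b > 0 assigns x to class 1.
    """
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance matrix of the training data.
    S = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (len(X) - 2)
    # Normal vector of the separating hyperplane.
    w = np.linalg.solve(S, m1 - m0)
    # The hyperplane passes through the midpoint of the class means.
    b = -0.5 * w @ (m0 + m1)
    return w, b


def assign(X, w, b):
    """Assign each row of X to class 0 or 1 by its side of the surface."""
    return (X @ w + b > 0).astype(int)


# Synthetic training set: two pre-specified classes measured on two
# hypothetical features (values are made up for illustration only).
rng = np.random.default_rng(0)
class0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
class1 = rng.normal(loc=[4.0, 4.0], scale=1.0, size=(100, 2))
X_train = np.vstack([class0, class1])
y_train = np.repeat([0, 1], 100)

w, b = fit_linear_discriminant(X_train, y_train)

# Assign new, unlabeled objects to the labeled classes.
new_objects = np.array([[0.0, 0.0], [4.0, 4.0]])
labels = assign(new_objects, w, b)
train_accuracy = (assign(X_train, w, b) == y_train).mean()
```

Unlike clustering, the class labels `y_train` are given in advance here; the fitted pair `(w, b)` only describes the surface that separates them and says nothing about whether the classes differ significantly.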

Author information

Authors and Affiliations

Dept. of Geography & Resource Management, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
Prof. Yee Leung

Corresponding author

Correspondence to Yee Leung.

About this chapter

Cite this chapter

Leung, Y. (2010). Statistical Approach to the Identification of Separation Surface for Spatial Data. In: Knowledge Discovery in Spatial Data. Advances in Spatial Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02664-5_3

DOI: https://doi.org/10.1007/978-3-642-02664-5_3
Published: 18 August 2009
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02663-8
Online ISBN: 978-3-642-02664-5
eBook Packages: Humanities, Social Sciences and Law; Literature, Cultural and Media Studies (R0)
