Abstract
Statistical pattern recognition traditionally relies on feature-based representations. In many applications, however, no such vector representation is available and only proximity data (distances, dissimilarities, similarities, ranks, etc.) are at hand. In this paper, we consider discriminant analysis on dissimilarity data from a particular point of view. Our approach is inspired by the Gaussian classifier: we define decision rules that mimic the behavior of a linear or a quadratic classifier. The number of parameters is limited (two per class). Numerical experiments on artificial and real data show favorable behavior compared with Support Vector Machines and the kNN classifier: (a) lower or equivalent error rate, (b) equivalent CPU time, (c) greater robustness with sparse dissimilarity data.
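To make the idea of a Gaussian-like rule on dissimilarities concrete, here is a minimal sketch. It is not the decision rule of the paper, which the abstract does not spell out. It assumes the dissimilarities behave like Euclidean distances, so that the squared distance from an object to an implicit class center equals the mean squared dissimilarity to the class members minus the within-class inertia. The function names, the `dim` parameter, and the choice of a prior and an inertia as the per-class quantities are all assumptions made for illustration.

```python
import math

def mean_sq_dissim_to_class(d_row, members):
    """Average squared dissimilarity from one object to the members of a class."""
    return sum(d_row[j] ** 2 for j in members) / len(members)

def class_inertia(D, members):
    """Within-class inertia: sum of squared pairwise dissimilarities / (2 n^2)."""
    n = len(members)
    return sum(D[i][j] ** 2 for i in members for j in members) / (2 * n * n)

def fit(D, labels):
    """Estimate, per class, a prior and a variance-like inertia (assumed parameters)."""
    model = {}
    for c in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == c]
        model[c] = {
            "members": members,
            "prior": len(members) / len(labels),
            "inertia": class_inertia(D, members),
        }
    return model

def score(d_row, params, dim=1):
    """Quadratic-classifier-like score; lower means a better fit to the class."""
    # Squared distance to the implicit class center (exact for Euclidean data).
    d2 = mean_sq_dissim_to_class(d_row, params["members"]) - params["inertia"]
    # Spherical variance; `dim` is a hypothetical intrinsic dimension.
    var = max(params["inertia"] / dim, 1e-12)
    return d2 / var + dim * math.log(var) - 2 * math.log(params["prior"])

def predict(d_row, model, dim=1):
    """Assign the object to the class with the smallest score."""
    return min(model, key=lambda c: score(d_row, model[c], dim))

# Toy usage: four training objects on a line, described only by their distances.
points = [0.0, 1.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
D = [[abs(a - b) for b in points] for a in points]
model = fit(D, labels)
new_row = [abs(0.4 - p) for p in points]  # dissimilarities of a new object
print(predict(new_row, model))
```

Only the dissimilarity matrix `D` and the row of dissimilarities for the new object are used, so no feature vectors are required. Dropping the log-variance term and sharing one inertia across classes would give a linear-classifier-like variant of the same sketch.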
Acknowledgments
The authors sincerely thank Gilles Celeux, who inspired this work, for the fruitful and valuable scientific discussions that allowed this study to be carried out.
Additional information
This work was supported by grants from the “Fonds National pour la Science”, under the program “ACI Masse de Données” and the project “DataHighDim”.
Cite this article
Manolova, A., Guérin-Dugué, A. Classification of dissimilarity data with a new flexible Mahalanobis-like metric. Pattern Anal Applic 11, 337–351 (2008). https://doi.org/10.1007/s10044-008-0101-6
DOI: https://doi.org/10.1007/s10044-008-0101-6