Abstract
Graph theory provides a powerful framework for representing and tackling machine learning problems such as clustering, semi-supervised learning, and feature ranking. This paper proposes a graph-based discrete differential operator for detecting and eliminating competence-critical instances and class-label noise from a training set, with the goal of improving classification performance. Extensive experiments on artificial and real-life classification problems substantiate the effectiveness of the proposed approach.
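The idea sketched in the abstract can be illustrated with a minimal example: build a k-nearest-neighbor graph over the training set and score each instance by a discrete difference of the label function along its graph edges. The scoring rule below (fraction of opposite-label neighbors) is an illustrative stand-in for the paper's actual differential operator, and all names (`knn_graph`, `filter_critical`, `tau`) are invented for this sketch.

```python
# Hedged sketch: graph-based label-noise filtering on a k-NN graph.
# The score used here (mean label disagreement over graph edges, a crude
# discrete "gradient" of the label function) is NOT the paper's operator,
# only an illustration of the general filtering scheme.
import numpy as np

def knn_graph(X, k):
    """Return, for each row of X, the indices of its k nearest neighbors."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # exclude self-edges
    return np.argsort(d, axis=1)[:, :k]

def filter_critical(X, y, k=3, tau=0.5):
    """Keep instances whose label agrees with most of their graph neighbors."""
    nbrs = knn_graph(X, k)
    # Discrete difference of the label function along graph edges:
    # score_i = mean over neighbors j of 1[y_i != y_j]
    score = (y[nbrs] != y[:, None]).mean(axis=1)
    keep = score <= tau
    return X[keep], y[keep], keep

# Toy example: two well-separated clusters with one mislabeled point.
X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [5., 5.], [5., 6.], [6., 5.],
              [0.5, 0.5]])
y = np.array([0, 0, 0, 1, 1, 1, 1])      # last point is label noise
Xf, yf, keep = filter_critical(X, y, k=3, tau=0.5)
# The mislabeled point (all three neighbors disagree, score 1.0) is removed.
```

A classifier such as 1-NN trained on `Xf, yf` would then avoid the decision-boundary distortion caused by the noisy instance.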
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marchiori, E. (2009). Graph-Based Discrete Differential Geometry for Critical Instance Filtering. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04174-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04173-0
Online ISBN: 978-3-642-04174-7