Abstract
Graph theory provides a powerful framework for representing and tackling machine learning problems such as clustering, semi-supervised learning, and feature ranking. This paper proposes a graph-based discrete differential operator for detecting and eliminating competence-critical instances and class-label noise from a training set, with the goal of improving classification performance. Extensive experiments on artificial and real-life classification problems substantiate the effectiveness of the proposed approach.
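The idea sketched in the abstract can be illustrated with a minimal example: build a k-nearest-neighbor graph over the training set and score each instance by a discrete difference of the label function along its graph edges. The scoring rule below (fraction of opposite-label neighbors) is an illustrative stand-in for the paper's actual differential operator, and all names (`knn_graph`, `filter_critical`, `tau`) are invented for this sketch.

```python
# Hedged sketch: graph-based label-noise filtering on a k-NN graph.
# The score used here (mean label disagreement over graph edges, a crude
# discrete "gradient" of the label function) is NOT the paper's operator,
# only an illustration of the general filtering scheme.
import numpy as np

def knn_graph(X, k):
    """Return, for each row of X, the indices of its k nearest neighbors."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # exclude self-edges
    return np.argsort(d, axis=1)[:, :k]

def filter_critical(X, y, k=3, tau=0.5):
    """Keep instances whose label agrees with most of their graph neighbors."""
    nbrs = knn_graph(X, k)
    # Discrete difference of the label function along graph edges:
    # score_i = mean over neighbors j of 1[y_i != y_j]
    score = (y[nbrs] != y[:, None]).mean(axis=1)
    keep = score <= tau
    return X[keep], y[keep], keep

# Toy example: two well-separated clusters with one mislabeled point.
X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [5., 5.], [5., 6.], [6., 5.],
              [0.5, 0.5]])
y = np.array([0, 0, 0, 1, 1, 1, 1])      # last point is label noise
Xf, yf, keep = filter_critical(X, y, k=3, tau=0.5)
# The mislabeled point (all three neighbors disagree, score 1.0) is removed.
```

A classifier such as 1-NN trained on `Xf, yf` would then avoid the decision-boundary distortion caused by the noisy instance.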
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marchiori, E. (2009). Graph-Based Discrete Differential Geometry for Critical Instance Filtering. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04174-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04173-0
Online ISBN: 978-3-642-04174-7