Abstract
Combining multiple classifiers can substantially improve the predictive performance of learning algorithms, especially in the presence of non-informative features in the data. The technique can also be used to estimate class membership probabilities. We propose an ensemble of k-Nearest Neighbours (kNN) classifiers for class membership probability estimation in the presence of non-informative features in the data. This is done in two steps. First, we select classifiers, based on their individual performance, from a set of base kNN models, each generated on a bootstrap sample using a random subset of features from the feature space of the training data. Second, a stepwise selection is applied to the selected learners, and those models that maximize the predictive performance of the ensemble are added to it. We evaluate our method on benchmark data sets augmented with non-informative features. Experimental comparison of the proposed method with the usual kNN, bagged kNN, random kNN and random forest shows that it yields high predictive performance, in terms of minimum Brier score, on most of the data sets. The results are also verified by simulation studies.
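The two-step procedure described above can be sketched in code. The following is a minimal illustration in Python with scikit-learn, not the authors' implementation; the use of a held-out validation set, the parameter names (n_models, n_select, n_features_sub) and the function names (build_esknn, predict_proba_esknn) are assumptions made for the example, and binary 0/1 labels are assumed. Base kNN learners are grown on bootstrap samples with random feature subsets, ranked by their individual Brier score, and then added to the ensemble one by one only if they improve the ensemble's Brier score.

```python
# Minimal sketch of the two-step kNN ensemble; illustrative only, names and
# defaults are assumptions, not the authors' code. Labels are assumed 0/1.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import brier_score_loss

def build_esknn(X_train, y_train, X_val, y_val,
                n_models=500, n_select=100, n_features_sub=5, k=5, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X_train.shape
    candidates = []
    # Step 1: grow base kNN learners on bootstrap samples with random feature
    # subsets and rank them by their individual Brier score on validation data.
    for _ in range(n_models):
        feats = rng.choice(p, size=min(n_features_sub, p), replace=False)
        boot = rng.choice(n, size=n, replace=True)
        model = KNeighborsClassifier(n_neighbors=k).fit(
            X_train[np.ix_(boot, feats)], y_train[boot])
        prob = model.predict_proba(X_val[:, feats])[:, 1]
        candidates.append((brier_score_loss(y_val, prob), model, feats))
    candidates.sort(key=lambda c: c[0])
    candidates = candidates[:n_select]

    # Step 2: stepwise selection -- add a ranked learner to the ensemble only
    # if it lowers the ensemble's Brier score on the validation data.
    ensemble, best_score = [], np.inf
    probs_sum = np.zeros(len(y_val))
    for _, model, feats in candidates:
        prob = model.predict_proba(X_val[:, feats])[:, 1]
        trial = (probs_sum + prob) / (len(ensemble) + 1)
        score = brier_score_loss(y_val, trial)
        if score < best_score:
            ensemble.append((model, feats))
            probs_sum += prob
            best_score = score
    return ensemble

def predict_proba_esknn(ensemble, X_new):
    # Class membership probabilities: average the members' probability estimates.
    probs = [m.predict_proba(X_new[:, f])[:, 1] for m, f in ensemble]
    return np.mean(probs, axis=0)
```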
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Gul, A. et al. (2016). Ensemble of Subset of k-Nearest Neighbours Models for Class Membership Probability Estimation. In: Wilhelm, A., Kestler, H. (eds) Analysis of Large and Complex Data. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-25226-1_35
DOI: https://doi.org/10.1007/978-3-319-25226-1_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25224-7
Online ISBN: 978-3-319-25226-1
eBook Packages: Mathematics and Statistics (R0)