Abstract
Boosting is a supervised learning method that has been applied successfully in many domains and has proven to be one of the best performers in text classification to date. FloatBoost learning applies a backtrack mechanism after each iteration of AdaBoost learning to minimize the error rate directly, rather than minimizing an exponential function of the margin as the traditional AdaBoost algorithm does. This paper presents DifBoost, an improved FloatBoost algorithm for boosting Naïve Bayes text classification, which combines the divide-and-conquer principle with the FloatBoost algorithm. DifBoost divides the input space into a few sub-spaces during the training process, and the final classifier is formed as a weighted combination of basic classifiers, where each basic classifier is affected differently by the different sub-spaces. Extensive experiments on benchmarks are conducted, and the encouraging results show the effectiveness of the proposed algorithm.
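To make the ingredients named in the abstract concrete, here is a minimal Python sketch of AdaBoost over a weighted multinomial Naïve Bayes classifier, followed by a simplified backtrack step in the spirit of FloatBoost. It is not the authors' DifBoost: the sub-space division is omitted because the abstract gives no details of it, and the backtrack rule used here (drop an earlier member whenever that lowers the training error) is an assumed simplification. All function names are hypothetical, labels are assumed to be +1 or -1, and documents are assumed to be lists of tokens.

```python
import math

def train_weighted_nb(docs, labels, weights, vocab_size):
    """Fit a multinomial Naive Bayes classifier on weighted documents."""
    prior = {1: 0.0, -1: 0.0}
    counts = {1: {}, -1: {}}
    for doc, y, w in zip(docs, labels, weights):
        prior[y] += w
        for tok in doc:
            counts[y][tok] = counts[y].get(tok, 0.0) + w
    total = {c: sum(counts[c].values()) for c in (1, -1)}

    def classify(doc):
        scores = {}
        for c in (1, -1):
            s = math.log(prior[c] + 1e-12)
            for tok in doc:
                # Laplace smoothing over the weighted token counts.
                s += math.log((counts[c].get(tok, 0.0) + 1.0)
                              / (total[c] + vocab_size))
            scores[c] = s
        return 1 if scores[1] >= scores[-1] else -1
    return classify

def ensemble_score(ensemble, doc):
    """Weighted vote F(x) = sum of alpha_m * h_m(x)."""
    return sum(alpha * h(doc) for alpha, h in ensemble)

def predict(ensemble, doc):
    return 1 if ensemble_score(ensemble, doc) >= 0 else -1

def error_rate(ensemble, docs, labels):
    wrong = sum(predict(ensemble, d) != y for d, y in zip(docs, labels))
    return wrong / len(docs)

def boost_with_backtrack(docs, labels, rounds=5):
    n = len(docs)
    vocab_size = len({tok for doc in docs for tok in doc})
    ensemble = []
    for _ in range(rounds):
        # AdaBoost weights: proportional to exp(-y * F(x)) for the current F.
        raw = [math.exp(-y * ensemble_score(ensemble, d))
               for d, y in zip(docs, labels)]
        z = sum(raw)
        weights = [r / z for r in raw]
        # Scale weights to sum to n so smoothing does not swamp the counts.
        h = train_weighted_nb(docs, labels, [w * n for w in weights], vocab_size)
        err = sum(w for w, d, y in zip(weights, docs, labels) if h(d) != y)
        if err >= 0.5:
            break  # weak learner is no better than chance
        err = max(err, 1e-10)  # avoid division by zero for a perfect learner
        ensemble.append((0.5 * math.log((1 - err) / err), h))
        # Simplified backtrack (assumption): remove an earlier member while
        # doing so strictly lowers the training error of the ensemble.
        while len(ensemble) > 1:
            current = error_rate(ensemble, docs, labels)
            candidates = [ensemble[:i] + ensemble[i + 1:]
                          for i in range(len(ensemble) - 1)]
            best = min(candidates, key=lambda e: error_rate(e, docs, labels))
            if error_rate(best, docs, labels) < current:
                ensemble = best
            else:
                break
        if error_rate(ensemble, docs, labels) == 0.0:
            break
    return ensemble

# Toy usage with made-up documents: +1 = sports, -1 = finance.
docs = [["goal", "match", "team"], ["team", "win", "goal"],
        ["stock", "market", "price"], ["price", "market", "fall"]]
labels = [1, 1, -1, -1]
ensemble = boost_with_backtrack(docs, labels)
```

Recomputing the example weights from the current ensemble at the start of each round keeps them consistent after a member has been removed, which is why the sketch does not carry weights over from one round to the next.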
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, X., Yin, J., Dong, J., Ghafoor, M.A. (2005). An Improved FloatBoost Algorithm for Naïve Bayes Text Classification. In: Fan, W., Wu, Z., Yang, J. (eds) Advances in Web-Age Information Management. WAIM 2005. Lecture Notes in Computer Science, vol 3739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563952_15
DOI: https://doi.org/10.1007/11563952_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29227-2
Online ISBN: 978-3-540-32087-6
eBook Packages: Computer Science (R0)