An Improved FloatBoost Algorithm for Naïve Bayes Text Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3739)

Abstract

Boosting is a supervised learning method that has been applied successfully in many domains and is among the best performers reported so far for text classification. FloatBoost adds a backtracking step after each AdaBoost iteration so that it minimizes the error rate directly, rather than minimizing an exponential function of the margin as the traditional AdaBoost algorithm does. This paper presents DifBoost, an improved FloatBoost algorithm for boosting Naïve Bayes text classification that combines the divide-and-conquer principle with FloatBoost. DifBoost divides the input space into a few sub-spaces during training, and the final classifier is a weighted combination of base classifiers, each of which is affected differently by the different sub-spaces. Extensive experiments on benchmarks are conducted, and the encouraging results show the effectiveness of the proposed algorithm.
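
The backtracking idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' DifBoost: the abstract gives no detail on how the input space is divided, so only a simplified FloatBoost-style loop is shown. The weak learner is assumed to be a weighted Bernoulli Naïve Bayes over 0/1 word-presence vectors, labels are assumed to be in {-1, +1}, and all function names (`train_nb`, `predict_nb`, `float_boost`, and so on) are hypothetical. After each AdaBoost-style round, the sketch removes a weak classifier whenever doing so yields a lower training error than the best seen so far for the smaller ensemble size.

```python
import math

def train_nb(X, y, w, smooth=1e-3):
    """Weighted Bernoulli Naive Bayes; X holds 0/1 word-presence vectors, y is in {-1, +1}."""
    model = {}
    for c in (-1, 1):
        wc = sum(wi for wi, yi in zip(w, y) if yi == c)  # weighted class mass
        # Smoothed weighted estimate of P(word j present | class c)
        p = [(sum(wi * x[j] for wi, x, yi in zip(w, X, y) if yi == c) + smooth)
             / (wc + 2 * smooth) for j in range(len(X[0]))]
        model[c] = (math.log(wc + 1e-12), p)
    return model

def predict_nb(model, x):
    def score(c):
        log_prior, p = model[c]
        return log_prior + sum(math.log(pj if xj else 1 - pj) for pj, xj in zip(p, x))
    return 1 if score(1) >= score(-1) else -1

def margin(ens, x):
    """Weighted vote of the ensemble; ens is a list of (alpha, model) pairs."""
    return sum(a * predict_nb(m, x) for a, m in ens)

def ensemble_error(ens, X, y):
    """Fraction of training examples the ensemble misclassifies."""
    wrong = sum((1 if margin(ens, x) >= 0 else -1) != yi for x, yi in zip(X, y))
    return wrong / len(X)

def float_boost(X, y, rounds=10):
    ens, best_err = [], {}  # best_err[m]: lowest error seen with m weak classifiers
    for _ in range(rounds):
        # AdaBoost-style weights, recomputed from the current ensemble margin
        w = [math.exp(-yi * margin(ens, x)) for x, yi in zip(X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
        model = train_nb(X, y, w)
        eps = sum(wi for wi, x, yi in zip(w, X, y) if predict_nb(model, x) != yi)
        eps = min(max(eps, 1e-10), 1 - 1e-10)  # keep the log finite
        ens.append((0.5 * math.log((1 - eps) / eps), model))
        best_err[len(ens)] = min(ensemble_error(ens, X, y), best_err.get(len(ens), 1.0))
        # Backtrack: drop the least useful weak classifier while that beats
        # the best error recorded for the smaller ensemble size.
        while len(ens) > 1:
            candidates = [ens[:i] + ens[i + 1:] for i in range(len(ens))]
            errs = [ensemble_error(c, X, y) for c in candidates]
            i = min(range(len(errs)), key=errs.__getitem__)
            if errs[i] >= best_err.get(len(ens) - 1, 1.0):
                break
            ens = candidates[i]
            best_err[len(ens)] = errs[i]
        if best_err[len(ens)] == 0:
            break  # training error is already zero
    return ens

# Toy usage: four "words", three documents per class (made-up data).
docs = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0],
        [0, 0, 1, 1], [0, 0, 1, 0], [0, 0, 0, 1]]
labels = [1, 1, 1, -1, -1, -1]
ensemble = float_boost(docs, labels, rounds=5)
print(len(ensemble), ensemble_error(ensemble, docs, labels))
```

The sketch runs a fixed number of rounds and judges removals by training error only; these are simplifications chosen for brevity, not details taken from the paper.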

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, X., Yin, J., Dong, J., Ghafoor, M.A. (2005). An Improved FloatBoost Algorithm for Naïve Bayes Text Classification. In: Fan, W., Wu, Z., Yang, J. (eds) Advances in Web-Age Information Management. WAIM 2005. Lecture Notes in Computer Science, vol 3739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563952_15

  • DOI: https://doi.org/10.1007/11563952_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29227-2

  • Online ISBN: 978-3-540-32087-6

  • eBook Packages: Computer Science, Computer Science (R0)
