Naïve Bayes with Higher Order Attributes

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3060)

Abstract

The popular Naïve Bayes (NB) algorithm is simple and fast. We present a new learning algorithm, Extended Bayes (EB), which is based on Naïve Bayes. EB remains relatively simple and achieves accuracy equal to or higher than that of NB on a wide variety of the UC-Irvine datasets. EB is based on two interacting ideas. The first is to find sets of seemingly dependent attributes and to add them as new attributes. The second is to exploit “zeroes”, that is, the negative evidence provided by attribute values that do not occur at all in particular classes in the training data. Naïve Bayes handles zeroes by smoothing. In contrast, EB uses them as evidence that a potential class labeling may be wrong.
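The interaction between the two ideas can be illustrated with a small sketch. This is not the EB algorithm itself: the abstract does not say how EB selects attribute sets or how it weighs zeroes, so the pair to join is supplied by hand here, and all function names (`add_pair_attributes`, `train_counts`, `zero_attributes`, `nb_log_score`) and the toy data are assumptions made for illustration. The sketch shows only that a joint attribute can expose a zero that the individual attributes hide, and how ordinary additive smoothing in NB would treat the same case.

```python
from collections import Counter, defaultdict
import math


def add_pair_attributes(rows, pairs):
    """Append one joint attribute (a tuple of two values) per index pair."""
    return [tuple(row) + tuple((row[i], row[j]) for i, j in pairs)
            for row in rows]


def train_counts(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> counts
    for row, label in zip(rows, labels):
        for k, value in enumerate(row):
            value_counts[(label, k)][value] += 1
    return class_counts, value_counts


def zero_attributes(row, label, value_counts):
    """Indices of attributes whose value never occurred with this class."""
    return [k for k, v in enumerate(row)
            if value_counts[(label, k)][v] == 0]


def nb_log_score(row, label, class_counts, value_counts, domain_sizes,
                 alpha=1.0):
    """Standard NB log score with additive (Laplace) smoothing."""
    n = class_counts[label]
    score = math.log(n / sum(class_counts.values()))
    for k, v in enumerate(row):
        count = value_counts[(label, k)][v]
        score += math.log((count + alpha) / (n + alpha * domain_sizes[k]))
    return score


# Toy XOR-like data (hypothetical): neither attribute alone separates
# the classes, but the pair of values does.
rows = [("a", "x"), ("b", "y"), ("a", "y"), ("b", "x")]
labels = ["pos", "pos", "neg", "neg"]
pairs = [(0, 1)]  # chosen by hand; EB's selection procedure is not shown

base_cc, base_vc = train_counts(rows, labels)
ext_cc, ext_vc = train_counts(add_pair_attributes(rows, pairs), labels)

query = ("a", "x")
ext_query = add_pair_attributes([query], pairs)[0]

zeros_base_neg = zero_attributes(query, "neg", base_vc)
zeros_ext_neg = zero_attributes(ext_query, "neg", ext_vc)
zeros_ext_pos = zero_attributes(ext_query, "pos", ext_vc)

# Domain sizes: 2 values per original attribute, 4 for the joint one.
domain_sizes = [2, 2, 4]
score_pos = nb_log_score(ext_query, "pos", ext_cc, ext_vc, domain_sizes)
score_neg = nb_log_score(ext_query, "neg", ext_cc, ext_vc, domain_sizes)
```

On the original attributes the query produces no zeroes for either class. Once the joint attribute is added, its value has never been seen with class `neg`, which is the kind of negative evidence the abstract describes. Smoothed NB merely lowers the `neg` score by a finite amount in this situation.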

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rosell, B., Hellerstein, L. (2004). Naïve Bayes with Higher Order Attributes. In: Tawfik, A.Y., Goodwin, S.D. (eds) Advances in Artificial Intelligence. Canadian AI 2004. Lecture Notes in Computer Science, vol 3060. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24840-8_8

  • DOI: https://doi.org/10.1007/978-3-540-24840-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22004-6

  • Online ISBN: 978-3-540-24840-8

  • eBook Packages: Springer Book Archive
