Voting Massive Collections of Bayesian Network Classifiers for Data Streams

Bouckaert, Remco R.

doi:10.1007/11941439_28

Remco R. Bouckaert

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4304)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

We present a new method for voting over sets of Bayesian classifiers whose size is exponential in the number of attributes, using only polynomial time and polynomial memory. Training time is linear in the number of instances in the dataset, and training can be performed incrementally, which allows the collection to learn from massive data streams. The method offers flexibility in balancing computational complexity, memory requirements and classification performance. Unlike many other incremental Bayesian methods, all statistics kept in memory are used directly in classification.

Experimental results show that the classifiers perform well on both small and very large data sets, and that classification performance can be weighed against computational and memory costs.
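The abstract does not say how an exponentially large collection can be voted in polynomial time, so the following is only a minimal sketch of one way such a result can arise, not the method of the paper. It assumes a hypothetical collection containing one naive Bayes classifier for every subset of the attributes, with each member voting its unnormalised joint score. The sum over all 2^n members then factorises into a product over attributes, and the counts updated per instance are exactly the statistics used at classification time. The class name `SubsetVotedNaiveBayes`, its parameters, and the Laplace smoothing constant `alpha` are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations
from math import prod


class SubsetVotedNaiveBayes:
    """Illustrative only: votes over all 2^n naive Bayes models,
    one per attribute subset. Not the classifier from the paper."""

    def __init__(self, n_attributes, n_values, classes, alpha=1.0):
        self.n = n_attributes
        self.n_values = n_values      # assumed: same number of values per attribute
        self.classes = list(classes)
        self.alpha = alpha            # assumed Laplace smoothing constant
        self.class_counts = defaultdict(int)
        self.counts = defaultdict(int)  # (attribute index, value, class) -> count

    def update(self, x, c):
        """Incremental training: one pass over the attributes per instance."""
        self.class_counts[c] += 1
        for i, v in enumerate(x):
            self.counts[(i, v, c)] += 1

    def _prior(self, c):
        total = sum(self.class_counts.values())
        return (self.class_counts[c] + self.alpha) / (
            total + self.alpha * len(self.classes))

    def _cond(self, i, v, c):
        return (self.counts[(i, v, c)] + self.alpha) / (
            self.class_counts[c] + self.alpha * self.n_values)

    def _normalise(self, scores):
        z = sum(scores.values())
        return {c: s / z for c, s in scores.items()}

    def vote(self, x):
        """Sum over all subsets S of P(c) * prod_{i in S} P(x_i | c),
        computed in O(n) per class via the identity
        sum_S prod_{i in S} a_i = prod_i (1 + a_i)."""
        return self._normalise({
            c: self._prior(c)
            * prod(1.0 + self._cond(i, v, c) for i, v in enumerate(x))
            for c in self.classes
        })

    def vote_brute_force(self, x):
        """Same vote by explicit enumeration of all 2^n subsets (for checking)."""
        scores = {}
        for c in self.classes:
            total = 0.0
            for k in range(self.n + 1):
                for subset in combinations(range(self.n), k):
                    total += self._prior(c) * prod(
                        self._cond(i, x[i], c) for i in subset)
            scores[c] = total
        return self._normalise(scores)


# Usage: stream a few instances, then compare the two voting routes.
model = SubsetVotedNaiveBayes(n_attributes=3, n_values=2, classes=["a", "b"])
stream = [((0, 1, 1), "a"), ((1, 1, 0), "b"), ((0, 0, 1), "a"), ((1, 0, 0), "b")]
for instance, label in stream:
    model.update(instance, label)
print(model.vote((0, 1, 1)))
print(model.vote_brute_force((0, 1, 1)))
```

In this sketch the factorised vote and the brute-force enumeration agree up to floating-point error, while only the former stays linear in the number of attributes. Collections with richer dependency structure would need a different factorisation.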


Author information

Authors and Affiliations

Computer Science Department, University of Waikato, New Zealand
Remco R. Bouckaert

Editor information

Editors and Affiliations

DisPRR, National ICT Australia Ltd, QLD, Australia
Abdul Sattar
School of Computing, University of Tasmania, Sandy Bay, 7005, Tasmania, Australia
Byeong-ho Kang

About this paper

Cite this paper

Bouckaert, R.R. (2006). Voting Massive Collections of Bayesian Network Classifiers for Data Streams. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_28

DOI: https://doi.org/10.1007/11941439_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science, Computer Science (R0)
