Highly Scalable Attribute Selection for Averaged One-Dependence Estimators

Chen, Shenglei; Martinez, Ana M.; Webb, Geoffrey I.

doi:10.1007/978-3-319-06605-9_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8444)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Averaged One-Dependence Estimators (AODE) is a popular and effective approach to Bayesian learning. In this paper, a new attribute selection approach is proposed for AODE. It can search in a large model space, while it requires only a single extra pass through the training data, resulting in a computationally efficient two-pass learning algorithm. The experimental results indicate that the new technique significantly reduces AODE’s bias at the cost of a modest increase in training time. Its low bias and computational efficiency make it an attractive algorithm for learning from big data.
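
For orientation, the base classifier that the abstract builds on can be sketched as follows. This is a minimal illustration of plain AODE in its standard formulation, where each sufficiently frequent attribute value acts in turn as the sole parent of all other attributes and the resulting one-dependence estimates are averaged. It is not the attribute selection method proposed in this paper, whose details the abstract does not give. The class name `AODE`, the parameter `min_count`, and the use of Laplace smoothing are assumptions made for the example.

```python
from collections import defaultdict


class AODE:
    """Illustrative plain AODE for categorical attributes (hypothetical names)."""

    def __init__(self, min_count=1):
        # Assumed frequency threshold: a parent value must occur at least this often.
        self.min_count = min_count

    def fit(self, X, y):
        # A single pass over the training data collects all required counts.
        self.n = len(X)
        self.classes = sorted(set(y))
        self.values = [sorted({row[i] for row in X}) for i in range(len(X[0]))]
        self.parent = defaultdict(int)  # (i, x_i) -> count
        self.joint = defaultdict(int)   # (y, i, x_i) -> count
        self.pair = defaultdict(int)    # (y, i, x_i, j, x_j) -> count
        for row, c in zip(X, y):
            for i, xi in enumerate(row):
                self.parent[(i, xi)] += 1
                self.joint[(c, i, xi)] += 1
                for j, xj in enumerate(row):
                    if j != i:
                        self.pair[(c, i, xi, j, xj)] += 1
        return self

    def joint_scores(self, row):
        """Average, over eligible parents i, of P(y, x_i) * prod_j P(x_j | y, x_i)."""
        k = len(self.classes)
        scores = {}
        for c in self.classes:
            total, used = 0.0, 0
            for i, xi in enumerate(row):
                if self.parent.get((i, xi), 0) < self.min_count:
                    continue  # skip parent values that are too rare
                used += 1
                n_ci = self.joint.get((c, i, xi), 0)
                # Laplace-smoothed estimate of P(y, x_i) (smoothing is an assumption)
                p = (n_ci + 1) / (self.n + k * len(self.values[i]))
                for j, xj in enumerate(row):
                    if j != i:
                        n_cij = self.pair.get((c, i, xi, j, xj), 0)
                        # Laplace-smoothed estimate of P(x_j | y, x_i)
                        p *= (n_cij + 1) / (n_ci + len(self.values[j]))
                total += p
            # With no eligible parent, AODE is usually described as falling back
            # to naive Bayes; this sketch simply returns 0.0 in that case.
            scores[c] = total / used if used else 0.0
        return scores

    def predict(self, row):
        scores = self.joint_scores(row)
        return max(scores, key=scores.get)
```

Because every quantity is a simple count, the model above is built in one pass over the data; the abstract's claim is that attribute selection can be added with only one further pass.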

Author information

Authors and Affiliations

College of Information Science, Nanjing Audit University, Nanjing, China
Shenglei Chen
Faculty of Information Technology, Monash University, VIC, 3800, Australia
Shenglei Chen, Ana M. Martinez & Geoffrey I. Webb

Editor information

Editors and Affiliations

National Cheng Kung University, Tainan, Taiwan, R.O.C.
Vincent S. Tseng & Hung-Yu Kao
Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
Tu Bao Ho
Nanjing University, China
Zhi-Hua Zhou
National Chengchi University, Taipei, Taiwan, R.O.C.
Arbee L. P. Chen

About this paper

Cite this paper

Chen, S., Martinez, A.M., Webb, G.I. (2014). Highly Scalable Attribute Selection for Averaged One-Dependence Estimators. In: Tseng, V.S., Ho, T.B., Zhou, ZH., Chen, A.L.P., Kao, HY. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2014. Lecture Notes in Computer Science, vol 8444. Springer, Cham. https://doi.org/10.1007/978-3-319-06605-9_8

DOI: https://doi.org/10.1007/978-3-319-06605-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06604-2
Online ISBN: 978-3-319-06605-9
eBook Packages: Computer Science, Computer Science (R0)
