Abstract
In many application domains, there is a need for learning algorithms that can effectively exploit attribute value taxonomies (AVT), hierarchical groupings of attribute values, to learn compact, comprehensible and accurate classifiers from data, including data that are partially specified. This paper describes AVT-NBL, a natural generalization of the naïve Bayes learner (NBL), for learning classifiers from AVT and data. Our experimental results show that AVT-NBL generates classifiers that are substantially more compact and more accurate than those produced by NBL on a broad range of data sets with different percentages of partially specified values. We also show that AVT-NBL uses training data more efficiently: its classifiers outperform those produced by NBL while using substantially fewer training examples.
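To make the two central notions concrete, the following is a minimal Python sketch of an attribute value taxonomy and of class-conditional counts collected over it. It is not the AVT-NBL algorithm itself, which the abstract does not spell out. The colour taxonomy, the toy examples, the function names, the use of a "cut" (a chosen set of taxonomy nodes at which the attribute is modelled) and the Laplace smoothing are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical taxonomy for a single "color" attribute, stored as child -> parent.
PARENT = {
    "red": "warm", "orange": "warm",
    "blue": "cool", "green": "cool",
    "warm": "color", "cool": "color",
}


def ancestors(value):
    """Return the value followed by all of its ancestors up to the root."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def class_conditional_counts(examples):
    """Count, per class label, how often each taxonomy node covers an observation.

    An observation at any level (leaf or internal node) is credited to that
    node and to every ancestor. Nothing is pushed down to descendants, so a
    partially specified value such as "warm" adds no count to "red" or "orange".
    """
    counts = defaultdict(lambda: defaultdict(int))
    for value, label in examples:
        for node in ancestors(value):
            counts[label][node] += 1
    return counts


def conditional_probs(counts, label, cut):
    """Estimate P(node | label) for the nodes in `cut`, with Laplace smoothing."""
    total = sum(counts[label][node] for node in cut)
    return {node: (counts[label][node] + 1) / (total + len(cut)) for node in cut}


# Toy data (assumed): ("warm", "+") is a partially specified observation.
examples = [
    ("red", "+"), ("orange", "+"), ("warm", "+"),
    ("blue", "-"), ("green", "-"), ("red", "-"),
]
counts = class_conditional_counts(examples)
probs = conditional_probs(counts, "+", cut=["warm", "cool"])
```

Modelling the attribute at the cut `["warm", "cool"]` needs two parameters per class instead of four at the leaves, which is the sense in which a taxonomy can yield a more compact classifier. The partially specified observation still contributes to the estimate at that level.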
Additional information
This paper is an extended version of a paper published in the 4th IEEE International Conference on Data Mining, 2004.
Jun Zhang is currently a PhD candidate in computer science at Iowa State University, USA. His research interests include machine learning, data mining, ontology-driven learning, computational biology and bioinformatics, evolutionary computation and neural networks. From 1993 to 2000, he was a lecturer in computer engineering at the University of Science and Technology of China. Jun Zhang received an MS degree in computer engineering from the University of Science and Technology of China in 1993 and a BS in computer science from Hefei University of Technology, China, in 1990.
Dae-Ki Kang is a PhD student in computer science at Iowa State University. His research interests include ontology learning, relational learning, and security informatics. Prior to joining Iowa State, he worked at a Bay Area startup company and at the Electronics and Telecommunications Research Institute in South Korea. He received a master's degree in computer science from Sogang University in 1994 and a bachelor of engineering (BE) degree in computer science and engineering from Hanyang University in Ansan in 1992.
Adrian Silvescu is a PhD candidate in computer science at Iowa State University. His research interests include machine learning, artificial intelligence, bioinformatics and complex adaptive systems. He received an MS degree in theoretical computer science from the University of Bucharest, Romania, in 1997, and a BS in computer science from the University of Bucharest in 1996.
Vasant Honavar received a BE in electronics engineering from Bangalore University, India, an MS in electrical and computer engineering from Drexel University and an MS and a PhD in computer science from the University of Wisconsin, Madison. He founded the Artificial Intelligence Research Laboratory at Iowa State University (ISU) in 1990 and has been its director since then. He is currently a professor of computer science and of bioinformatics and computational biology at ISU. He directs the Computational Intelligence, Learning & Discovery Program, which he founded in 2004. Honavar's research and teaching interests include artificial intelligence, machine learning, bioinformatics, computational molecular biology, intelligent agents and multiagent systems, collaborative information systems, semantic web, environmental informatics, security informatics, social informatics, neural computation, systems biology, data mining, knowledge discovery and visualization. Honavar has published over 150 research articles in refereed journals, conferences and books and has coedited six books. Honavar is a coeditor-in-chief of the Journal of Cognitive Systems Research and a member of the Editorial Board of the Machine Learning Journal and the International Journal of Computer and Information Security. Prof. Honavar is a member of the Association for Computing Machinery (ACM), American Association for Artificial Intelligence (AAAI), Institute of Electrical and Electronics Engineers (IEEE), International Society for Computational Biology (ISCB), the New York Academy of Sciences, the American Association for the Advancement of Science (AAAS) and the American Medical Informatics Association (AMIA).
Cite this article
Zhang, J., Kang, DK., Silvescu, A. et al. Learning accurate and concise naïve Bayes classifiers from attribute value taxonomies and data. Knowl Inf Syst 9, 157–179 (2006). https://doi.org/10.1007/s10115-005-0211-z
DOI: https://doi.org/10.1007/s10115-005-0211-z