RNBL-MN: A Recursive Naive Bayes Learner for Sequence Classification

  • Dae-Ki Kang
  • Adrian Silvescu
  • Vasant Honavar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)


The Naive Bayes (NB) classifier relies on the assumption that the instances in each class can be described by a single generative model. This assumption can be restrictive in many real-world classification tasks. We describe RNBL-MN, which relaxes this assumption by constructing a tree of Naive Bayes classifiers for sequence classification, where each individual NB classifier in the tree is based on a multinomial event model (one for each class at each node in the tree). In our experiments on protein sequence and text classification tasks, we observe that RNBL-MN substantially outperforms the NB classifier. Furthermore, our experiments show that RNBL-MN outperforms the C4.5 decision tree learner (using tests on sequence composition statistics as the splitting criterion) and yields accuracies that are comparable to those of support vector machines (SVMs) using similar information.
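To make the idea of a tree of multinomial NB classifiers concrete, the following is a minimal Python sketch. The abstract does not say how the tree is grown or when recursion stops, so several choices here are assumptions made purely for illustration and should not be read as the authors' algorithm: routing instances to a child by the class the node predicts, the fixed `max_depth` cutoff, Laplace smoothing via `alpha`, the k-gram representation, and all class and function names.

```python
import math
from collections import Counter


def kgrams(seq, k=1):
    """Count overlapping k-grams in a sequence (bag-of-k-grams view)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


class MultinomialNB:
    """Multinomial event model: one k-gram distribution per class."""

    def __init__(self, k=1, alpha=1.0):
        self.k, self.alpha = k, alpha  # alpha: Laplace smoothing (assumed)

    def fit(self, seqs, labels):
        self.classes = sorted(set(labels))
        self.counts = {c: Counter() for c in self.classes}
        for s, y in zip(seqs, labels):
            self.counts[y].update(kgrams(s, self.k))
        self.vocab = set().union(*self.counts.values())
        per_class = Counter(labels)
        self.log_prior = {c: math.log(per_class[c] / len(labels))
                          for c in self.classes}
        self.total = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def log_score(self, seq, c):
        denom = self.total[c] + self.alpha * len(self.vocab)
        score = self.log_prior[c]
        for g, n in kgrams(seq, self.k).items():
            if g in self.vocab:  # k-grams unseen in training are ignored
                score += n * math.log((self.counts[c][g] + self.alpha) / denom)
        return score

    def predict(self, seq):
        return max(self.classes, key=lambda c: self.log_score(seq, c))


class NBTreeNode:
    """Hypothetical recursive wrapper: each node holds its own NB model."""

    def __init__(self, seqs, labels, depth=0, max_depth=2, k=1):
        self.nb = MultinomialNB(k).fit(seqs, labels)
        self.children = {}
        if depth >= max_depth:  # assumed stopping rule, not from the paper
            return
        # Assumed split: group training instances by this node's prediction.
        groups = {}
        for s, y in zip(seqs, labels):
            groups.setdefault(self.nb.predict(s), []).append((s, y))
        for pred, items in groups.items():
            ys = [y for _, y in items]
            # Recurse only on subsets that are mixed and strictly smaller.
            if len(set(ys)) > 1 and len(items) < len(seqs):
                self.children[pred] = NBTreeNode(
                    [s for s, _ in items], ys, depth + 1, max_depth, k)

    def predict(self, seq):
        pred = self.nb.predict(seq)
        child = self.children.get(pred)
        return child.predict(seq) if child else pred
```

A node whose predicted-class subset still contains mixed labels gets a child trained only on that subset, so a class that is poorly described by one generative model can be covered by several models along different branches. With `max_depth=0` the tree reduces to a single NB classifier.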


Keywords: Support Vector Machine; Class Label; Nominal Attribute; Splitting Criterion; Decision Tree Learner





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Dae-Ki Kang (1)
  • Adrian Silvescu (1)
  • Vasant Honavar (1)

  1. Artificial Intelligence Research Laboratory, Department of Computer Science, Iowa State University, Ames, USA
