A discriminative model selection approach and its application to text classification
Classification is one of the fundamental problems in data mining, in which a classification algorithm attempts to construct a classifier from a given set of training instances with class labels. It is well known that some classification algorithms perform very well on some domains and poorly on others. For example, naive Bayes (NB) performs well on some domains but poorly on others that involve correlated features. C4.5, on the other hand, typically works better than NB on such domains. To integrate their advantages and avoid their disadvantages, many hybrid model approaches, such as model insertion and model combination, have been proposed. In this paper, we take a novel view and propose a discriminative model selection (DMS) approach. DMS discriminatively chooses different single models for different test instances and retains the interpretability of single models. Empirical studies on a collection of 36 classification problems from the University of California at Irvine (UCI) repository show that our discriminative model selection approach outperforms single models, model insertion approaches, and model combination approaches. In addition, we apply the proposed discriminative model selection approach to some state-of-the-art naive Bayes text classifiers and improve their performance as well.
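The per-instance selection idea can be sketched in a few lines. This is a minimal illustration, not the paper's method: scikit-learn's `GaussianNB` and `DecisionTreeClassifier` stand in for NB and C4.5, and the selection rule used here, choosing whichever single model reports the higher predicted class probability for a given test instance, is an assumed stand-in, since the abstract does not specify the actual DMS criterion.

```python
# Illustrative sketch of per-instance discriminative model selection (DMS).
# Assumptions: DecisionTreeClassifier approximates C4.5; the confidence-based
# selection rule below is a stand-in for the paper's (unspecified) criterion.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Train both single models on the same training set.
nb = GaussianNB().fit(X_tr, y_tr)
dt = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

nb_proba = nb.predict_proba(X_te)
dt_proba = dt.predict_proba(X_te)

# For each test instance, keep the prediction of whichever single model
# is more confident; each prediction still comes from one interpretable model.
use_nb = nb_proba.max(axis=1) >= dt_proba.max(axis=1)
dms_pred = np.where(use_nb, nb_proba.argmax(axis=1), dt_proba.argmax(axis=1))

print("DMS accuracy:", (dms_pred == y_te).mean())
```

Because every prediction is produced by exactly one of the two base models, this scheme retains the interpretability of single models, in contrast to combination schemes that average over models.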
Keywords: Naive Bayes · C4.5 · Discriminative model selection · Text classification
The work was partially supported by the National Natural Science Foundation of China (61203287), the Program for New Century Excellent Talents in University (NCET-12-0953), the Chenguang Program of Science and Technology of Wuhan (2015070404010202), and the Open Research Project of Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP201601).
Compliance with ethical standards
Conflict of interest
We confirm that there is no conflict of interest in the submission, and the manuscript has been approved by all authors for publication.