Abstract
Recent studies in classification have proposed ways of exploiting the association rule mining paradigm, and have presented extensive experiments showing their techniques to be both efficient and accurate. However, existing studies in this paradigm either provide no theoretical justification for their approaches or assume independence between some parameters. In this work, we propose a new classifier based on association rule mining. Our classifier rests on the maximum entropy principle for its statistical basis and does not assume any independence that is not inferred from the given dataset. We use the classical generalized iterative scaling (GIS) algorithm to build our classification model. We show that GIS fails in some cases when itemsets are used as features, and we provide modifications to rectify this problem. We show that the modified GIS runs much faster than the original. We also describe techniques to make GIS tractable for large feature spaces: we provide a new technique to divide a feature space into independent clusters, each of which can be handled separately. Our experimental results show that our classifier is generally more accurate than existing classification methods.
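The abstract refers to the classical GIS procedure of Darroch and Ratcliff. For readers unfamiliar with it, the following is a minimal sketch of textbook GIS for a conditional maximum-entropy model over binary (itemset, class) indicator features; it is not the authors' modified algorithm. The function name, the toy feature encoding, and the epsilon guard against zero expectations are illustrative assumptions, not details taken from the paper.

import math

def gis(samples, features, num_classes, iters=100, eps=1e-9):
    # samples  : list of (x, y) pairs with y in range(num_classes)
    # features : binary indicator functions f(x, y) -> 0 or 1,
    #            e.g. "itemset I is a subset of x and y == c"
    n, N = len(features), len(samples)

    # GIS constant: an upper bound on the number of active features
    # for any (x, y). Exact GIS adds a slack feature so the count is
    # exactly C; here 1/C simply acts as a conservative step size.
    C = max(1, max(sum(f(x, y) for f in features)
                   for x, _ in samples for y in range(num_classes)))

    lam = [0.0] * n  # one log-linear weight per feature

    def posteriors(x):
        # p(y | x) proportional to exp(sum_i lam_i * f_i(x, y))
        s = [sum(l * f(x, y) for l, f in zip(lam, features))
             for y in range(num_classes)]
        z = sum(math.exp(v) for v in s)
        return [math.exp(v) / z for v in s]

    # empirical expectation of each feature under the training data
    emp = [sum(f(x, y) for x, y in samples) / N for f in features]

    for _ in range(iters):
        # model expectation of each feature under the current weights
        mod = [0.0] * n
        for x, _ in samples:
            p = posteriors(x)
            for y in range(num_classes):
                for i, f in enumerate(features):
                    mod[i] += p[y] * f(x, y) / N
        # multiplicative GIS update in log space; eps avoids the
        # undefined case where an expectation is zero
        for i in range(n):
            lam[i] += math.log((emp[i] + eps) / (mod[i] + eps)) / C
    return lam

A toy invocation, with features firing when a hypothetical itemset appears in x together with a given class label:

samples = [({'a'}, 0), ({'a', 'b'}, 0), ({'b'}, 1)]
itemsets = [{'a'}, {'b'}, {'a', 'b'}]
features = [lambda x, y, I=I, c=c: int(I <= x and y == c)
            for I in itemsets for c in (0, 1)]
weights = gis(samples, features, num_classes=2)

Each iteration scales every weight by the ratio of empirical to model expectation (in log space), damped by 1/C. Note that the raw update is undefined when a feature's model expectation is zero; whether this is exactly the itemset-induced failure the authors fix is not stated in the abstract.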
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thonangi, R., Pudi, V. (2005). ACME: An Associative Classifier Based on Maximum Entropy Principle. In: Jain, S., Simon, H.U., Tomita, E. (eds.) Algorithmic Learning Theory. ALT 2005. Lecture Notes in Computer Science, vol. 3734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564089_11
DOI: https://doi.org/10.1007/11564089_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29242-5
Online ISBN: 978-3-540-31696-1