Using Association Rules for Classification from Databases Having Class Label Ambiguities: A Belief Theoretic Method

  • S. P. Subasingha
  • J. Zhang
  • K. Premaratne
  • M.-L. Shyu
  • M. Kubat
  • K. K. R. G. K. Hewawasam
Part of the Studies in Computational Intelligence book series (SCI, volume 118)

Summary

This chapter introduces a belief theoretic method for classification from databases that have class label ambiguities. The method relies on a set of association rules extracted from such a database. We assume that a training data set with an adequate number of pre-classified instances is available, where each instance is assigned an integer class label. A modified association rule mining (ARM) technique extracts the interesting rules from the training data set, and a belief theoretic classifier built on the extracted rules classifies the incoming feature vectors. The ambiguity modelling capability of belief theory enables the classifier to perform better in the presence of class label ambiguities. The classifier can also cope with unbalanced or highly skewed training data by ensuring that an approximately equal number of rules is generated for each class. These capabilities make the classifier well suited to applications where (1) different experts may hold conflicting opinions about the class label to be assigned to a specific training data instance; and (2) the majority of the training data instances are likely to represent only a few classes, giving rise to highly skewed databases. The proposed classifier should therefore be useful in security monitoring and threat classification environments, where conflicting expert opinions about the threat level are common and only a few training data instances would be considered to pose a heightened threat level.

Several experiments are conducted to evaluate the proposed classifier. They use several databases from the UCI data repository, together with data sets collected from the airport terminal simulation platform developed at the Distributed Decision Environments (DDE) Laboratory at the Department of Electrical and Computer Engineering, University of Miami. The experimental results show that, while the proposed classifier's performance is comparable to that of some existing classifiers when the databases have no class label ambiguities, it provides superior classification accuracy and better efficiency when class label ambiguities are present.
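
To make the idea of an ambiguous class label concrete, the following is a minimal generic sketch of how belief theory can represent and fuse conflicting expert opinions. It is not the chapter's classifier: it assumes that each opinion is a mass function over sets of integer class labels, that opinions are fused with Dempster's rule of combination, and that the final decision uses the pignistic probability. The function names `combine` and `pignistic`, the three-class frame, and all mass values are hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    A mass function maps a frozenset of class labels to a mass in [0, 1].
    (Illustrative assumption: the chapter may fuse evidence differently.)
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass that would land on the empty set counts as conflict.
            conflict += wa * wb
    if 1.0 - conflict <= 1e-12:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise so the remaining masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def pignistic(m):
    """Spread the mass of each focal set evenly over its member classes."""
    p = {}
    for s, w in m.items():
        for c in s:
            p[c] = p.get(c, 0.0) + w / len(s)
    return p


# Hypothetical opinions over the classes {1, 2, 3}.
# Expert A cannot tell class 1 from class 2; expert B leans towards class 2.
everything = frozenset({1, 2, 3})
expert_a = {frozenset({1, 2}): 0.8, everything: 0.2}
expert_b = {frozenset({2}): 0.6, everything: 0.4}

fused = combine(expert_a, expert_b)
bet = pignistic(fused)
decision = max(bet, key=bet.get)
print(decision, bet)
```

Assigning mass to the set {1, 2} rather than splitting it between the two classes is what lets an ambiguous label be recorded without forcing a premature choice; the choice is made only at decision time.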

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • S. P. Subasingha (1)
  • J. Zhang (2)
  • K. Premaratne (1)
  • M.-L. Shyu (1)
  • M. Kubat (1)
  • K. K. R. G. K. Hewawasam (1)

  1. Department of Electrical and Computer Engineering, University of Miami, Coral Gables, USA
  2. Hemispheric Center for Environmental Technology (HCET), Florida International University, Miami, USA
