Using Association Rules for Classification from Databases Having Class Label Ambiguities: A Belief Theoretic Method

Data Mining: Foundations and Practice

Part of the book series: Studies in Computational Intelligence (SCI, volume 118)

Summary

This chapter introduces a belief theoretic method for classification from databases having class label ambiguities. It uses a set of association rules extracted from such a database. It is assumed that a training data set with an adequate number of pre-classified instances, where each instance is assigned an integer class label, is available. We use a modified association rule mining (ARM) technique to extract the interesting rules from the training data set, and a belief theoretic classifier based on the extracted rules to classify the incoming feature vectors. The ambiguity modelling capability of belief theory enables our classifier to perform better in the presence of class label ambiguities. The classifier can also address the issue of the training data set being unbalanced or highly skewed by ensuring that an approximately equal number of rules is generated for each class. These capabilities make our classifier well suited to applications where (1) different experts may have conflicting opinions about the class label to be assigned to a specific training data instance; and (2) the majority of the training data instances are likely to represent a few classes, giving rise to highly skewed databases. The proposed classifier would therefore be particularly useful in security monitoring and threat classification environments, where conflicting expert opinions about the threat level are common and only a few training data instances would be considered to pose a heightened threat level. Several experiments are conducted to evaluate the proposed classifier. These experiments use several databases from the UCI data repository and data sets collected from the airport terminal simulation platform developed at the Distributed Decision Environments (DDE) Laboratory at the Department of Electrical and Computer Engineering, University of Miami.
The experimental results show that, while the proposed classifier’s performance is comparable to some existing classifiers when the databases have no class label ambiguities, it provides superior classification accuracy and better efficiency when class label ambiguities are present.
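
The summary does not spell out how evidence from the extracted rules is fused or how the final class is chosen. As a rough illustration only, the sketch below assumes two standard belief theoretic tools, Dempster's rule of combination and the pignistic transform, and shows how an ambiguous label such as "class 1 or 2" can be carried as mass on a set of classes rather than forced onto a single class. The rule masses, class labels, and function names are hypothetical and are not taken from the chapter.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of class labels to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass that would fall on the empty set counts as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise so the remaining masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def pignistic(m):
    """Spread each focal set's mass evenly over its member labels."""
    p = {}
    for s, w in m.items():
        for label in s:
            p[label] = p.get(label, 0.0) + w / len(s)
    return p


# Hypothetical evidence from two matched rules over classes {1, 2, 3}.
# rule_a stands for a rule mined from instances labelled "1 or 2" (ambiguous).
rule_a = {frozenset({1, 2}): 0.8, frozenset({1, 2, 3}): 0.2}
rule_b = {frozenset({2}): 0.6, frozenset({1, 2, 3}): 0.4}

fused = combine(rule_a, rule_b)
probs = pignistic(fused)
predicted = max(probs, key=probs.get)
print(predicted, probs)
```

In this toy case the ambiguous rule narrows the decision to classes 1 and 2 without committing to either, and the second rule tips the fused evidence toward class 2. Whether the chapter's classifier combines rule evidence in exactly this way is an assumption of the sketch.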




Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Subasingha, S.P., Zhang, J., Premaratne, K., Shyu, M.L., Kubat, M., Hewawasam, K.K.R.G.K. (2008). Using Association Rules for Classification from Databases Having Class Label Ambiguities: A Belief Theoretic Method. In: Lin, T.Y., Xie, Y., Wasilewska, A., Liau, CJ. (eds) Data Mining: Foundations and Practice. Studies in Computational Intelligence, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78488-3_32

  • DOI: https://doi.org/10.1007/978-3-540-78488-3_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78487-6

  • Online ISBN: 978-3-540-78488-3

  • eBook Packages: Engineering, Engineering (R0)
