Using Association Rules for Classification from Databases Having Class Label Ambiguities: A Belief Theoretic Method

Subasingha, S. P.; Zhang, J.; Premaratne, K.; Shyu, M.-L.; Kubat, M.; Hewawasam, K. K. R. G. K.

doi:10.1007/978-3-540-78488-3_32

Part of the book series: Studies in Computational Intelligence (SCI, volume 118)

Summary

This chapter introduces a belief theoretic method for classification from databases having class label ambiguities. The method uses a set of association rules extracted from such a database. We assume that a training data set with an adequate number of pre-classified instances is available, where each instance is assigned an integer class label. We use a modified association rule mining (ARM) technique to extract the interesting rules from the training data set, and a belief theoretic classifier based on the extracted rules to classify the incoming feature vectors. The ambiguity modelling capability of belief theory enables our classifier to perform better in the presence of class label ambiguities. The classifier can also handle unbalanced or highly skewed training data sets by ensuring that an approximately equal number of rules is generated for each class. These capabilities make our classifier ideally suited for applications where (1) different experts may have conflicting opinions about the class label to be assigned to a specific training data instance; and (2) the majority of the training data instances are likely to represent a few classes, giving rise to highly skewed databases. The proposed classifier would therefore be extremely useful in security monitoring and threat classification environments, where conflicting expert opinions about the threat level are common and only a few training data instances would be considered to pose a heightened threat level. Several experiments are conducted to evaluate the proposed classifier. These experiments use several databases from the UCI data repository and data sets collected from the airport terminal simulation platform developed at the Distributed Decision Environments (DDE) Laboratory at the Department of Electrical and Computer Engineering, University of Miami. The experimental results show that, while the proposed classifier's performance is comparable to some existing classifiers when the databases have no class label ambiguities, it provides superior classification accuracy and better efficiency when class label ambiguities are present.
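
As a rough illustration of how belief theory can represent ambiguous class labels, the sketch below fuses the evidence from two matching rules with Dempster's rule of combination and then picks a class with the pignistic transform. It is a generic Dempster-Shafer example, not the chapter's classifier, whose details are not given in the summary. The function names `combine` and `pignistic`, the rules `rule_a` and `rule_b`, and all mass values are assumptions made up for this example.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of class labels to a mass in [0, 1].
    A non-singleton set such as {1, 2} expresses an ambiguous label.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Mass that would fall on the empty set counts as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {s: w / norm for s, w in combined.items()}


def pignistic(m):
    """Spread each set's mass evenly over its labels to get probabilities."""
    probs = {}
    for labels, w in m.items():
        for label in labels:
            probs[label] = probs.get(label, 0.0) + w / len(labels)
    return probs


# Hypothetical evidence from two rules over the class labels {1, 2, 3}.
ALL = frozenset({1, 2, 3})
rule_a = {frozenset({1}): 0.6, frozenset({1, 2}): 0.3, ALL: 0.1}
rule_b = {frozenset({2}): 0.5, frozenset({1, 2}): 0.4, ALL: 0.1}

fused = combine(rule_a, rule_b)
probs = pignistic(fused)
decision = max(probs, key=probs.get)
print(decision, probs)
```

Mass assigned to a set such as {1, 2} stays uncommitted between the two labels until other evidence narrows it down, which is one way an expert disagreement about a training instance could be encoded without forcing a single label.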

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL, USA
S. P. Subasingha, K. Premaratne, M.-L. Shyu, M. Kubat & K. K. R. G. K. Hewawasam
Hemispheric Center for Environmental Technology (HCET), Florida International University, Miami, FL, USA
J. Zhang

Editor information

Editors and Affiliations

Department of Computer Science, San Jose State University, San Jose, CA, 95192, USA
Tsau Young Lin
Department of Computer Science and Information Systems, Kennesaw State University, Building 11, Room 3060 1000 Chastain Road, Kennesaw, GA, 30144, USA
Ying Xie
Department of Computer Science, The University at Stony Brook, Stony Brook, New York, 11794-4400, USA
Anita Wasilewska
Institute of Information Science, Academia Sinica, No 128, Academia Road, Section 2 Nankang, Taipei, 11529, Taiwan
Churn-Jung Liau

About this chapter

Cite this chapter

Subasingha, S.P., Zhang, J., Premaratne, K., Shyu, M.L., Kubat, M., Hewawasam, K.K.R.G.K. (2008). Using Association Rules for Classification from Databases Having Class Label Ambiguities: A Belief Theoretic Method. In: Lin, T.Y., Xie, Y., Wasilewska, A., Liau, CJ. (eds) Data Mining: Foundations and Practice. Studies in Computational Intelligence, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78488-3_32

DOI: https://doi.org/10.1007/978-3-540-78488-3_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78487-6
Online ISBN: 978-3-540-78488-3
eBook Packages: Engineering, Engineering (R0)
