Mining Class Association Rules on Dataset with Missing Data

Nguyen, Hoang-Lam; Nguyen, Loan T. T.; Kozierkiewicz, Adrianna

doi:10.1007/978-3-030-73280-6_9

Hoang-Lam Nguyen, Loan T. T. Nguyen & Adrianna Kozierkiewicz

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12672)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

Many real-world datasets contain missing values, which reduce the efficiency of many classification algorithms. Missing values are often unavoidable, arising from causes such as network problems or faulty physical devices, and some classification algorithms cannot work properly with incomplete datasets. Handling missing values is therefore crucial. Imputation methods have been shown to be effective in handling missing data and can thus significantly improve classification accuracy. There are two types of imputation methods, single and multiple, each with its pros and cons: single imputation can lead to low accuracy, while multiple imputation is time-consuming. One high-accuracy approach considered in this paper is classification based on class association rules (CARs), which has been shown to yield higher accuracy than other classification methods. However, there has been no investigation of how to mine CARs from incomplete datasets. The goal of this work is to develop an effective imputation method for mining CARs on incomplete datasets. To show the impact of each imputation method, two cases of imputation are applied and compared in experiments.
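To make the two ingredients of the abstract concrete, the following is a minimal sketch and not the authors' method: it fills missing values by single imputation with the per-attribute mode, then enumerates class association rules of the form itemset -> class that meet a minimum support and confidence. The toy records, the function names `impute_mode` and `mine_cars`, and the threshold values are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy records: (attribute values, class label); None marks a missing value.
# The data and the thresholds below are invented for illustration only.
ROWS = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": None}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": None, "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
]


def impute_mode(rows):
    """Single imputation: replace each missing value with the attribute's mode."""
    attributes = sorted({a for attrs, _ in rows for a in attrs})
    modes = {}
    for a in attributes:
        observed = Counter(
            attrs[a] for attrs, _ in rows if attrs.get(a) is not None
        )
        modes[a] = observed.most_common(1)[0][0]
    imputed = []
    for attrs, label in rows:
        filled = {
            a: attrs[a] if attrs.get(a) is not None else modes[a]
            for a in attributes
        }
        imputed.append((filled, label))
    return imputed


def mine_cars(rows, min_support=0.3, min_confidence=0.8):
    """Brute-force search for rules itemset -> class on a complete dataset."""
    n = len(rows)
    itemset_counts = Counter()  # how many rows contain the itemset
    rule_counts = Counter()     # how many rows contain the itemset and the class
    for attrs, label in rows:
        items = sorted(attrs.items())
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                itemset_counts[itemset] += 1
                rule_counts[itemset, label] += 1
    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    # Highest confidence first, then highest support, then itemset order.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0]))


for itemset, label, support, confidence in mine_cars(impute_mode(ROWS)):
    condition = " AND ".join(f"{a}={v}" for a, v in itemset)
    print(f"{condition} -> {label} (sup={support:.2f}, conf={confidence:.2f})")
```

The exhaustive enumeration is only workable for a handful of attributes; it stands in for the dedicated CAR mining algorithms the paper builds on. Swapping `impute_mode` for another imputation strategy and comparing the resulting rule sets mirrors, in spirit, the comparison of imputation cases the abstract describes.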

Acknowledgements

This research is funded by International University, VNU-HCM under grant number SV2019-IT-03.

Author information

Authors and Affiliations

School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam
Hoang-Lam Nguyen & Loan T. T. Nguyen
Vietnam National University, Ho Chi Minh City, Vietnam
Hoang-Lam Nguyen & Loan T. T. Nguyen
Faculty of Computer Science and Management, Wroclaw University of Science and Technology, Wrocław, Poland
Adrianna Kozierkiewicz

Corresponding author

Correspondence to Loan T. T. Nguyen.

Editor information

Editors and Affiliations

Wrocław University of Science and Technology, Wrocław, Poland
Ngoc Thanh Nguyen
King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand
Suphamit Chittayasothorn
Nanyang Technological University, Singapore, Singapore
Dusit Niyato
Wrocław University of Science and Technology, Wrocław, Poland
Bogdan Trawiński

About this paper

Cite this paper

Nguyen, H.-L., Nguyen, L.T.T., Kozierkiewicz, A. (2021). Mining Class Association Rules on Dataset with Missing Data. In: Nguyen, N.T., Chittayasothorn, S., Niyato, D., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2021. Lecture Notes in Computer Science, vol 12672. Springer, Cham. https://doi.org/10.1007/978-3-030-73280-6_9


DOI: https://doi.org/10.1007/978-3-030-73280-6_9
Published: 05 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73279-0
Online ISBN: 978-3-030-73280-6
eBook Packages: Computer Science; Computer Science (R0)
