Mining Class Association Rules on Dataset with Missing Data

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12672)

Abstract

Many real-world datasets contain missing values, which affect the efficiency of many classification algorithms. Missing values are often unavoidable, for reasons such as network problems and physical device issues. Some classification algorithms cannot work properly with incomplete datasets, so it is crucial to handle missing values. Imputation methods have been proven effective in handling missing data and thus significantly improve classification accuracy. There are two types of imputation methods, single imputation and multiple imputation, and both have pros and cons: single imputation can lead to low accuracy, while multiple imputation is time-consuming. One high-accuracy approach used in this paper is Classification based on Association Rules (CARs), which has been proven to yield higher accuracy than other methods. However, how to mine CARs from incomplete datasets has not been investigated. The goal of this work is to develop an effective imputation method for mining CARs on incomplete datasets. To show the impact of each imputation method, two cases of imputation are applied and compared in experiments.
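
To make the pipeline described in the abstract concrete, the following is a minimal sketch of single imputation followed by class association rule mining. It is an illustration only, not the method of the paper: the function names (impute_mode, mine_cars), the use of None as the missing-value marker, the most-frequent-value imputation, the brute-force rule enumeration, and the toy data are all assumptions made for this example. It also assumes every column has at least one observed value.

```python
from collections import Counter
from itertools import combinations

MISSING = None  # assumed marker for a missing attribute value


def impute_mode(rows):
    """Single imputation (illustrative): fill each missing value with the
    most frequent observed value of its column."""
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not MISSING]
        modes.append(Counter(observed).most_common(1)[0][0])
    return [tuple(modes[j] if v is MISSING else v for j, v in enumerate(r))
            for r in rows]


def mine_cars(rows, labels, min_sup, min_conf, max_len=2):
    """Brute-force CAR mining (illustrative): enumerate itemsets of
    (column, value) pairs and keep rules `itemset -> class` whose support
    and confidence reach the given thresholds."""
    n = len(rows)
    itemset_count = Counter()   # how many rows contain the itemset
    rule_count = Counter()      # how many rows contain itemset and class
    for row, label in zip(rows, labels):
        items = list(enumerate(row))
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                itemset_count[itemset] += 1
                rule_count[(itemset, label)] += 1
    rules = []
    for (itemset, label), cnt in rule_count.items():
        support = cnt / n
        confidence = cnt / itemset_count[itemset]
        if support >= min_sup and confidence >= min_conf:
            rules.append((itemset, label, support, confidence))
    return rules


# Hypothetical toy dataset: two categorical attributes and a class label.
rows = [
    ("sunny", "high"),
    ("sunny", None),
    ("rainy", "high"),
    (None, "low"),
]
labels = ["no", "no", "yes", "yes"]

imputed = impute_mode(rows)
rules = mine_cars(imputed, labels, min_sup=0.5, min_conf=0.6)
for itemset, label, support, confidence in rules:
    print(itemset, "->", label, f"sup={support:.2f}", f"conf={confidence:.2f}")
```

A multiple-imputation variant would instead generate several completed copies of the dataset and combine the results, which is where the extra running time mentioned in the abstract comes from.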

Acknowledgements

This research is funded by International University, VNU-HCM under grant number SV2019-IT-03.

Author information

Corresponding author

Correspondence to Loan T. T. Nguyen.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Nguyen, HL., Nguyen, L.T.T., Kozierkiewicz, A. (2021). Mining Class Association Rules on Dataset with Missing Data. In: Nguyen, N.T., Chittayasothorn, S., Niyato, D., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2021. Lecture Notes in Computer Science, vol 12672. Springer, Cham. https://doi.org/10.1007/978-3-030-73280-6_9

  • DOI: https://doi.org/10.1007/978-3-030-73280-6_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-73279-0

  • Online ISBN: 978-3-030-73280-6

  • eBook Packages: Computer Science, Computer Science (R0)
