Complexity of Rule Sets in Mining Incomplete Data Using Characteristic Sets and Generalized Maximal Consistent Blocks

Clark, Patrick G.; Gao, Cheng; Grzymala-Busse, Jerzy W.; Mroczek, Teresa; Niemiec, Rafal

doi:10.1007/978-3-319-92639-1_8

Patrick G. Clark²⁰,
Cheng Gao²⁰,
Jerzy W. Grzymala-Busse^20,21,
Teresa Mroczek²¹ &
…
Rafal Niemiec²¹

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 10870))

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

2415 Accesses
1 Citations

Abstract

In this paper, missing attribute values in incomplete data sets have two possible interpretations, lost values and “do not care” conditions. For rule induction we use characteristic sets and generalized maximal consistent blocks. Therefore we apply four different approaches for data mining. As follows from our previous experiments, where we used an error rate evaluated by ten-fold cross validation as the main criterion of quality, no approach is universally the best. Therefore we decided to compare our four approaches using complexity of rule sets induced from incomplete data sets. We show that the cardinality of rule sets is always smaller for incomplete data sets with “do not care” conditions. Thus the choice between interpretations of missing attribute values is more important than the choice between characteristic sets and generalized maximal consistent blocks.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Clark, P.G., Gao, C., Grzymala-Busse, J.W., Mroczek, T.: Characteristic sets and generalized maximal consistent blocks in mining incomplete data. In: Polkowski, L., Yao, Y., Artiemjew, P., Ciucci, D., Liu, D., Ślęzak, D., Zielosko, B. (eds.) IJCRS 2017. LNCS (LNAI), vol. 10313, pp. 477–486. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-60837-2_39
Chapter Google Scholar
Clark, P.G., Grzymala-Busse, J.W.: Experiments on probabilistic approximations. In: Proceedings of the 2011 IEEE International Conference on Granular Computing, pp. 144–149 (2011)
Google Scholar
Clark, P.G., Grzymala-Busse, J.W.: Experiments using three probabilistic approximations for rule induction from incomplete data sets. In: Proceedings of the MCCSIS 2012, IADIS European Conference on Data Mining ECDM 2012, pp. 72–78 (2012)
Google Scholar
Grzymala-Busse, J.W.: LERS-a system for learning from examples based on rough sets. In: Slowinski, R. (ed.) Intelligent Decision Support Theory and Decision Library, vol. 11, pp. 3–18. Kluwer Academic Publishers, Dordrecht (1992). https://doi.org/10.1007/978-94-015-7975-9_1
Chapter Google Scholar
Grzymala-Busse, J.W.: A new version of the rule induction system LERS. Fundam. Inf. 31, 27–39 (1997)
MATH Google Scholar
Grzymala-Busse, J.W.: MLEM2: a new algorithm for rule induction from imperfect data. In: Proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, pp. 243–250 (2002)
Google Scholar
Grzymala-Busse, J.W.: Rough set strategies to data with missing attribute values. In: Notes of the Workshop on Foundations and New Directions of Data Mining, in Conjunction with the Third International Conference on Data Mining, pp. 56–63 (2003)
Google Scholar
Grzymała-Busse, J.W.: Generalized parameterized approximations. In: Yao, J.T., Ramanna, S., Wang, G., Suraj, Z. (eds.) RSKT 2011. LNCS (LNAI), vol. 6954, pp. 136–145. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24425-4_20
Chapter Google Scholar
Grzymala-Busse, J.W., Ziarko, W.: Data mining based on rough sets. In: Wang, J. (ed.) Data Mining: Opportunities and Challenges, pp. 142–173. Idea Group Publishing, Hershey (2003)
Chapter Google Scholar
Leung, Y., Li, D.: Maximal consistent block technique for rule acquisition in incomplete information systems. Inf. Sci. 153, 85–106 (2003)
Article MathSciNet Google Scholar
Leung, Y., Wu, W., Zhang, W.: Knowledge acquisition in incomplete information systems: a rough set approach. Eur. J. Oper. Res. 168, 164–180 (2006)
Article MathSciNet Google Scholar
Pawlak, Z., Skowron, A.: Rough sets: some extensions. Inf. Sci. 177, 28–40 (2007)
Article MathSciNet Google Scholar
Pawlak, Z., Wong, S.K.M., Ziarko, W.: Rough sets: probabilistic versus deterministic approach. Int. J. Man-Mach. Stud. 29, 81–95 (1988)
Article Google Scholar
Ślȩzak, D., Ziarko, W.: The investigation of the Bayesian rough set model. Int. J. Approx. Reason. 40, 81–91 (2005)
Article MathSciNet Google Scholar
Wong, S.K.M., Ziarko, W.: INFER–an adaptive decision support system based on the probabilistic approximate classification. In: Proceedings of the 6th International Workshop on Expert Systems and their Applications, pp. 713–726 (1986)
Google Scholar
Yao, Y.Y.: Probabilistic rough set approximations. Int. J. Approx. Reason. 49, 255–271 (2008)
Article Google Scholar
Yao, Y.Y., Wong, S.K.M.: A decision theoretic framework for approximate concepts. Int. J. Man-Mach. Stud. 37, 793–809 (1992)
Article Google Scholar
Ziarko, W.: Variable precision rough set model. J. Comput. Syst. Sci. 46(1), 39–59 (1993)
Article MathSciNet Google Scholar
Ziarko, W.: Probabilistic approach to rough sets. Int. J. Approx. Reason. 49, 272–284 (2008)
Article MathSciNet Google Scholar

Download references

Author information

Authors and Affiliations

Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS, 66045, USA
Patrick G. Clark, Cheng Gao & Jerzy W. Grzymala-Busse
Department of Expert Systems and Artificial Intelligence, University of Information Technology and Management, 35-225, Rzeszow, Poland
Jerzy W. Grzymala-Busse, Teresa Mroczek & Rafal Niemiec

Authors

Patrick G. Clark
View author publications
You can also search for this author in PubMed Google Scholar
Cheng Gao
View author publications
You can also search for this author in PubMed Google Scholar
Jerzy W. Grzymala-Busse
View author publications
You can also search for this author in PubMed Google Scholar
Teresa Mroczek
View author publications
You can also search for this author in PubMed Google Scholar
Rafal Niemiec
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jerzy W. Grzymala-Busse .

Editor information

Editors and Affiliations

Department of Mine Operating and Prospection, University of Oviedo, Oviedo, Spain
Francisco Javier de Cos Juez
Department of Computer Science, University of Oviedo, Oviedo, Spain
José Ramón Villar
Department of Computer Science, University of Oviedo, Oviedo, Spain
Enrique A. de la Cal
Department of Civil Engineering, University of Burgos, Burgos, Spain
Álvaro Herrero
University of A Coruña, A Coruña, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
José António Sáez
University of Salamanca, Salamanca, Spain
Emilio Corchado

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Clark, P.G., Gao, C., Grzymala-Busse, J.W., Mroczek, T., Niemiec, R. (2018). Complexity of Rule Sets in Mining Incomplete Data Using Characteristic Sets and Generalized Maximal Consistent Blocks. In: de Cos Juez, F., et al. Hybrid Artificial Intelligent Systems. HAIS 2018. Lecture Notes in Computer Science(), vol 10870. Springer, Cham. https://doi.org/10.1007/978-3-319-92639-1_8

Download citation

DOI: https://doi.org/10.1007/978-3-319-92639-1_8
Published: 08 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92638-4
Online ISBN: 978-3-319-92639-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics