Unified Framework for Construction of Rule Based Classification Systems

Liu, Han; Gegov, Alexander; Stahl, Frederic

doi:10.1007/978-3-319-08254-7_10

Part of the book series: Studies in Big Data (SBD, volume 8)

Abstract

Automatic generation of classification rules has become an increasingly popular technique in commercial applications such as Big Data analytics, rule based expert systems and decision making systems. However, a principal problem with most methods for generating classification rules is overfitting of the training data. With Big Data, overfitting may result in a large number of complex rules, which may not only increase computational cost but also lower accuracy in predicting unseen instances. This has made it necessary to develop pruning methods for simplifying rules. In addition, once generated, classification rules are used to make predictions. For efficiency, the first rule that fires should be found as quickly as possible when searching through a rule set, so a suitable structure is required to represent the rule set effectively. In this chapter, the authors introduce a unified framework for construction of rule based classification systems consisting of three operations on Big Data: rule generation, rule simplification and rule representation. The authors also review some existing methods and techniques used for each of the three operations and highlight their limitations. They then introduce some novel methods and techniques that they have recently developed, and discuss them in comparison with existing ones with respect to efficient processing of Big Data.
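
The abstract's point about finding the first rule that fires can be illustrated with a minimal sketch. It assumes the simplest possible representation, an ordered list scanned linearly, which is a baseline and not one of the representation structures the chapter itself discusses. The `Rule` type, the `first_firing_rule` function and the toy weather attributes are hypothetical names chosen for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a rule is (conditions, predicted class),
# where conditions map an attribute name to the value it must take.
Rule = Tuple[Dict[str, str], str]


def first_firing_rule(
    rules: List[Rule], instance: Dict[str, str]
) -> Optional[Tuple[int, str]]:
    """Scan the rule list in order and return (index, class) of the first
    rule whose conditions are all satisfied by the instance, or None."""
    for index, (conditions, label) in enumerate(rules):
        if all(instance.get(attr) == value for attr, value in conditions.items()):
            return index, label
    return None


# Toy rule set (illustrative data, not taken from the chapter).
rules: List[Rule] = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "overcast"}, "yes"),
    ({"windy": "false"}, "yes"),
]

result = first_firing_rule(rules, {"outlook": "overcast", "windy": "true"})
```

In the worst case this linear scan tests every rule in the set, so prediction cost grows with the number and length of the rules, which is why the abstract treats both rule simplification and the choice of representation as efficiency concerns.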

Author information

Authors and Affiliations

School of Computing, University of Portsmouth, Buckingham Building, Lion Terrace, Portsmouth, PO1 3HE, UK
Han Liu & Alexander Gegov
School of Systems Engineering, University of Reading, White Knights, 225, Reading, RG6 6AY, UK
Frederic Stahl

Corresponding author

Correspondence to Han Liu.

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada
Witold Pedrycz
Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Shyi-Ming Chen

About this chapter

Cite this chapter

Liu, H., Gegov, A., Stahl, F. (2015). Unified Framework for Construction of Rule Based Classification Systems. In: Pedrycz, W., Chen, SM. (eds) Information Granularity, Big Data, and Computational Intelligence. Studies in Big Data, vol 8. Springer, Cham. https://doi.org/10.1007/978-3-319-08254-7_10

DOI: https://doi.org/10.1007/978-3-319-08254-7_10
Published: 15 July 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08253-0
Online ISBN: 978-3-319-08254-7
eBook Packages: Engineering, Engineering (R0)
