Unified Framework for Construction of Rule Based Classification Systems

  • Chapter
Information Granularity, Big Data, and Computational Intelligence

Part of the book series: Studies in Big Data ((SBD,volume 8))

Abstract

Automatic generation of classification rules has become an increasingly popular technique in commercial applications such as Big Data analytics, rule based expert systems and decision making systems. However, a principal problem with most methods for generating classification rules is overfitting of the training data. With Big Data, overfitting may result in a large number of complex rules, which not only increases computational cost but also lowers accuracy in predicting unseen instances. This has made it necessary to develop pruning methods that simplify rules. In addition, once generated, classification rules are used to make predictions. For efficiency, the first rule that fires should be found as quickly as possible when searching through the rule set, so a suitable structure is required to represent the rule set effectively. In this chapter, the authors introduce a unified framework for construction of rule based classification systems consisting of three operations on Big Data: rule generation, rule simplification and rule representation. The authors also review existing methods and techniques for each of the three operations and highlight their limitations. They then introduce novel methods and techniques that they have recently developed, and discuss them in comparison with the existing ones with respect to efficient processing of Big Data.
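To make the prediction step concrete, the following is a minimal sketch of "finding the first rule that fires" over a rule set stored as a plain ordered list. It is not the representation proposed in the chapter. The `Rule` class, the `classify` function and the weather-style attributes are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """A hypothetical IF-THEN rule: all conditions must match for it to fire."""
    conditions: dict  # attribute name -> required value
    label: str        # class predicted when the rule fires

    def fires(self, instance: dict) -> bool:
        # A rule fires only if every attribute-value condition holds.
        return all(instance.get(attr) == value
                   for attr, value in self.conditions.items())


def classify(rules: list, instance: dict,
             default: Optional[str] = None) -> Optional[str]:
    """Scan the rules in order and return the label of the first one that fires."""
    for rule in rules:
        if rule.fires(instance):
            return rule.label
    return default  # no rule fired


# Illustrative rule set (assumed data, not taken from the chapter).
RULES = [
    Rule({"outlook": "overcast"}, "play"),
    Rule({"outlook": "sunny", "humidity": "high"}, "stay in"),
    Rule({"outlook": "sunny"}, "play"),
]

if __name__ == "__main__":
    print(classify(RULES, {"outlook": "sunny", "humidity": "high"}))
    print(classify(RULES, {"outlook": "rainy"}, default="unknown"))
```

With a flat list like this, the worst case checks every rule before one fires or the default is returned, so prediction cost grows with the size of the rule set. That cost is the kind of inefficiency that motivates a more suitable rule representation.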



Author information

Corresponding author

Correspondence to Han Liu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Liu, H., Gegov, A., Stahl, F. (2015). Unified Framework for Construction of Rule Based Classification Systems. In: Pedrycz, W., Chen, SM. (eds) Information Granularity, Big Data, and Computational Intelligence. Studies in Big Data, vol 8. Springer, Cham. https://doi.org/10.1007/978-3-319-08254-7_10

  • DOI: https://doi.org/10.1007/978-3-319-08254-7_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08253-0

  • Online ISBN: 978-3-319-08254-7

  • eBook Packages: Engineering, Engineering (R0)
