Random Prism: An Alternative to Random Forests

  • Conference paper
  • In: Research and Development in Intelligent Systems XXVIII (SGAI 2011)

Abstract

Ensemble learning techniques generate multiple classifiers, so-called base classifiers, whose classification results are combined in order to increase the overall classification accuracy. In most ensemble classifiers the base classifiers are based on the Top Down Induction of Decision Trees (TDIDT) approach. An alternative approach to inducing rule-based classifiers is the Prism family of algorithms, which produces modular classification rules that do not necessarily fit into a decision tree structure. On noisy and large datasets, Prism rulesets achieve a classification accuracy comparable with, and sometimes higher than, that of decision tree classifiers; yet Prism still suffers from overfitting on such data. In practice ensemble techniques tend to reduce overfitting; however, no ensemble learner exists for modular classification rule inducers such as the Prism family of algorithms. This article describes the development of the first ensemble learner based on the Prism family of algorithms, which aims to enhance Prism's classification accuracy by reducing overfitting.
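To make the covering approach the abstract refers to concrete, here is a minimal Python sketch of a Prism-style rule inducer, written in the spirit of Cendrowska's PRISM: for a chosen target class it grows one rule at a time, repeatedly adding the attribute-value term with the highest precision on the instances still covered, then removes the covered instances before inducing the next rule. This is an illustrative reconstruction, not the paper's implementation; names such as prism and precision are our own, and details like tie-breaking and pre-pruning are omitted.

    from collections import Counter
    import random

    def prism(instances, target_class):
        # Induce modular rules for one class, PRISM-style.
        # instances: list of (features_dict, label) pairs.
        # Returns a list of rules; each rule maps attribute -> required value.
        remaining = list(instances)
        rules = []
        while any(label == target_class for _, label in remaining):
            rule, covered = {}, list(remaining)
            # Grow the rule term by term until it covers only the target
            # class, or no unused attribute-value terms remain.
            while any(label != target_class for _, label in covered):
                terms = {(a, v) for feats, _ in covered
                         for a, v in feats.items() if a not in rule}
                if not terms:
                    break
                def precision(term):
                    # P(target_class | attribute = value) on the covered set.
                    a, v = term
                    match = [l for f, l in covered if f.get(a) == v]
                    return sum(l == target_class for l in match) / len(match)
                a, v = max(terms, key=precision)
                rule[a] = v
                covered = [(f, l) for f, l in covered if f.get(a) == v]
            rules.append(rule)
            # Modular rules: discard the instances this rule covers and
            # induce the next rule from what is left.
            remaining = [(f, l) for f, l in remaining
                         if not all(f.get(a) == v for a, v in rule.items())]
        return rules

The ensemble learner itself is the subject of the paper, so the fragment below sketches only the generic bagging-with-majority-voting idea the abstract alludes to: each base classifier is a Prism ruleset induced on a bootstrap sample, and predictions are combined by vote. How Random Prism actually randomises and combines its base classifiers is described in the paper itself; bagged_prism, classify, and n_estimators are hypothetical names.

    def classify(model, feats, default):
        # model: dict mapping class -> list of rules; first matching rule wins.
        for cls, rules in model.items():
            for rule in rules:
                if all(feats.get(a) == v for a, v in rule.items()):
                    return cls
        return default

    def bagged_prism(instances, classes, n_estimators=10):
        # Train each base classifier on a bootstrap sample of the data,
        # then combine the base classifiers' votes at prediction time.
        models = []
        for _ in range(n_estimators):
            sample = [random.choice(instances) for _ in instances]
            models.append({c: prism(sample, c) for c in classes})
        default = Counter(l for _, l in instances).most_common(1)[0][0]
        def predict(feats):
            votes = Counter(classify(m, feats, default) for m in models)
            return votes.most_common(1)[0][0]
        return predict

For example, predict = bagged_prism(data, {"yes", "no"}) trained on weather-style records would classify predict({"outlook": "sunny", "windy": False}) by majority vote over the ten bootstrapped rulesets.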

Author information

Correspondence to Frederic Stahl.

Copyright information

© 2011 Springer-Verlag London Limited

About this paper

Cite this paper

Stahl, F., Bramer, M. (2011). Random Prism: An Alternative to Random Forests. In: Bramer, M., Petridis, M., Nolle, L. (eds) Research and Development in Intelligent Systems XXVIII. SGAI 2011. Springer, London. https://doi.org/10.1007/978-1-4471-2318-7_1

  • DOI: https://doi.org/10.1007/978-1-4471-2318-7_1

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-2317-0

  • Online ISBN: 978-1-4471-2318-7

  • eBook Packages: Computer Science (R0)
