Random Prism: An Alternative to Random Forests

Stahl, Frederic; Bramer, Max

doi:10.1007/978-1-4471-2318-7_1

Included in the following conference series:

International Conference on Innovative Techniques and Applications of Artificial Intelligence

Abstract

Ensemble learning techniques generate multiple classifiers, so-called base classifiers, whose combined classification results are used to increase the overall classification accuracy. In most ensemble classifiers, the base classifiers are based on the Top Down Induction of Decision Trees (TDIDT) approach. However, an alternative approach for the induction of rule-based classifiers is the Prism family of algorithms. Prism algorithms produce modular classification rules that do not necessarily fit into a decision tree structure. Prism classification rulesets achieve a comparable, and sometimes higher, classification accuracy than decision tree classifiers when the data is noisy and large. Yet Prism still suffers from overfitting on noisy and large datasets. In practice, ensemble techniques tend to reduce overfitting; however, no ensemble learner exists for modular classification rule inducers such as the Prism family of algorithms. This article describes the first development of an ensemble learner based on the Prism family of algorithms, with the aim of enhancing Prism’s classification accuracy by reducing overfitting.
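
The general idea of combining base classifiers can be illustrated with a minimal Python sketch. It is not the Random Prism method itself, whose details the abstract does not give. It assumes bootstrap sampling with majority voting (bagging-style), and the base learner `train_one_rule` is a hypothetical single-attribute rule learner standing in for a Prism rule inducer. All function names and parameters are illustrative assumptions.

```python
import random
from collections import Counter, defaultdict


def train_one_rule(rows, labels):
    """Hypothetical stand-in base learner (NOT Prism): choose the single
    attribute whose value -> majority-class mapping makes the fewest
    errors on the training sample."""
    default = Counter(labels).most_common(1)[0][0]
    best = None
    for a in range(len(rows[0])):
        by_value = defaultdict(Counter)
        for row, y in zip(rows, labels):
            by_value[row[a]][y] += 1
        mapping = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(mapping[row[a]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, a, mapping)
    _, attr, mapping = best
    # Unseen attribute values fall back to the sample's majority class.
    return lambda row: mapping.get(row[attr], default)


def train_ensemble(rows, labels, n_classifiers=11, seed=0):
    """Train each base classifier on a bootstrap sample
    (sampling with replacement, same size as the training set)."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_classifiers):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(
            train_one_rule([rows[i] for i in idx], [labels[i] for i in idx])
        )
    return ensemble


def predict(ensemble, row):
    """Combine the base classifiers' outputs by simple majority vote."""
    votes = Counter(clf(row) for clf in ensemble)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: the class depends only on the first attribute.
    rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "hot"),
            ("rain", "mild"), ("sunny", "cool"), ("rain", "cool")]
    labels = ["no", "no", "yes", "yes", "no", "yes"]
    ensemble = train_ensemble(rows, labels)
    print(predict(ensemble, ("rain", "hot")))
```

Because each base classifier sees a different bootstrap sample, a quirk that one of them picks up from its own sample can be outvoted by the others, which is the intuition behind using ensembles to reduce overfitting.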

Author information

Authors and Affiliations

School of Computing, Buckingham Building, Lion Terrace, PO1 3HE, Portsmouth, UK
Frederic Stahl & Max Bramer

Corresponding author

Correspondence to Frederic Stahl.

Editor information

Editors and Affiliations

University of Portsmouth, Lion Terrace, Portsmouth, PO1 3HE, United Kingdom
Max Bramer
School of Computing & Mathematical Sciences, University of Greenwich, Park Row 30, London, SE10 9LS, United Kingdom
Miltos Petridis
School of Computing and Informatics, Nottingham Trent University, Burton Street, Nottingham, NG1 4BU, United Kingdom
Lars Nolle

About this paper

Cite this paper

Stahl, F., Bramer, M. (2011). Random Prism: An Alternative to Random Forests. In: Bramer, M., Petridis, M., Nolle, L. (eds) Research and Development in Intelligent Systems XXVIII. SGAI 2011. Springer, London. https://doi.org/10.1007/978-1-4471-2318-7_1

DOI: https://doi.org/10.1007/978-1-4471-2318-7_1
Published: 14 October 2011
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2317-0
Online ISBN: 978-1-4471-2318-7
eBook Packages: Computer Science, Computer Science (R0)
