Summary
Data mining is the overall process of extracting knowledge from data. Among the ways to represent knowledge in a data mining context, rules are one of the most widely used forms. However, a primary issue in data mining is the computational complexity of the rule discovery process, caused by the huge amount of data. To address this, this chapter proposes a novel approach based on previous work that explores Multi-Objective Particle Swarm Optimization (MOPSO) in a rule learning context, called MOPSO-N. MOPSO-N applies MOPSO to search for rules with specific properties, using Pareto dominance concepts. In addition, these rules can be used as an unordered classifier, which makes them more intuitive and easier to understand because each rule can be interpreted independently of the others. The chapter first presents some extensions to MOPSO-N. These extensions are enhancements to the original algorithm intended to increase its performance, and a wide set of experiments is conducted to validate them. Second, the chapter describes its main contribution, a parallel version of MOPSO-N called MOPSO-P, which allows the algorithm to be applied to large datasets. MOPSO-P is evaluated, and the results show that it is efficient for mining rules from large datasets.
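The Pareto dominance concept mentioned above can be illustrated with a minimal sketch. It is not the MOPSO-N algorithm itself: the summary does not state which objectives the chapter optimizes, so the two objectives here (labeled sensitivity and specificity, both maximized), the sample values, and the names `dominates` and `pareto_front` are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Each candidate rule is summarized by a tuple of objective values.
# Assumption: all objectives are to be maximized.
Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (sensitivity, specificity) values for five candidate rules.
rules = [(0.9, 0.4), (0.7, 0.7), (0.6, 0.6), (0.4, 0.95), (0.9, 0.3)]
front = pareto_front(rules)
print(front)
```

In this toy data, the rule at (0.6, 0.6) is dominated by (0.7, 0.7) and the rule at (0.9, 0.3) is dominated by (0.9, 0.4), so three non-dominated rules remain. A multi-objective optimizer typically keeps such a non-dominated set rather than a single best solution.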
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
de Carvalho, A.B., Pozo, A. (2009). Mining Rules: A Parallel Multiobjective Particle Swarm Optimization Approach. In: Coello, C.A.C., Dehuri, S., Ghosh, S. (eds) Swarm Intelligence for Multi-objective Problems in Data Mining. Studies in Computational Intelligence, vol 242. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03625-5_8
DOI: https://doi.org/10.1007/978-3-642-03625-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03624-8
Online ISBN: 978-3-642-03625-5
eBook Packages: Engineering, Engineering (R0)