Abstract
This work presents an evolutionary approach for finding existing relationships among several variables of a multidimensional time series. The proposed model discovers these relationships by means of quantitative association rules. The algorithm, called QARGA (Quantitative Association Rules by Genetic Algorithm), uses a particular encoding of individuals that addresses two basic problems. First, it does not require a prior discretization of the attributes and, second, it is not necessary to fix in advance which variables belong to the antecedent or to the consequent. Therefore, it may discover all underlying dependencies among the different variables. To evaluate the proposed algorithm, three experiments have been carried out. As an initial step, several public datasets have been analyzed in order to compare QARGA with other existing evolutionary approaches. The algorithm has also been applied to synthetic time series, where the relationships are known, to analyze its potential for discovering rules in time series. Finally, a real-world multidimensional time series composed of several climatological variables has been considered. In all three experiments, QARGA shows remarkable performance.
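To make the idea of a quantitative association rule over continuous attributes more concrete, the following minimal Python sketch evaluates one candidate rule directly on raw numeric data, without discretizing it first. It is only an illustration under stated assumptions, not the QARGA implementation: the `Gene` class, the role codes `UNUSED`, `ANTECEDENT` and `CONSEQUENT`, the function names and the toy data are all hypothetical, and only the standard definitions of support and confidence are used.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Hypothetical role codes: each attribute may be left out of the rule,
# or placed in the antecedent or the consequent. Not taken from the paper.
UNUSED, ANTECEDENT, CONSEQUENT = 0, 1, 2


@dataclass
class Gene:
    """One attribute of a candidate rule: a continuous interval plus its role."""
    lower: float
    upper: float
    role: int


def covers(genes: Sequence[Gene], row: Sequence[float], role: int) -> bool:
    """True if the row lies inside every interval whose gene has the given role."""
    return all(g.lower <= x <= g.upper
               for g, x in zip(genes, row) if g.role == role)


def support_confidence(genes: Sequence[Gene],
                       data: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Standard support and confidence of the rule antecedent => consequent."""
    if not data:
        return 0.0, 0.0
    n_ant = 0   # rows covered by the antecedent
    n_rule = 0  # rows covered by antecedent and consequent
    for row in data:
        if covers(genes, row, ANTECEDENT):
            n_ant += 1
            if covers(genes, row, CONSEQUENT):
                n_rule += 1
    support = n_rule / len(data)
    confidence = n_rule / n_ant if n_ant else 0.0
    return support, confidence


# Toy data (invented for illustration): columns are temperature, humidity, ozone.
data = [
    (30.0, 20.0, 80.0),
    (32.0, 25.0, 90.0),
    (15.0, 70.0, 30.0),
    (31.0, 22.0, 40.0),
]

# Candidate rule: temperature in [29, 33] => ozone in [75, 95]; humidity unused.
rule = [
    Gene(29.0, 33.0, ANTECEDENT),
    Gene(0.0, 100.0, UNUSED),
    Gene(75.0, 95.0, CONSEQUENT),
]

support, confidence = support_confidence(rule, data)
print(f"support={support:.3f}, confidence={confidence:.3f}")
```

Because the role of each attribute is part of the candidate rule rather than fixed beforehand, a search procedure such as a genetic algorithm could in principle vary both the interval bounds and the antecedent/consequent assignment; how QARGA actually does this is described in the full article.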
Acknowledgments
The financial support from the Spanish Ministry of Science and Technology, project TIN2007-68084-C02, and from the Junta de Andalucía, project P07-TIC-02611, is acknowledged. The authors also acknowledge the support of the Regional Ministry for the Environment (Consejería de Medio Ambiente) of Andalucía (Spain), which provided all the pollutant time series.
Cite this article
Martínez-Ballesteros, M., Martínez-Álvarez, F., Troncoso, A. et al. An evolutionary algorithm to discover quantitative association rules in multidimensional time series. Soft Comput 15, 2065–2084 (2011). https://doi.org/10.1007/s00500-011-0705-4
DOI: https://doi.org/10.1007/s00500-011-0705-4