Data mining in the form of rule discovery is a growing field of investigation. A recent addition to this field is the use of evolutionary algorithms in the mining process. While evolutionary algorithms have been used extensively in the traditional mining of relational databases, they have hardly, if at all, been used in mining sequences and time series. In this paper we describe our method for evolutionary sequence mining, which uses a specialized piece of hardware for rule evaluation, and show how the method can be applied to several different mining tasks, such as supervised sequence prediction, unsupervised mining of interesting rules, discovering connections between separate time series, and investigating tradeoffs between contradictory objectives by using multiobjective evolution.
Keywords: sequence mining, knowledge discovery, time series, genetic programming, specialized hardware
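To make the idea of tradeoffs between contradictory objectives concrete, the following is a minimal sketch of Pareto dominance, the comparison that multiobjective evolutionary methods generally build on. It is not the method described in this paper: the function names, the two objectives (labelled accuracy and simplicity here), and the scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Scores = Tuple[float, ...]  # one value per objective; higher is better (assumption)


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: List[Scores]) -> List[Scores]:
    """Keep only the candidates that no other candidate dominates, preserving input order."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical (accuracy, simplicity) scores for five candidate rules.
rules = [(0.9, 0.2), (0.7, 0.6), (0.6, 0.5), (0.4, 0.9), (0.9, 0.1)]
front = pareto_front(rules)
print(front)
```

Under these assumed scores, the rules that survive are the ones for which improving one objective would require giving up some of the other, which is the kind of tradeoff a multiobjective search is meant to expose.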