Abstract
Time-stamped data occur frequently in real-world databases. The goal of analysing time-stamped data is very often to find a small group of objects (customers, machine parts, ...) that is important for the business at hand. In contrast, the majority of objects obey well-known rules and are not of interest for the analysis. In terms of a classification task, the small group means that there are very few positive examples, and that these examples share some structure that makes the small group differ significantly from the majority. We may regard such a learning task as learning a local pattern.
Depending on the goal of the data analysis, different aspects of time are relevant, e.g., the particular date, the duration of a certain state, or the number of different states. From the given data, we may generate features that allow us to express the aspect of interest. Here, we investigate the aspect of state change and its representation for learning local patterns in time-stamped data. Besides a simple Boolean representation indicating a change, we use frequency features from information retrieval. We transfer Joachims' theory for text classification to our task and investigate its fit to local pattern learning. The approach has been implemented within the MiningMart system and was successfully applied to real-world insurance data.
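The two representations named above can be illustrated with a minimal sketch. It assumes that each record is a tuple (object_id, timestamp, attribute, value), that a "state change" is a difference between two consecutive values of the same attribute, and that the frequency feature is the common TF-IDF weighting from text retrieval (change count times log of N over the number of objects in which the attribute changed). The record layout, the function names, and the toy data are hypothetical, and the exact weighting used in the paper may differ.

```python
import math
from collections import defaultdict


def change_counts(records):
    """Count state changes per object and attribute.

    records: iterable of (object_id, timestamp, attribute, value) tuples
    (a hypothetical layout chosen for this sketch).
    """
    history = defaultdict(list)
    for obj, ts, attr, value in records:
        history[(obj, attr)].append((ts, value))

    counts = defaultdict(dict)
    for (obj, attr), entries in history.items():
        entries.sort(key=lambda e: e[0])  # order by timestamp
        values = [v for _, v in entries]
        # A change is any pair of consecutive values that differ.
        counts[obj][attr] = sum(1 for a, b in zip(values, values[1:]) if a != b)
    return dict(counts)


def boolean_features(counts):
    """Boolean representation: did the attribute change at all?"""
    return {obj: {attr: n > 0 for attr, n in per_attr.items()}
            for obj, per_attr in counts.items()}


def tfidf_features(counts):
    """Frequency representation: change count weighted by log(N / df),
    where df is the number of objects in which the attribute changed."""
    n_objects = len(counts)
    df = defaultdict(int)
    for per_attr in counts.values():
        for attr, n in per_attr.items():
            if n > 0:
                df[attr] += 1
    return {obj: {attr: (n * math.log(n_objects / df[attr]) if n > 0 else 0.0)
                  for attr, n in per_attr.items()}
            for obj, per_attr in counts.items()}


# Toy data (invented for illustration): three customers, two attributes.
records = [
    ("c1", 1, "tariff", "A"), ("c1", 2, "tariff", "B"), ("c1", 3, "tariff", "A"),
    ("c1", 1, "address", "X"), ("c1", 2, "address", "X"),
    ("c2", 1, "tariff", "A"), ("c2", 2, "tariff", "A"),
    ("c2", 1, "address", "X"), ("c2", 2, "address", "Y"),
    ("c3", 1, "tariff", "A"), ("c3", 2, "tariff", "A"),
    ("c3", 1, "address", "X"), ("c3", 2, "address", "Y"),
]

counts = change_counts(records)
flags = boolean_features(counts)
weights = tfidf_features(counts)
print(flags["c1"], weights["c1"])
```

In this toy data a tariff change occurs for only one customer, so it receives a higher weight than the more common address change. The Boolean representation cannot express this difference, since it only records that a change happened.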
References
Joachims, T.: Learning to Classify Text using Support Vector Machines. Kluwer International Series in Engineering and Computer Science, vol. 668. Kluwer, Dordrecht (2002)
Hand, D., Bolton, R., Adams, N.: Determining hit rate in pattern search. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds.) Pattern Detection and Discovery. LNCS (LNAI), vol. 2447, p. 36. Springer, Heidelberg (2002)
Hand, D.: Pattern detection and discovery. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds.) Pattern Detection and Discovery. LNCS (LNAI), vol. 2447, p. 1. Springer, Heidelberg (2002)
Siebes, A., Struzik, Z.: Complex data: Mining using patterns. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds.) Pattern Detection and Discovery. LNCS (LNAI), vol. 2447, p. 24. Springer, Heidelberg (2002)
Morik, K.: Detecting interesting instances. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds.) Pattern Detection and Discovery. LNCS (LNAI), vol. 2447, pp. 13–23. Springer, Heidelberg (2002)
Cohen, P., Heeringa, B., Adams, N.M.: An unsupervised algorithm for segmenting categorical timeseries into episodes. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds.) Pattern Detection and Discovery. LNCS (LNAI), vol. 2447, pp. 1–12. Springer, Heidelberg (2002)
Box, G.E.P., Jenkins, G.M., Reinsel, G.C.: Time Series Analysis. Forecasting and Control, 3rd edn. Prentice Hall, Englewood Cliffs (1994)
Schlittgen, R., Streitberg, B.H.J.: Zeitreihenanalyse, 9th edn. Oldenbourg (2001)
Keogh, E., Pazzani, M.: Scaling up dynamic time warping for datamining applications. In: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 285–289. ACM Press, New York (2000)
Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Lomet, D.B. (ed.) FODO 1993. LNCS, vol. 730, pp. 69–84. Springer, Heidelberg (1993)
Oates, T., Firoiu, L., Cohen, P.R.: Using dynamic time warping to bootstrap HMM-based clustering of time series. In: Sun, R., Giles, C.L. (eds.) IJCAI-WS 1999. LNCS (LNAI), vol. 1828, pp. 35–52. Springer, Heidelberg (2001)
Geurts, P.: Pattern extraction for time series classification. In: Siebes, A., De Raedt, L. (eds.) PKDD 2001. LNCS (LNAI), vol. 2168, pp. 115–127. Springer, Heidelberg (2001)
Lausen, G., Savnik, I., Dougarjapov, A.: Msts: A system for mining sets of time series. In: Zighed, D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS (LNAI), vol. 1910, pp. 289–298. Springer, Heidelberg (2000)
Das, G., Lin, K.I., Mannila, H., Renganathan, G., Smyth, P.: Rule Discovery from Time Series. In: Agrawal, R., Stolorz, P.E., Piatetsky-Shapiro, G. (eds.) Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD 1998), New York City, pp. 16–22. AAAI Press, Menlo Park (1998)
Guralnik, V., Srivastava, J.: Event detection from time series data. In: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, San Diego, USA, pp. 33–42 (1999)
Morik, K., Wessel, S.: Incremental signal to symbol processing. In: Morik, K., Kaiser, M., Klingspor, V. (eds.) Making Robots Smarter – Combining Sensing and Action through Robot Learning, pp. 185–198. Kluwer Academic Publ., Dordrecht (1999)
Salatian, A., Hunter, J.: Deriving trends in historical and real-time continuously sampled medical data. Journal of Intelligent Information Systems 13, 47–71 (1999)
Agrawal, R., Psaila, G., Wimmers, E.L., Zaït, M.: Querying shapes of histories. In: Proceedings of 21st International Conference on Very Large Data Bases, pp. 502–514. Morgan Kaufmann, San Francisco (1995)
Domeniconi, C., Perng, C.-S., Vilalta, R., Ma, S.: A classification approach for prediction of target events in temporal sequences. In: Elomaa, T., Mannila, H., Toivonen, H. (eds.) PKDD 2002. LNCS (LNAI), vol. 2431, p. 125. Springer, Heidelberg (2002)
Blockeel, H., Fürnkranz, J., Prskawetz, A., Billari, F.: Detecting temporal change in event sequences: An application to demographic data. In: Siebes, A., De Raedt, L. (eds.) PKDD 2001. LNCS (LNAI), vol. 2168, pp. 29–41. Springer, Heidelberg (2001)
Mannila, H., Toivonen, H., Verkamo, A.: Discovering frequent episodes in sequences. In: Procs. of the 1st Int. Conf. on Knowledge Discovery in Databases and Data Mining. AAAI Press, Menlo Park (1995)
Mannila, H., Toivonen, H., Verkamo, A.: Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery 1, 259–290 (1997)
Höppner, F.: Discovery of Core Episodes from Sequences. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds.) Pattern Detection and Discovery. LNCS (LNAI), vol. 2447, pp. 1–12. Springer, Heidelberg (2002)
Allen, J.F.: Towards a general theory of action and time. Artificial Intelligence 23, 123–154 (1984)
Agrawal, R., Imielinski, T., Swami, A.: Database mining: A performance perspective. IEEE Transactions on Knowledge and Data Engineering 5, 914–925 (1993)
Nunez, M.: Learning patterns of behavior by observing system events. In: Lopez de Mantaras, R., Plaza, E. (eds.) ECML 2000. LNCS (LNAI), vol. 1810, pp. 323–330. Springer, Heidelberg (2000)
Klingspor, V., Morik, K.: Learning understandable concepts for robot navigation. In: Morik, K., Klingspor, V., Kaiser, M. (eds.) Making Robots Smarter – Combining Sensing and Action through Robot Learning. Kluwer, Dordrecht (1999)
Rieger, A.D.: Program Optimization for Temporal Reasoning within a Logic Programming Framework. PhD thesis, Universität Dortmund, Dortmund, Germany (1998)
Bettini, C., Jajodia, S., Wang, S.: Time Granularities in Databases, Data Mining, and Temporal Reasoning. Springer, Heidelberg (2000)
Morik, K.: The representation race - preprocessing for handling time phenomena. In: Lopez de Mantaras, R., Plaza, E. (eds.) ECML 2000. LNCS (LNAI), vol. 1810, pp. 4–19. Springer, Heidelberg (2000)
Salton, G., Buckley, C.: Term weighting approaches in automatic text retrieval. Information Processing and Management 24, 513–523 (1988)
Kietz, J.U., Vaduva, A., Zücker, R.: Mining Mart: Combining Case-Based-Reasoning and Multi-Strategy Learning into a Framework to reuse KDD-Application. In: Michalski, R., Brazdil, P. (eds.) Proceedings of the fifth International Workshop on Multistrategy Learning (MSL 2000), Guimarães, Portugal (2000)
Fisseler, J.: Anwendung eines Data Mining-Verfahrens auf Versicherungsdaten. Master’s thesis, Fachbereich Informatik, Universität Dortmund (2003)
Zipf, G.K.: Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Addison-Wesley, Reading (1949)
Mandelbrot, B.: A note on a class of skew distribution functions: Analysis and critique of a paper by H.A. Simon. Information and Control 2, 90–99 (1959)
Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial Intelligence 97, 273–324 (1997)
Liu, H., Motoda, H.: Feature Extraction, Construction, and Selection: A Data Mining Perspective. Kluwer, Dordrecht (1998)
Ritthoff, O., Klinkenberg, R., Fischer, S., Mierswa, I.: A hybrid approach to feature selection and generation using an evolutionary algorithm. Technical Report CI-127/02, Collaborative Research Center 531, University of Dortmund, Dortmund, Germany (2002); ISSN 1433-3325
Morik, K., Scholz, M.: The MiningMart Approach to Knowledge Discovery in Databases. In: Zhong, N., Liu, J. (eds.) Intelligent Technologies for Information Analysis. Springer, Heidelberg (2004)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Morik, K., Köpcke, H. (2005). Features for Learning Local Patterns in Time-Stamped Data. In: Morik, K., Boulicaut, JF., Siebes, A. (eds) Local Pattern Detection. Lecture Notes in Computer Science, vol 3539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11504245_7
DOI: https://doi.org/10.1007/11504245_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26543-6
Online ISBN: 978-3-540-31894-1
eBook Packages: Computer Science, Computer Science (R0)