Features for Learning Local Patterns in Time-Stamped Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3539)

Abstract

Time-stamped data occur frequently in real-world databases. The goal of analysing time-stamped data is very often to find a small group of objects (customers, machine parts, ...) which is important for the business at hand. In contrast, the majority of objects obey well-known rules and are not of interest for the analysis. In terms of a classification task, the small group means that there are very few positive examples and that, within them, there is some structure that makes the small group differ significantly from the majority. We may regard such a learning task as learning a local pattern.

Depending on the goal of the data analysis, different aspects of time are relevant, e.g., the particular date, the duration of a certain state, or the number of different states. From the given data, we may generate features that allow us to express the aspect of interest. Here, we investigate the aspect of state change and its representation for learning local patterns in time-stamped data. Besides a simple Boolean representation indicating a change, we use frequency features from information retrieval. We transfer Joachims's theory for text classification to our task and investigate how well it fits local pattern learning. The approach has been implemented within the MiningMart system and was successfully applied to real-world insurance data.
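
To make the two representations of state change concrete, the following is a minimal sketch and not the authors' MiningMart implementation. It assumes that each object has, per attribute, a list of (timestamp, value) records; that a "change" is any difference between consecutive values in time order; and that the frequency feature follows the usual TF-IDF scheme from information retrieval, with the number of changes as term frequency and the number of objects showing at least one change as document frequency. The function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def count_changes(values):
    """Count how often consecutive values differ in a time-ordered list."""
    return sum(1 for prev, curr in zip(values, values[1:]) if prev != curr)


def change_features(histories):
    """Compute Boolean and TF-IDF change features.

    histories: {object_id: {attribute: [(timestamp, value), ...]}}
    Returns two dicts keyed by object_id, then attribute:
    a Boolean "did it ever change" flag and a TF-IDF weight.
    """
    tf = {}                 # number of changes per object and attribute
    df = defaultdict(int)   # number of objects in which the attribute changes
    for obj, attrs in histories.items():
        tf[obj] = {}
        for attr, records in attrs.items():
            # Order by timestamp only, so values never need to be comparable.
            ordered = [value for _, value in sorted(records, key=lambda r: r[0])]
            tf[obj][attr] = count_changes(ordered)
            if tf[obj][attr] > 0:
                df[attr] += 1

    n = len(histories)
    boolean = {o: {a: c > 0 for a, c in f.items()} for o, f in tf.items()}
    # Assumed weighting: tf * log(n / df); zero when the attribute never changes.
    tfidf = {
        o: {a: (c * math.log(n / df[a]) if c > 0 else 0.0) for a, c in f.items()}
        for o, f in tf.items()
    }
    return boolean, tfidf


# Hypothetical toy data: three insurance customers, two attributes.
histories = {
    "c1": {"tariff": [(1, "A"), (2, "B"), (3, "A")], "address": [(1, "X"), (2, "X")]},
    "c2": {"tariff": [(1, "A"), (2, "A")], "address": [(1, "Y"), (2, "Z")]},
    "c3": {"tariff": [(1, "A"), (2, "A")], "address": [(1, "Y"), (2, "Y")]},
}
boolean, tfidf = change_features(histories)
```

Under these assumptions, an attribute that changes for only a few objects receives a high weight for exactly those objects, which is the property that makes frequency features a candidate for separating a small group from the majority.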

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Morik, K., Köpcke, H. (2005). Features for Learning Local Patterns in Time-Stamped Data. In: Morik, K., Boulicaut, JF., Siebes, A. (eds) Local Pattern Detection. Lecture Notes in Computer Science, vol 3539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11504245_7

  • DOI: https://doi.org/10.1007/11504245_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26543-6

  • Online ISBN: 978-3-540-31894-1

  • eBook Packages: Computer Science, Computer Science (R0)
