Finding Persisting States for Knowledge Discovery in Time Series

  • Fabian Mörchen
  • Alfred Ultsch
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

Knowledge Discovery in time series usually requires symbolic time series. Many discretization methods that convert numeric time series to symbolic time series ignore the temporal order of values. This often leads to symbols that do not correspond to states of the process generating the time series. We propose a new method for meaningful unsupervised discretization of numeric time series called “Persist”, based on the Kullback-Leibler divergence between the marginal and the self-transition probability distributions of the discretization symbols. In evaluations with artificial and real-life data, it clearly outperforms existing methods.
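To make the central quantity concrete, the following is a minimal sketch of how a divergence between the marginal and the self-transition probability of each symbol could be computed for an already discretized series. The abstract does not give the exact formula, so everything specific here is an assumption made for illustration: treating each symbol as a two-outcome distribution, symmetrising the Kullback-Leibler divergence, attaching a sign, averaging over symbols, and the function names bernoulli_kl, persistence_scores, and mean_persistence.

```python
import math
from collections import Counter


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence (in nats) between two-outcome distributions (p, 1-p) and (q, 1-q)."""
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def persistence_scores(symbols):
    """Signed divergence between self-transition and marginal probability, per symbol."""
    n = len(symbols)
    counts = Counter(symbols)
    # Occurrences that have a successor, and occurrences followed by the same symbol.
    leaving = Counter(symbols[:-1])
    staying = Counter(a for a, b in zip(symbols, symbols[1:]) if a == b)

    scores = {}
    for s in counts:
        marginal = counts[s] / n
        self_trans = staying[s] / leaving[s] if leaving[s] else 0.0
        # Assumption: symmetrised KL, positive when the symbol persists
        # more often than its marginal frequency alone would suggest.
        skl = 0.5 * (bernoulli_kl(self_trans, marginal)
                     + bernoulli_kl(marginal, self_trans))
        sign = 1.0 if self_trans >= marginal else -1.0
        scores[s] = sign * skl
    return scores


def mean_persistence(symbols):
    """Average of the per-symbol scores (assumed aggregation)."""
    scores = persistence_scores(symbols)
    return sum(scores.values()) / len(scores)
```

Under these assumptions, a sequence with long runs of the same symbol, such as ten "a" followed by ten "b", yields a positive mean score, while a strictly alternating sequence yields a negative one. A discretization method in this spirit would prefer bin boundaries whose symbols score higher.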

Copyright information

© Springer Berlin · Heidelberg 2006

Authors and Affiliations

  • Fabian Mörchen (1)
  • Alfred Ultsch (1)
  1. Data Bionics Research Group, Philipps-University Marburg, Marburg, Germany
