Efficient mining of understandable patterns from multivariate interval time series

Data Mining and Knowledge Discovery

Abstract

We present a new method for the understandable description of local temporal relationships in multivariate data, called Time Series Knowledge Mining (TSKM). We define the Time Series Knowledge Representation (TSKR) as a new language for expressing temporal knowledge in time interval data. The patterns have a hierarchical structure, with levels corresponding to the temporal concepts of duration, coincidence, and partial order. The patterns are very compact but offer details for each element on demand. In comparison with related approaches, the TSKR is shown to have advantages in robustness, expressivity, and comprehensibility. The search for coincidence and partial order in interval data can be formulated as instances of the well-known frequent itemset problem. Efficient algorithms for the discovery of the patterns are adapted accordingly. A novel form of search space pruning effectively reduces the size of the mining result to ease interpretation and speed up the algorithms. Human interaction is used during the mining to analyze and validate partial results as early as possible and to guide further processing steps. The efficacy of the methods is demonstrated using two real-life data sets. In an application to sports medicine, the results were recognized as valid and useful by an expert in the field.
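To make the reduction to the frequent itemset problem concrete, the following is a minimal sketch and not the algorithm from the article. It assumes labeled intervals with integer, inclusive start and end points, treats every time tick as one transaction holding the labels active at that tick, and counts itemsets by brute force. The names `active_sets`, `frequent_coincidences`, `horizon`, and `min_support`, as well as the example labels, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def active_sets(intervals, horizon):
    """Build one 'transaction' per time tick: the labels whose interval covers it.

    Assumption: intervals are (label, start, end) with inclusive integer bounds.
    """
    return [
        frozenset(label for label, start, end in intervals if start <= t <= end)
        for t in range(horizon)
    ]


def frequent_coincidences(intervals, horizon, min_support):
    """Count every non-empty set of simultaneously active labels.

    Support is the number of ticks at which all labels of the set are active.
    Brute-force enumeration; only suitable for a handful of labels.
    """
    counts = Counter()
    for items in active_sets(intervals, horizon):
        for size in range(1, len(items) + 1):
            for combo in combinations(sorted(items), size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


# Toy data: three interval series with overlapping states (illustrative only).
intervals = [
    ("A high", 0, 5),
    ("B rising", 2, 7),
    ("C active", 4, 9),
]
result = frequent_coincidences(intervals, horizon=10, min_support=3)
for labels, support in sorted(result.items()):
    print(labels, support)
```

In this toy data, "A high" and "B rising" overlap for four ticks and therefore pass the threshold of three, while "A high" and "C active" overlap for only two ticks and are dropped. A practical miner would replace the brute-force counting with a dedicated frequent or closed itemset algorithm.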



Author information

Corresponding author

Correspondence to Fabian Mörchen.

Additional information

Communicated by Johannes Gehrke.

About this article

Cite this article

Mörchen, F., Ultsch, A. Efficient mining of understandable patterns from multivariate interval time series. Data Min Knowl Disc 15, 181–215 (2007). https://doi.org/10.1007/s10618-007-0070-1


  • DOI: https://doi.org/10.1007/s10618-007-0070-1
