# Finding the longest common sub-pattern in sequences of temporal intervals

## Abstract

We study the problem of finding the longest common sub-pattern (LCSP) shared by two sequences of temporal intervals. In particular, we are interested in finding the LCSP of the corresponding arrangements. Arrangements of temporal intervals are a powerful way to encode multiple concurrent labeled events that each have a time duration. Discovering commonalities among such arrangements is useful in a wide range of scientific fields and applications, as can be seen from the number and diversity of the datasets we use in our experiments. In this paper, we define the LCSP problem and prove that it is NP-complete by demonstrating a connection between graphs and arrangements of temporal intervals. This connection leads to a series of interesting open problems. In addition, we provide an exact algorithm for the LCSP problem, and we propose and experiment with three under-approximation techniques that run in polynomial time and space. We also introduce two upper bounds for LCSP and study their suitability for speeding up 1-NN search. Experiments are performed on seven datasets taken from a wide range of real application domains, plus two synthetic datasets. Finally, we describe several application cases that demonstrate the need for and suitability of LCSP.
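To make the problem statement concrete, the following is a minimal sketch of an exhaustive LCSP search on two tiny event-interval sequences. It is not the paper's exact algorithm or its approximation techniques. The function names (`relation`, `lcsp_size`), the `(label, start, end)` tuple layout, and the encoding of a pairwise relation as the signs of four endpoint comparisons are all assumptions made for illustration; the paper may define its relation set differently. The search takes exponential time in the worst case, which is consistent with the NP-completeness result stated above.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def relation(a, b):
    """Qualitative relation between two intervals (assumed encoding).

    Each interval is a (label, start, end) tuple. The relation is the
    tuple of signs of four endpoint comparisons, which distinguishes
    cases such as 'before', 'overlaps' and 'contains'.
    """
    _, s1, e1 = a
    _, s2, e2 = b
    return (sign(s1 - s2), sign(e1 - e2), sign(e1 - s2), sign(s1 - e2))


def lcsp_size(seq_a, seq_b):
    """Size of the largest common sub-pattern, found by backtracking.

    A common sub-pattern is a one-to-one matching between intervals of
    seq_a and seq_b that preserves labels and all pairwise relations.
    """
    best = 0

    def extend(matched, start):
        nonlocal best
        best = max(best, len(matched))
        for ia in range(start, len(seq_a)):
            # Prune: even matching every remaining interval cannot beat best.
            if len(matched) + len(seq_a) - ia <= best:
                break
            for ib in range(len(seq_b)):
                if any(ib == jb for _, jb in matched):
                    continue  # each interval of seq_b is used at most once
                if seq_a[ia][0] != seq_b[ib][0]:
                    continue  # labels must agree
                if all(
                    relation(seq_a[ja], seq_a[ia]) == relation(seq_b[jb], seq_b[ib])
                    for ja, jb in matched
                ):
                    extend(matched + [(ia, ib)], ia + 1)

    extend([], 0)
    return best


# Hypothetical example data: A overlaps B in both sequences, but C
# follows both in s while it overlaps A and lies inside B in t.
s = [("A", 0, 4), ("B", 2, 6), ("C", 7, 9)]
t = [("A", 10, 15), ("B", 12, 18), ("C", 14, 16)]
print(lcsp_size(s, t))  # 2
```

Here the only common sub-pattern of size two is "A overlaps B"; adding C breaks the match because its relations to A and B differ between the two sequences.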

## Keywords

- Temporal intervals
- Longest common sub-pattern
- Event-interval sequences