Journal of Intelligent Information Systems, Volume 42, Issue 2, pp 283–305

A contribution to the discovery of multidimensional patterns in healthcare trajectories

  • Elias Egho (email author)
  • Nicolas Jay
  • Chedy Raïssi
  • Dino Ienco
  • Pascal Poncelet
  • Maguelonne Teisseire
  • Amedeo Napoli

Abstract

Sequential pattern mining aims to extract correlations from temporal data. Many methods have been proposed to enumerate either sequences of set-valued data (i.e., itemsets) or sequences of multidimensional items. In real-world scenarios, however, data sequences are described by a combination of both multidimensional items and itemsets, and traditional approaches cannot handle such heterogeneous descriptions. In this paper we propose a new approach called MMISP (Mining Multidimensional Itemset Sequential Patterns) to extract patterns from complex sequential databases that include both multidimensional items and itemsets. The novelties of the proposal lie in: (i) the way in which the data are efficiently compressed; (ii) the ability to reuse and adopt sequential pattern mining algorithms; and (iii) the extraction of a new kind of pattern. We introduce a case study on real-world data from a regional healthcare system and point out the usefulness of the extracted patterns. Additional experiments on synthetic data highlight the efficiency and scalability of MMISP.
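To make the kind of data described above concrete, the following minimal Python sketch models a sequence whose events each combine a multidimensional item with an itemset, and counts how many sequences contain a given pattern. It is an illustration only and not the MMISP algorithm: the event layout, the `*` wildcard used to generalise a dimension, the function names, and the toy hospital, diagnosis and procedure values are all assumptions introduced here, not taken from the paper.

```python
from typing import FrozenSet, List, Tuple

# Assumed layout: an event pairs a multidimensional item (a tuple of
# dimension values) with an itemset (a set of items).
Event = Tuple[Tuple[str, ...], FrozenSet[str]]
Sequence = List[Event]

WILDCARD = "*"  # hypothetical marker meaning "any value on this dimension"


def event_matches(pattern: Event, event: Event) -> bool:
    """True if every pattern dimension equals the event's value (or is a
    wildcard) and the pattern itemset is a subset of the event itemset."""
    p_dims, p_items = pattern
    e_dims, e_items = event
    if len(p_dims) != len(e_dims):
        return False
    dims_ok = all(p == WILDCARD or p == e for p, e in zip(p_dims, e_dims))
    return dims_ok and p_items <= e_items


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """True if the pattern events match events of the sequence in order
    (greedy left-to-right scan, gaps allowed)."""
    pos = 0
    for event in sequence:
        if pos < len(pattern) and event_matches(pattern[pos], event):
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Number of sequences in the database that contain the pattern."""
    return sum(1 for seq in database if contains(seq, pattern))


# Toy patient trajectories: (hospital, diagnosis) plus a set of procedures.
database: List[Sequence] = [
    [(("hospital_A", "pneumonia"), frozenset({"xray", "blood_test"})),
     (("hospital_B", "lung_cancer"), frozenset({"ct_scan"}))],
    [(("hospital_A", "bronchitis"), frozenset({"xray"})),
     (("hospital_A", "lung_cancer"), frozenset({"ct_scan", "biopsy"}))],
    [(("hospital_C", "pneumonia"), frozenset({"blood_test"}))],
]

pattern: Sequence = [
    (("hospital_A", WILDCARD), frozenset({"xray"})),
    ((WILDCARD, "lung_cancer"), frozenset({"ct_scan"})),
]

print(support(database, pattern))  # 2
```

The pattern is contained in the first two trajectories but not in the third, so its support is 2. A frequent pattern miner would enumerate such patterns and keep those whose support reaches a chosen threshold.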

Keywords

Complex sequential patterns · Multidimensional sequential patterns · Data mining · Complex data

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Elias Egho (1), email author
  • Nicolas Jay (1)
  • Chedy Raïssi (1)
  • Dino Ienco (2)
  • Pascal Poncelet (2)
  • Maguelonne Teisseire (2)
  • Amedeo Napoli (1)

  1. LORIA (CNRS - Université de Lorraine)/Inria Nancy Grand Est, Nancy, France
  2. Irstea, UMR TETIS/LIRMM, University of Montpellier 2, Montpellier, France