Healthcare Trajectory Mining by Combining Multidimensional Component and Itemsets

  • Elias Egho
  • Chedy Raïssi
  • Dino Ienco
  • Nicolas Jay
  • Amedeo Napoli
  • Pascal Poncelet
  • Catherine Quantin
  • Maguelonne Teisseire
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7765)

Abstract

Sequential pattern mining aims to extract correlations from temporal data. Many methods have been proposed to enumerate either sequences of set-valued data (i.e., itemsets) or sequences containing multidimensional items. In real-world scenarios, however, data sequences are described as events carrying both multidimensional items and set-valued information, and traditional approaches cannot exploit these rich heterogeneous descriptions. For example, in the healthcare domain, hospitalizations are defined as sequences of multidimensional attributes (e.g., Hospital or Diagnosis) associated with two sets: a set of medical procedures (e.g., {Radiography, Appendectomy}) and a set of medical drugs (e.g., {Aspirin, Paracetamol}). In this paper we propose a new approach called MMISP (Mining Multidimensional Itemset Sequential Patterns) to extract patterns from complex sequences that include both dimensional items and itemsets. The novelties of the proposal lie in: (i) the way in which the data can be efficiently compressed; (ii) the ability to reuse and adapt existing sequential pattern mining algorithms; and (iii) the extraction of a new kind of pattern. We present a case study on real data aggregated from a regional healthcare system and point out the usefulness of the extracted patterns. Additional experiments on synthetic data highlight the efficiency and scalability of our approach.
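
The kind of data described above can be illustrated with a small Python sketch. It is not the MMISP algorithm itself, whose details are not given in this abstract; it only shows one plausible way to represent a hospitalization as a pair of dimensional attributes plus two itemsets, and to count how many patient sequences contain a given pattern. The names `Event`, `event_matches`, `contains` and `support`, the use of `None` as a wildcard on a dimension, and all hospital and diagnosis values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One hospitalization (hypothetical representation, not from the paper)."""
    dims: tuple            # (hospital, diagnosis); None acts as a wildcard in patterns
    procedures: frozenset  # set of medical procedures
    drugs: frozenset       # set of medical drugs


def event_matches(pattern_ev, data_ev):
    """A pattern event matches if its dimensions agree (or are wildcards)
    and both of its itemsets are subsets of the data event's itemsets."""
    dims_ok = all(p is None or p == d
                  for p, d in zip(pattern_ev.dims, data_ev.dims))
    return (dims_ok
            and pattern_ev.procedures <= data_ev.procedures
            and pattern_ev.drugs <= data_ev.drugs)


def contains(sequence, pattern):
    """True if the pattern events match events of the sequence in order
    (standard subsequence containment, assumed here)."""
    remaining = iter(sequence)
    # Each pattern event consumes the iterator up to its first match.
    return all(any(event_matches(p, e) for e in remaining) for p in pattern)


def support(database, pattern):
    """Number of sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database)


# Made-up patient trajectories using the item names from the abstract.
patient_1 = [
    Event(("H1", "Appendicitis"),
          frozenset({"Radiography", "Appendectomy"}),
          frozenset({"Aspirin", "Paracetamol"})),
    Event(("H2", "Checkup"),
          frozenset({"Radiography"}),
          frozenset({"Paracetamol"})),
]
patient_2 = [
    Event(("H1", "Appendicitis"),
          frozenset({"Radiography"}),
          frozenset({"Aspirin"})),
]
database = [patient_1, patient_2]

# Any diagnosis at hospital H1 with a radiography and aspirin.
pattern_a = [Event(("H1", None), frozenset({"Radiography"}), frozenset({"Aspirin"}))]

# An appendectomy at H1, later followed anywhere by radiography with paracetamol.
pattern_b = [
    Event(("H1", None), frozenset({"Appendectomy"}), frozenset()),
    Event((None, None), frozenset({"Radiography"}), frozenset({"Paracetamol"})),
]

print(support(database, pattern_a))
print(support(database, pattern_b))
```

With this toy database, the first pattern is contained in both trajectories and the second only in the first patient's trajectory. A real miner would enumerate candidate patterns and prune by a minimum support threshold rather than test hand-written patterns.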

Keywords

Sequential Patterns; Multi-dimensional Sequential Patterns; Data Mining

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Elias Egho
    • 1
  • Chedy Raïssi
    • 4
  • Dino Ienco
    • 2
    • 3
  • Nicolas Jay
    • 1
  • Amedeo Napoli
    • 1
  • Pascal Poncelet
    • 2
    • 3
  • Catherine Quantin
    • 5
  • Maguelonne Teisseire
    • 2
    • 3
  1. Orpailleur Team, LORIA, France
  2. Irstea, UMR TETIS, Montpellier, France
  3. LIRMM, Univ. Montpellier 2, Montpellier, France
  4. INRIA, France
  5. Department of Biostatistics and Medical Information, Dijon, France
