Reliable Early Classification on Multivariate Time Series with Numerical and Categorical Attributes

  • Yu-Feng Lin
  • Hsuan-Hsu Chen
  • Vincent S. Tseng
  • Jian Pei
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9077)

Abstract

Early classification on multivariate time series has recently emerged as a novel and important topic in data mining, with wide applications such as early disease detection in healthcare. Most existing studies on this topic focus only on univariate time series, while the few recent works on multivariate time series consider only numerical attributes and are not applicable to multivariate time series containing both numerical and categorical attributes. In this paper, we present a novel methodology named REACT (Reliable EArly ClassificaTion), which is the first work to address the construction of an effective classifier on multivariate time series with numerical and categorical attributes in a serial manner, so as to guarantee stability of accuracy compared to classifiers that use full-length time series. Furthermore, we employ GPU parallel computing to develop an extended mechanism for building the early classifier efficiently. Experimental results on real datasets show that REACT significantly outperforms the state-of-the-art method in terms of accuracy and earliness, and that the GPU implementation is verified to improve efficiency by several orders of magnitude.
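To make the notions of early classification and earliness concrete, the following is a minimal sketch of generic shapelet-style early classification on a single numerical series. It is not the REACT algorithm: the function names, the shapelets, the distance thresholds, and the toy data are all hypothetical, and categorical attributes and multiple variables are not handled. A label is emitted at the first time step where the most recent window lies within a shapelet's threshold, and earliness is taken here as the fraction of the series observed at that point.

```python
from math import sqrt


def euclid(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def early_classify(series, shapelets):
    """Scan growing prefixes of `series` and stop at the first shapelet match.

    `shapelets` is a list of (pattern, threshold, label) triples; these are
    illustrative assumptions, not values from the paper.
    Returns (label, observed_length), or (None, len(series)) if nothing matches.
    """
    for t in range(1, len(series) + 1):
        for pattern, threshold, label in shapelets:
            m = len(pattern)
            if t < m:
                continue  # prefix is still shorter than this shapelet
            window = series[t - m:t]  # the m most recent observations
            if euclid(window, pattern) <= threshold:
                return label, t
    return None, len(series)


# Hypothetical shapelets and a toy series.
shapelets = [
    ([0.0, 1.0, 2.0], 0.5, "rising"),
    ([2.0, 1.0, 0.0], 0.5, "falling"),
]
series = [0.1, 0.0, 1.1, 2.0, 2.1, 2.0, 2.2, 2.1]

label, t = early_classify(series, shapelets)
earliness = t / len(series)  # fraction of the series needed for a decision
print(label, t, earliness)
```

In this toy run the decision is made after 4 of 8 observations, so the rest of the series is never needed. A lower earliness value means an earlier decision under this definition.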

Keywords

Early classification; Multivariate time series; Serial classifier; Numerical and categorical attributes; Shapelets; GPU

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Yu-Feng Lin (1)
  • Hsuan-Hsu Chen (1)
  • Vincent S. Tseng (2)
  • Jian Pei (3)
  1. Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, Republic of China
  2. Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
  3. School of Computing Science, Simon Fraser University, Burnaby, Canada
