Preprocessing Time Series Data for Classification with Application to CRM

  • Yiming Yang
  • Qiang Yang
  • Wei Lu
  • Jialin Pan
  • Rong Pan
  • Chenhui Lu
  • Lei Li
  • Zhenxing Qin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3809)

Abstract

We develop an innovative data preprocessing algorithm for classifying customers from unbalanced time series data. The problem is directly motivated by an application that aims to uncover customers' churning behavior in the telecommunications industry. We model it as a sequential classification problem that is challenging in three respects: the elements of the sequences are multi-dimensional, the sequences are uneven in length, and the classes of the data are highly unbalanced. Our solution integrates model-based clustering with the preprocessing algorithm for the time series data. In this paper, we provide the theory and algorithms for the task, and empirically demonstrate that the method is effective in determining the customer class for CRM applications in the telecommunications industry.

Keywords

Classification of time series data for Telecommunications Applications 

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Yiming Yang (1)
  • Qiang Yang (2)
  • Wei Lu (1)
  • Jialin Pan (1)
  • Rong Pan (2)
  • Chenhui Lu (1)
  • Lei Li (1)
  • Zhenxing Qin (3)

  1. Software Institute, Zhongshan University, Guangzhou, China
  2. Department of Computer Science, Hong Kong University of Science and Technology, Kowloon, Hong Kong, China
  3. Faculty of Information Technology, University of Technology, Broadway, Sydney, Australia
