Mining the Most Interesting Patterns from Multiple Phenotypes Medical Data

  • Ying Yin
  • Bin Zhang
  • Yuhai Zhao
  • Guoren Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4259)

Abstract

Mining the most interesting patterns from multiple-phenotype medical data poses a great challenge for previous work, which focuses only on bi-phenotype (such as abnormal vs. normal) medical data. Association rule mining can be applied to analyze such datasets, but most of the rules generated are either redundant or meaningless. In this paper, we define two interesting patterns, namely VP (an acronym for “Vital Pattern”) and PP (an acronym for “Protect Pattern”), based on a statistical metric. We also propose a new algorithm called MVP that is specially designed to discover these two patterns from multiple-phenotype medical data. The algorithm generates useful rules for medical researchers, from which a clear causal graph can be induced. The experimental results demonstrate that the proposed method enables the user to focus on fewer rules and ensures that the surviving rules are all interesting from the viewpoint of the medical domain. The classifier built on the rules generated by our method outperforms existing classifiers.

Keywords

Association Rule, Interesting Pattern, Optimal Rule, Interesting Measure, Rule Discovery
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ying Yin (1)
  • Bin Zhang (1)
  • Yuhai Zhao (1)
  • Guoren Wang (1)
  1. Department of Computer Science and Engineering, Northeastern University, Shenyang, P.R. China
