An Association Rule Analysis Framework for Complex Physiological and Genetic Data

  • Jing He
  • Yanchun Zhang
  • Guangyan Huang
  • Yefei Xin
  • Xiaohui Liu
  • Hao Lan Zhang
  • Stanley Chiang
  • Hailun Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7231)


Physiological and genetic information is critical to the successful diagnosis and prognosis of complex diseases. In this paper, we introduce a support-confidence-correlation framework for discovering meaningful and interesting association rules between complex physiological and genetic data, with the aim of analysing the factors behind diseases such as type II diabetes (T2DM). We propose a novel Multivariate and Multidimensional Association Rule mining system based on Change Detection (MMARCD). Given a complex data set u_i (e.g. u_1 numerical data streams, u_2 images, u_3 videos, u_4 DNA/RNA sequences) observed at each time tick t, MMARCD incrementally finds correlations and hidden variables that summarise the key relationships across the entire system. Based upon MMARCD, we are able to construct a correlation network for human diseases.
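As a rough illustration of what a support-confidence-correlation evaluation of a single rule might look like, the sketch below computes the three quantities for a rule A -> B over a set of records. It is not the MMARCD system: the abstract does not say which correlation measure the framework uses, so lift is assumed here as a stand-in, and the function name `rule_metrics`, the item labels, and the toy records are all hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Lift is used as the correlation measure; this is an assumption,
    not necessarily the measure used in the paper.
    """
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    # Count records containing the antecedent, the consequent, and both.
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)

    support = n_ac / n                                # P(A and B)
    confidence = n_ac / n_a if n_a else 0.0           # P(B | A)
    lift = confidence / (n_c / n) if n_c else 0.0     # P(B | A) / P(B)
    return support, confidence, lift


# Made-up toy records (hypothetical labels, not data from the paper).
records = [
    frozenset({"high_glucose", "variant_A", "T2DM"}),
    frozenset({"high_glucose", "T2DM"}),
    frozenset({"variant_A"}),
    frozenset({"high_glucose", "variant_A", "T2DM"}),
    frozenset({"normal_glucose"}),
]

support, confidence, lift = rule_metrics(
    records, {"high_glucose", "variant_A"}, {"T2DM"}
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the antecedent and consequent co-occur more often than independence would predict, which is the kind of check a correlation term adds on top of support and confidence alone.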


Association Rule, Retinal Image, Correlation Network, Fuzzy Association Rule, Physiological Observation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jing He
  • Yanchun Zhang
  • Guangyan Huang
  • Yefei Xin
  • Xiaohui Liu (1)
  • Hao Lan Zhang (2)
  • Stanley Chiang (3)
  • Hailun Zhang (4)

  1. Brunel University, UK
  2. NIT, Zhejiang University, China
  3. TLC Medical PTY LTD, Australia
  4. Step High Technology Co Ltd, China
