Part of the book series: Studies in Computational Intelligence ((SCI,volume 206))

Summary

Mining significantly relevant patterns that involve multiple values is a difficult but important problem that cuts across many disciplines. Multi-value association patterns, which generalize sequentially ordered patterns, are sets of associated values extracted from the sampling outcomes of a random N-tuple. Because they are patterns of values drawn from multiple variables, they are more specifically defined than their corresponding variable patterns, and they are also easier to interpret. Normally, they can be detected by statistical testing: a pattern is detected if the occurrence of the pattern event deviates significantly from the occurrence expected under a prior model or null hypothesis. When the null hypothesis presumes the values of a pattern to be independent, the alternative hypothesis asserts that the values as a whole are associated, while allowing some values within the detected set to be independent. Recently, a special type of multi-value association pattern has been proposed, which we call the nested high-order pattern (NHOP); it is a subtype of the high-order pattern (HOP). We discuss these patterns here together with a related one called the consigned pattern (CP). Evaluations based on experiments with synthetic and biomolecular data are also included.
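The testing idea in the summary can be illustrated with a minimal sketch. It assumes the simplest possible setting: each sample is a tuple of discrete values, a pattern is a set of (variable position, value) pairs, and the null hypothesis is mutual independence of those values. The function name `standardized_residual`, the data layout, and the 1.96 cutoff are assumptions made for illustration; the statistic is a plain standardized residual and is not necessarily the test used in the chapter.

```python
from math import sqrt

def standardized_residual(samples, pattern):
    """Compare observed and expected counts of a multi-value pattern.

    samples: list of equal-length tuples (outcomes of a random N-tuple).
    pattern: dict mapping a variable position to the value it must take.
    Returns (observed, expected, z). All names here are hypothetical.
    """
    n = len(samples)
    # Observed count: samples in which every value of the pattern co-occurs.
    observed = sum(all(s[i] == v for i, v in pattern.items()) for s in samples)
    # Expected count under the null hypothesis that the values are
    # mutually independent: n times the product of the marginal frequencies.
    expected = float(n)
    for i, v in pattern.items():
        expected *= sum(s[i] == v for s in samples) / n
    if expected == 0:
        return observed, expected, 0.0
    # Standardized residual; a large positive value suggests association.
    z = (observed - expected) / sqrt(expected)
    return observed, expected, z

# Toy data: "a" always co-occurs with "x", and "b" with "y".
associated = [("a", "x")] * 20 + [("b", "y")] * 20
obs, exp, z = standardized_residual(associated, {0: "a", 1: "x"})
print(obs, exp, z > 1.96)
```

On this toy data the pattern ("a", "x") occurs twice as often as independence would predict, so it would be flagged at the assumed 1.96 cutoff. A fuller treatment would adjust the residual for its variance and account for testing many candidate patterns.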




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Lui, T.W.H., Chiu, D.K.Y. (2009). Multi-value Association Patterns and Data Mining. In: Abraham, A., Hassanien, AE., de Leon F. de Carvalho, A.P., Snášel, V. (eds) Foundations of Computational Intelligence Volume 6. Studies in Computational Intelligence, vol 206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01091-0_8

  • DOI: https://doi.org/10.1007/978-3-642-01091-0_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01090-3

  • Online ISBN: 978-3-642-01091-0

  • eBook Packages: Engineering, Engineering (R0)
