Multi-value Association Patterns and Data Mining

Lui, Thomas W. H.; Chiu, David K. Y.

doi:10.1007/978-3-642-01091-0_8

Part of the book series: Studies in Computational Intelligence (SCI, volume 206)

Summary

Mining significant patterns that involve multiple values is a difficult but important problem that spans many disciplines. Multi-value association patterns, which generalize sequentially ordered patterns, are sets of associated values extracted from the sampling outcomes of a random N-tuple. Because they are patterns of values drawn from multiple variables, they are more specifically defined than the corresponding variable patterns, and they are also easier to interpret. Normally, such a pattern is detected by a statistical test of whether the observed occurrence of the pattern event deviates significantly from the occurrence expected under a prior model or null hypothesis. When the null hypothesis presumes that the values of a pattern are independent, the alternative hypothesis asserts that the values as a whole are associated, which still allows some values within the detected set to be independent of one another. Recently, a special type of multi-value association pattern has been proposed, which we call the nested high-order pattern (NHOP); it is a subtype of the high-order pattern (HOP). We discuss these patterns here, together with a related one called the consigned pattern (CP). Evaluations based on experiments with synthetic and biomolecular data are also included.
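
The detection step described in the summary can be illustrated with a minimal sketch. It assumes the simplest independence null model, in which the expected count of a pattern is the sample size times the product of the marginal frequencies of its values, and it scores the deviation with a plain standardized residual against a 1.96 cut-off. Neither the statistic nor the threshold is taken from the chapter, and the function names and toy data are hypothetical.

```python
import math


def standardized_residual(samples, pattern):
    """Score a multi-value pattern against an independence null model.

    samples: list of N-tuples (one observed outcome per tuple).
    pattern: dict mapping a variable index to the value it must take.
    Returns (observed count, expected count, standardized residual).
    """
    n = len(samples)
    # Observed: outcomes in which every value of the pattern occurs together.
    observed = sum(
        all(row[i] == v for i, v in pattern.items()) for row in samples
    )
    # Expected under independence: n times the product of marginal frequencies.
    expected = float(n)
    for i, v in pattern.items():
        expected *= sum(1 for row in samples if row[i] == v) / n
    if expected == 0:
        return observed, expected, 0.0
    z = (observed - expected) / math.sqrt(expected)
    return observed, expected, z


def is_significant(samples, pattern, threshold=1.96):
    """Assumed one-sided cut-off: flag only positive deviations."""
    return standardized_residual(samples, pattern)[2] > threshold


# Toy data (hypothetical): variables 0 and 1 are associated,
# variable 2 is independent of both.
samples = (
    [("a", "x", "p")] * 20 + [("a", "x", "q")] * 20
    + [("b", "y", "p")] * 20 + [("b", "y", "q")] * 20
    + [("a", "y", "p")] * 5 + [("a", "y", "q")] * 5
    + [("b", "x", "p")] * 5 + [("b", "x", "q")] * 5
)

for pattern in ({0: "a", 1: "x"}, {0: "a", 2: "p"}, {0: "a", 1: "x", 2: "p"}):
    print(pattern, standardized_residual(samples, pattern),
          is_significant(samples, pattern))
```

In this toy data the pair (a, x) and the triple (a, x, p) are both flagged, while the pair (a, p) is not. That mirrors the point made in the summary: a pattern can be significant as a whole even though some of the values inside it are independent of each other.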

Author information

Authors and Affiliations

Department of Computing and Information Science, University of Guelph,
Thomas W. H. Lui & David K. Y. Chiu

Editor information

Editors and Affiliations

Machine Intelligence Research Labs, (MIR Labs), Scientific Network for Innovation and Research Excellence, Auburn, P.O. Box 2259, 98071-2259, Washington, USA
Ajith Abraham
College of Business Administration, Quantitative and Information System Department, Kuwait University, P.O. Box 5486, 13055, Safat, Kuwait
Aboul-Ella Hassanien
Department of Computer Science, University of São Paulo, Caixa Postal 668, 13560-970, Sao Carlos, SP, Brazil
André Ponce de Leon F. de Carvalho
Dept. Computer Science, Technical University Ostrava, Tr. 17. Listopadu 15, 708 33, Ostrava, Czech Republic
Václav Snášel

About this chapter

Cite this chapter

Lui, T.W.H., Chiu, D.K.Y. (2009). Multi-value Association Patterns and Data Mining. In: Abraham, A., Hassanien, AE., de Leon F. de Carvalho, A.P., Snášel, V. (eds) Foundations of Computational Intelligence Volume 6. Studies in Computational Intelligence, vol 206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01091-0_8

DOI: https://doi.org/10.1007/978-3-642-01091-0_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01090-3
Online ISBN: 978-3-642-01091-0
eBook Packages: Engineering, Engineering (R0)
