Abstract
Correlation is an important statistical measure for estimating dependencies between numerical attributes in multivariate datasets. Previous correlation discovery algorithms mostly focus on finding piecewise correlations between attributes. Other research efforts, such as correlation-preserving discretization, can find strongly correlated intervals through a discretization process that preserves correlation. However, discretization-based methods suffer from fundamental problems such as information loss and crisp boundaries. In this paper, we propose a novel method to discover strongly correlated intervals in numerical datasets without discretization. We propose a hypergraph model that captures the underlying correlation structure in multivariate numerical data, together with a corresponding algorithm that discovers strongly correlated intervals from the hypergraph model. Strongly correlated intervals can be found even when the corresponding attributes are weakly correlated or uncorrelated overall. Experimental results on a health social network dataset show the effectiveness of our algorithm.
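The claim that intervals can be strongly correlated while the attributes as a whole are not can be illustrated with a toy example. The sketch below is not the hypergraph algorithm proposed in the paper; it only computes the Pearson correlation coefficient on synthetic V-shaped data, first over the full range and then restricted to an interval of one attribute. The data and the function names `pearson` and `pearson_on_interval` are assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def pearson_on_interval(xs, ys, lo, hi):
    """Pearson correlation using only the points whose x lies in [lo, hi]."""
    pairs = [(x, y) for x, y in zip(xs, ys) if lo <= x <= hi]
    sub_x = [x for x, _ in pairs]
    sub_y = [y for _, y in pairs]
    return pearson(sub_x, sub_y)


# Synthetic (hypothetical) data: y falls with x on [0, 9] and rises on [10, 19],
# so the two halves cancel each other in the global coefficient.
xs = [float(i) for i in range(20)]
ys = [abs(x - 9.5) for x in xs]

r_global = pearson(xs, ys)                      # close to zero
r_left = pearson_on_interval(xs, ys, 0, 9)      # strongly negative
r_right = pearson_on_interval(xs, ys, 10, 19)   # strongly positive

print(r_global, r_left, r_right)
```

On this data the global coefficient vanishes because the two halves mirror each other, while each half taken alone is perfectly linear. Real data would be noisier, and locating such intervals without fixing their boundaries in advance is the problem the paper addresses.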
Acknowledgment
This work is supported by the NIH grant R01GM103309. We thank Brigitte Piniewski and David Kil for their input.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, H., Dou, D., Fang, Y., Zhang, Y. (2015). Mining Strongly Correlated Intervals with Hypergraphs. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe 2015, DEXA 2015. Lecture Notes in Computer Science, vol 9262. Springer, Cham. https://doi.org/10.1007/978-3-319-22852-5_29
DOI: https://doi.org/10.1007/978-3-319-22852-5_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22851-8
Online ISBN: 978-3-319-22852-5
eBook Packages: Computer Science, Computer Science (R0)