Abstract
Many kinds of algorithms have been developed to discover relationships between two attribute groups (e.g., association rule discovery algorithms, functional dependency discovery algorithms, and correlation tests). Of these, only correlation tests take the measurement scales of the attribute groups into account. Measurement scales determine whether order or distance information should be considered in the relationship discovery process, and this information limits the forms a legitimate relationship between two attribute groups can take. Because correlation tests consider this information, the relationships they discover tend not to be spurious. Furthermore, the result of a correlation test can be evaluated empirically by measuring its significance. Often, the appropriate correlation test for an attribute group pair must be selected manually, because the information required to identify it (e.g., the measurement scale of the attribute groups) is not available in the database. However, this information can be inferred from the system catalog and from analysis of the values of the attribute groups. In this paper, we propose a (semi-)automated correlation test identification method that infers the information needed to identify appropriate tests and measures the correlation between attribute group pairs.
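As a rough illustration of how a measurement scale can drive test selection, the sketch below maps the weaker scale of an attribute pair to a commonly used test and computes Pearson and Spearman coefficients in plain Python. The scale-to-test mapping follows general statistics textbook convention and is an assumption here, not the identification method proposed in the paper; the names `Scale`, `choose_test`, `pearson`, `ranks`, and `spearman` are hypothetical.

```python
from enum import Enum
from math import sqrt


class Scale(Enum):
    """Measurement scales, ordered from least to most information."""
    NOMINAL = 1   # categories only
    ORDINAL = 2   # order is meaningful
    INTERVAL = 3  # order and distance are meaningful


def choose_test(scale_a, scale_b):
    """Pick a test from the weaker of the two scales (assumed convention)."""
    weakest = min(scale_a, scale_b, key=lambda s: s.value)
    if weakest is Scale.NOMINAL:
        return "chi-square test of independence"
    if weakest is Scale.ORDINAL:
        return "Spearman rank correlation"
    return "Pearson product-moment correlation"


def pearson(xs, ys):
    """Pearson correlation: uses distance information."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson applied to ranks, so only order matters."""
    return pearson(ranks(xs), ranks(ys))
```

For a monotone but nonlinear pair such as `[1, 2, 3, 4]` and `[1, 4, 9, 16]`, `spearman` reaches 1 while `pearson` stays below 1, which shows why the choice between order and distance information changes the result.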
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cecil, C.E.H., Chiang, R.H.L., Lim, EP. (1999). A Heuristic Method for Correlating Attribute Group Pairs in Data Mining. In: Kambayashi, Y., Lee, D.L., Lim, EP., Mohania, M.K., Masunaga, Y. (eds) Advances in Database Technologies. ER 1998. Lecture Notes in Computer Science, vol 1552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49121-7_3
DOI: https://doi.org/10.1007/978-3-540-49121-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65690-6
Online ISBN: 978-3-540-49121-7
eBook Packages: Springer Book Archive