Abstract
The typical model for mining association rules relies on three measures: support, confidence, and interest. In this model, the related parameters are usually chosen by experience; consequently, the number of useful rules is hard to estimate. If the number is too large, the meaningful rules cannot be extracted effectively. This paper analyzes the meanings of the parameters and uses regression to construct a variety of equations relating the number of rules to the parameters. The multiple correlation coefficient is used to assess how well each equation fits, and significance tests are used to check whether the coefficients of the parameters differ significantly from zero. The regression equation with the larger multiple correlation coefficient is chosen as the best-fitting equation, and a preferable regression equation is obtained experimentally in this way. With the selected equation, we can predict the number of rules under given parameters, further optimize the choice of the three parameters, and determine their ranges of values.
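The selection procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual equations or data, so everything specific here is an assumption made for illustration: the two candidate forms (linear and reciprocal in the three parameters), the synthetic data, and the helper names `solve_inverse`, `fit_ols`, and `predict`. The sketch fits each candidate by least squares, computes the multiple correlation coefficient R, and reports t statistics for the coefficients.

```python
import math
import random


def solve_inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_ols(features, y):
    """Least-squares fit with intercept; returns (coefficients, R, t statistics)."""
    X = [[1.0] + list(row) for row in features]
    n, p = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    inv = solve_inverse(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r_multiple = math.sqrt(max(0.0, 1.0 - rss / tss))  # multiple correlation coefficient
    sigma2 = rss / (n - p)                             # residual variance estimate
    t_stats = [b / math.sqrt(sigma2 * inv[i][i]) for i, b in enumerate(beta)]
    return beta, r_multiple, t_stats


def predict(beta, feature_row):
    """Predicted number of rules for one row of (already transformed) predictors."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], feature_row))


# Synthetic observations (NOT data from the paper): thresholds and rule counts.
rng = random.Random(0)
params, counts = [], []
for _ in range(60):
    support = rng.uniform(0.05, 0.5)
    confidence = rng.uniform(0.5, 0.95)
    interest = rng.uniform(1.0, 3.0)
    # Hypothetical generating process, chosen only so the example has structure.
    rules = 40.0 / support + 30.0 / confidence + 50.0 / interest + rng.gauss(0.0, 10.0)
    params.append((support, confidence, interest))
    counts.append(rules)

# Candidate 1: rule count linear in the parameters.
beta_lin, r_lin, t_lin = fit_ols(params, counts)
# Candidate 2: rule count linear in the reciprocals of the parameters.
reciprocal = [(1.0 / s, 1.0 / c, 1.0 / k) for s, c, k in params]
beta_rec, r_rec, t_rec = fit_ols(reciprocal, counts)

# Keep the candidate with the larger multiple correlation coefficient.
best = "reciprocal" if r_rec > r_lin else "linear"
print("R (linear):", round(r_lin, 4), "R (reciprocal):", round(r_rec, 4), "chosen:", best)
# |t| well above about 2 suggests a coefficient differs from zero at roughly the 5% level.
print("t statistics (reciprocal):", [round(t, 2) for t in t_rec])
print("predicted rules at (0.1, 0.8, 2.0):", round(predict(beta_rec, (10.0, 1.25, 0.5)), 1))
```

The threshold of about 2 on |t| is only a rule of thumb for this sample size; an exact test would compare each statistic with the t distribution on n - p degrees of freedom.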
Author information
Additional information
This work was supported by the National Natural Science Foundation of China (No. J07240003, No. 60773084, No. 60603023) and National Research Fund for the Doctoral Program of Higher Education of China (No. 20070151009).
Wei-Guo Yi graduated from Northeast Normal University, PRC in 2002. He received the M. Sc. degree from Northeast Normal University in 2005. He is a lecturer at Dalian Jiaotong University, PRC. Currently, he is a Ph.D. candidate at the School of Information Science and Technology, Dalian Maritime University, Dalian, PRC.
His research interests include data mining and pattern recognition.
Ming-Yu Lu graduated from Heilongjiang University of Computer Software, PRC in 1985. He received the M. Sc. degree in 1988 and the Ph.D. degree in 2003, both from Tsinghua University, PRC. He is a senior member of the China Computer Federation and a professor at Dalian Maritime University, PRC.
His research interests include data mining, text mining, and Web mining.
Zhi Liu graduated from Dalian University, PRC in 1995. She received the M. Sc. degree from Dalian University of Technology, PRC in 1999. She is an associate professor at Dalian Maritime University, PRC. Currently, she is a Ph.D. candidate at the School of Information Science and Technology, Dalian Maritime University.
Her research interests include data mining and artificial intelligence.
About this article
Cite this article
Yi, WG., Lu, MY. & Liu, Z. Regression analysis of the number of association rules. Int. J. Autom. Comput. 8, 78–82 (2011). https://doi.org/10.1007/s11633-010-0557-x
DOI: https://doi.org/10.1007/s11633-010-0557-x