Abstract
Data mining, and association rule mining in particular, is an important technology for handling the large volume of data accumulated in the medical field and for extracting value from it. The traditional Apriori algorithm scans the database repeatedly, which takes too long, and generates too many candidate itemsets. To address these problems for large medical data sets, a method based on numeric mapping and sorting of itemsets is proposed. Supersets are generated using a base model and a generation model, which improves the efficiency of superset generation and pruning. The improved algorithm is ported to the open-source Hadoop platform and combined with the MapReduce framework, introducing parallelism based on database partitioning. Experimental results show that the approach addresses the redundancy of large-scale data sets and gives the Apriori algorithm good parallel scalability. Finally, an example is given to demonstrate the feasibility of the improved algorithm.
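The ideas named in the abstract can be illustrated with a small sketch. The following Python code is not the authors' implementation: the abstract does not specify the mapping scheme, the base and generation models, or the Hadoop job layout. It assumes a frequency-ranked integer mapping, the standard sorted prefix join with subset pruning as a stand-in for superset generation, and a two-pass partition scheme that is simulated in-process rather than run on MapReduce. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def encode(transactions):
    """Map items to integers (most frequent first) and sort each transaction."""
    counts = Counter(item for t in transactions for item in set(t))
    # Ties are broken by item name so the mapping is deterministic.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    item_id = {item: i for i, item in enumerate(ranked)}
    encoded = [tuple(sorted(item_id[x] for x in set(t))) for t in transactions]
    return encoded, item_id


def generate_candidates(frequent, k):
    """Join sorted (k-1)-itemsets that share a (k-2)-prefix, then prune."""
    frequent = sorted(frequent)
    known = set(frequent)
    candidates = []
    for i, a in enumerate(frequent):
        for b in frequent[i + 1:]:
            if a[:-1] != b[:-1]:
                break  # sorted order: no later itemset shares this prefix
            cand = a + (b[-1],)
            # Keep the candidate only if every (k-1)-subset is frequent.
            if all(sub in known for sub in combinations(cand, k - 1)):
                candidates.append(cand)
    return candidates


def apriori(encoded, min_count):
    """Return {itemset: count} for itemsets with count >= min_count."""
    counts = Counter((i,) for t in encoded for i in t)
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)
    k = 2
    while frequent:
        candidates = generate_candidates(frequent, k)
        counts = Counter()
        for t in encoded:
            items = set(t)
            counts.update(c for c in candidates if items.issuperset(c))
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


def partitioned_apriori(encoded, min_support, n_parts):
    """Two passes over partitions: local mining, then a global recount."""
    parts = [encoded[i::n_parts] for i in range(n_parts)]
    # Pass 1 ("map"): each partition reports its locally frequent itemsets.
    candidates = set()
    for part in parts:
        local_min = max(1, math.ceil(min_support * len(part)))
        candidates.update(apriori(part, local_min))
    # Pass 2 ("reduce"): sum the per-partition counts of every candidate.
    totals = Counter()
    for part in parts:
        for t in part:
            items = set(t)
            totals.update(c for c in candidates if items.issuperset(c))
    min_count = math.ceil(min_support * len(encoded))
    return {s: c for s, c in totals.items() if c >= min_count}
```

Calling `encode` on a list of transactions and passing the result to `partitioned_apriori` with a relative support threshold should return the same itemsets as `apriori` with the matching absolute count, because any globally frequent itemset must be locally frequent in at least one partition. On a real cluster the two passes would presumably correspond to two MapReduce jobs, but that layout is an assumption here.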
Acknowledgments
This work was supported by the National Natural Science Foundation of China ([2018]61741124) and the Science Planning Project of Guizhou Province (Guizhou Science and Technology Cooperation Platform Talent [2018] No. 5781). We also sincerely thank the anonymous reviewers for their valuable feedback.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kong, G., Tian, H., Wu, Y., Wei, Q. (2020). Improvement of Association Rule Algorithm Based on Hadoop for Medical Data. In: Qin, P., Wang, H., Sun, G., Lu, Z. (eds) Data Science. ICPCSEE 2020. Communications in Computer and Information Science, vol 1258. Springer, Singapore. https://doi.org/10.1007/978-981-15-7984-4_38
DOI: https://doi.org/10.1007/978-981-15-7984-4_38
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7983-7
Online ISBN: 978-981-15-7984-4
eBook Packages: Computer Science, Computer Science (R0)