A parallel FP-growth algorithm on World Ocean Atlas data with multi-core CPU

  • Yu Jiang
  • Minghao Zhao
  • Chengquan Hu
  • Lili He
  • Hongtao Bai
  • Jin Wang

Abstract

Given the complexity of ocean data, this paper applies a parallel association rule mining algorithm to explore the correlations and regularities among oxygen, temperature, phosphate, nitrate and silicate in the ocean. After the marine data are interpolated, the parallel FP-growth algorithm is used to mine them, and the resulting frequent itemsets and association rules are briefly analyzed. The relationship between parallel efficiency and the number of CPU cores is analyzed using datasets of different scales. The experimental results indicate that the acceleration is best when each thread processes 200,000–300,000 data records, yielding a performance improvement of more than 1.2 times.
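As a rough illustration of the mining step named in the abstract, the following is a minimal sequential FP-growth sketch in Python. It is not the authors' parallel implementation, whose details the abstract does not give. The names `Node`, `build_tree`, `fp_growth` and `rows`, the item labels such as `temp=high`, and the threshold `min_count` are all hypothetical, and the toy rows are invented for illustration rather than taken from WOA13.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_count):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> list of nodes) and the item frequencies.
    """
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_count}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq


def fp_growth(transactions, min_count, suffix=frozenset()):
    """Yield (itemset, support_count) for every frequent itemset."""
    header, freq = build_tree(transactions, min_count)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, freq[item]
        # Conditional pattern base: the prefix path of each node holding `item`.
        base = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        yield from fp_growth(base, min_count, itemset)


# Hypothetical discretized records (illustrative values, not WOA13 data).
rows = [
    ["temp=high", "oxy=low", "phos=low"],
    ["temp=high", "oxy=low"],
    ["temp=low", "oxy=high", "phos=high"],
    ["temp=high", "oxy=low", "phos=low"],
    ["temp=low", "oxy=high"],
]
result = dict(fp_growth([(r, 1) for r in rows], min_count=3))

# Confidence of the rule temp=high -> oxy=low: support(both) / support(antecedent).
confidence = (result[frozenset({"temp=high", "oxy=low"})]
              / result[frozenset({"temp=high"})])
```

Each conditional pattern base is mined independently of the others, so one plausible way to parallelize the sketch would be to hand different header-table items to different threads. Whether the paper partitions the work this way is an assumption here.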

Keywords

Association rules mining, FP-growth, WOA13, Parallel algorithm

Notes

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (51679105, 61672261, 51409117) and by the Thirteenth Five-Year science and technology research projects of the Jilin Province Department of Education ([2016] No. 432, [2017] No. JJKH20170804KJ).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Yu Jiang (1, 2)
  • Minghao Zhao (1, 2)
  • Chengquan Hu (1, 2)
  • Lili He (1, 2)
  • Hongtao Bai (1, 2)
  • Jin Wang (3, 4)

  1. College of Computer Science and Technology, Jilin University, Changchun, China
  2. Key Laboratory of Symbolic Computation and Knowledge Engineering, Jilin University, Changchun, China
  3. School of Information Engineering, Yangzhou University, Yangzhou, China
  4. Key Lab of Broadband Wireless Communication and Sensor Network Technology, Nanjing University of Posts and Telecommunications, Ministry of Education, Nanjing, China
