Abstract
On the Internet, the huge amounts of data being generated are distributed, heterogeneous, dynamic, and increasingly complex, and existing centralized data mining methods cannot meet application requirements. To address these problems, we propose a cloud-computing-based Web data mining method in which massive data sets and mining tasks are decomposed and processed in parallel on multiple computers. We use the open platform Hadoop to implement a parallel association rule mining algorithm based on Apriori, and we test and verify the efficiency of the system. The paper also proposes a design approach that migrates computation to storage: the computation is carried out on the local storage nodes, which avoids transmitting large amounts of data over the network and does not consume much bandwidth.
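The idea of counting itemsets where the data lives and shipping only small partial counts can be illustrated with a minimal sketch. It is not the paper's implementation and does not use the Hadoop API: it simulates the map and reduce steps in a single Python process, and the names `map_count`, `reduce_counts`, `next_candidates`, and `parallel_apriori` are hypothetical. Each inner list of `partitions` stands in for the transactions held on one storage node.

```python
from collections import Counter
from itertools import combinations


def map_count(partition, candidates):
    """Map step (assumed to run on the node holding the partition):
    count how often each candidate itemset occurs in local transactions."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:
                counts[cand] += 1
    return counts


def reduce_counts(partial_counts, min_support):
    """Reduce step: sum the per-node counts and keep itemsets whose
    global count reaches the minimum support."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


def next_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets and prune any candidate
    that has an infrequent (k-1)-subset (the Apriori property)."""
    prev = set(frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in prev for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def parallel_apriori(partitions, min_support):
    """Driver loop: one simulated map/reduce round per itemset size."""
    # In a real cluster the item universe would come from a first counting job.
    items = {item for part in partitions for t in part for item in t}
    candidates = {frozenset([item]) for item in items}
    result = {}
    k = 1
    while candidates:
        # Only the small Counter objects would cross the network, not the data.
        partials = [map_count(part, candidates) for part in partitions]
        frequent = reduce_counts(partials, min_support)
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result


if __name__ == "__main__":
    # Two hypothetical storage nodes, each holding a few transactions.
    partitions = [
        [["a", "b", "c"], ["a", "b"]],
        [["a", "c"], ["b", "c"], ["a", "b", "c"]],
    ]
    for itemset, count in sorted(
        parallel_apriori(partitions, min_support=3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), count)
```

In this sketch the transactions never leave their partition; only the per-partition counts are combined, which is the property the abstract attributes to moving computation to the storage nodes.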
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ruan, S. (2012). Based on Cloud-Computing’s Web Data Mining. In: Zhao, M., Sha, J. (eds) Communications and Information Processing. Communications in Computer and Information Science, vol 289. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31968-6_29
DOI: https://doi.org/10.1007/978-3-642-31968-6_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31967-9
Online ISBN: 978-3-642-31968-6
eBook Packages: Computer Science, Computer Science (R0)