Abstract
Distributed systems, exemplified by Hadoop, are becoming an essential component of large-scale data mining systems. This paper carries out a data mining task on a Hadoop distributed system: its main purpose is to build a distributed cluster computing environment with Hadoop and to perform data mining tasks in that environment. We study the Hadoop system architecture, examining in depth the distributed file system HDFS and the principle and implementation of the MapReduce parallel programming model. We then establish systematic control of the data mining process, adapt traditional data mining algorithms to the MapReduce programming model, study their implementation on the Hadoop platform, and analyze chiefly their execution efficiency and scalability.
Copyright information
© 2014 Springer India
About this paper
Cite this paper
Guo, J., Li, Y., Du, L., Zhao, G., Jiang, J. (2014). Research on Distributed Data Mining System Based on Hadoop Platform. In: Patnaik, S., Li, X. (eds) Proceedings of International Conference on Computer Science and Information Technology. Advances in Intelligent Systems and Computing, vol 255. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1759-6_72
DOI: https://doi.org/10.1007/978-81-322-1759-6_72
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1758-9
Online ISBN: 978-81-322-1759-6
eBook Packages: Engineering, Engineering (R0)