A Distributed Data Mining System Framework for Mobile Internet Access Log Based on Hadoop

  • Yunliang Jiang (Email author)
  • Jiangang Yang
  • Liang Tang
  • Yong Liu
  • Xiaoming Zhao
  • Xiulan Hao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8971)

Abstract

Because of the popularity of mobile phones and the development of mobile networks, mobile data is growing explosively, and mobile data mining is attracting more and more attention. However, single-node data mining platforms are unable to store and analyze such massive data. Building on cloud computing technology, we present a distributed data mining framework based on Hadoop. We then describe the implementation of this system framework and process mobile internet access logs on a Hadoop cluster. Comparative tests show that this distributed system framework is highly efficient for processing huge-scale datasets.

Keywords

Cloud computing · Hadoop · Hive · Datamine · Mobile phone web log

Notes

Acknowledgments

This work was partly supported by National Natural Science Foundation of China (61370173, 61173123), Zhejiang Provincial Natural Science Foundation of China (R1090244, Z1101243, Y12F020061), Zhejiang Province Community Technology Research Projects (2011C23132), “Information Processing and Automation Technology” Key Subject Open Fund of Zhejiang Province (201100803), and Science Foundation of Chinese University (2012FZA5013). We are grateful to the anonymous referees for their insightful comments and suggestions, which clarified the presentation.

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Yunliang Jiang (1, 3), Email author
  • Jiangang Yang (3)
  • Liang Tang (4)
  • Yong Liu (2)
  • Xiaoming Zhao (5)
  • Xiulan Hao (1)

  1. School of Information Engineering, Huzhou University, Huzhou, China
  2. National Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, China
  3. School of Computer Science, Hangzhou Dianzi University, Hangzhou, China
  4. China Ship Development and Design Center, Wuhan, China
  5. School of Mathematics and Information Engineering, Taizhou University, Linhai, China
