Research and Application of Spark Platform on Big Data Processing in Intelligent Agriculture of Jilin Province

  • Siwei Fu
  • Guifen Chen (corresponding author)
  • Shan Zhao
  • Enze Xiao
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 546)


To meet the demand for real-time processing of massive data in the intelligent agriculture of Jilin Province, this paper studies big data processing on the Spark platform, using real-time data acquired through a monitoring platform. The study first ran a performance comparison experiment between the Hadoop and Spark data processing platforms, then used a Spark distributed cluster computing platform to process the big data of the monitoring area in real time. The experimental results show that the Spark platform is 11.4 times faster than the Hadoop platform when the data size reaches 100 million. Processing the big data of the intelligent agricultural monitoring network in real time on Spark also relies on in-memory computation, which reduces I/O overhead, and it delivers results faster and more accurately. The research results provide strong support for implementing precision agriculture technology in intelligent agriculture.
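The I/O argument in the abstract can be illustrated with a small sketch. It is not Spark code and not the authors' experiment: it uses only the Python standard library, and every name in it (`write_readings`, `read_readings`, `stats_rereading_disk`, `stats_in_memory`, `compare`) is hypothetical. It assumes a multi-pass statistic (mean and variance) over a file of sensor readings and counts how often the file is read when each pass reloads it from disk, as chained disk-based jobs would, versus when the data is loaded once and kept in memory.

```python
import os
import tempfile


def write_readings(path, values):
    """Write one sensor reading per line to a text file."""
    with open(path, "w") as f:
        for v in values:
            f.write(f"{v}\n")


def read_readings(path, counter):
    """Read every reading back from disk and record that a read pass happened."""
    counter["disk_reads"] += 1
    with open(path) as f:
        return [float(line) for line in f]


def stats_rereading_disk(path, counter):
    """Reload the file for every pass (illustrative stand-in for disk-based jobs)."""
    total = sum(read_readings(path, counter))
    count = len(read_readings(path, counter))
    mean = total / count
    var = sum((x - mean) ** 2 for x in read_readings(path, counter)) / count
    return mean, var


def stats_in_memory(path, counter):
    """Load the file once, then run every pass over the cached list."""
    data = read_readings(path, counter)
    mean = sum(data) / len(data)
    var = sum((x - mean) ** 2 for x in data) / len(data)
    return mean, var


def compare(values):
    """Return ((mean, var), disk_reads) for the disk-rereading and in-memory strategies."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "readings.txt")
        write_readings(path, values)
        disk_counter = {"disk_reads": 0}
        mem_counter = {"disk_reads": 0}
        disk_result = stats_rereading_disk(path, disk_counter)
        mem_result = stats_in_memory(path, mem_counter)
    return (disk_result, disk_counter["disk_reads"]), (mem_result, mem_counter["disk_reads"])
```

Calling `compare` on any list of readings returns the same statistics for both strategies, but the in-memory variant touches the file once while the rereading variant touches it once per pass. The sketch says nothing about the size of the speedup reported in the paper; it only shows why keeping intermediate data in memory removes repeated reads.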


Keywords: Spark; Big data processing; MapReduce; Intelligent Agriculture in Jilin Province



This work was funded by the China Spark Program (2015GA660004), “Integration and demonstration of corn precise operation technology based on Internet of Things”.



Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  1. Jilin Agricultural University, Changchun, China
