The Tentative Research of Hydrological IoT Data Processing System Based on Apache Flink

  • Feng Ye
  • Peng Zhang
  • Cheng Hu
  • Songjie Zhu
  • Ling Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11434)


With the widespread application of sensor and IoT technology in water conservancy informatization, traditional application systems for hydrological data processing and analysis, whether based on Java EE or on pure NoSQL databases, struggle to meet the new requirements for processing and analyzing large-scale hydrological IoT stream data. Selecting a suitable big data processing platform and implementing application systems for hydrological IoT stream data require sound theoretical foundations, more experimental comparisons, effective design paradigms, and practical implementations. This paper summarizes the research status of big data in the water conservancy domain and then proposes a hydrological IoT data processing system based on Apache Flink. We use sensor data obtained from the Chuhe River as the experimental dataset and take common daily operations on hydrological data as examples. The experimental results show that the processing capability of the proposed system is far superior to that of traditional multi-tier architecture systems based on Java EE or pure NoSQL databases, making it a viable solution for water conservancy informatization.


Apache Flink; Stream data; Water conservancy informatization; IoT



This work is partly supported by the 2018 Jiangsu Province Key Research and Development Program (Modern Agriculture) Project under Grant No. BE2018301, the 2017 Jiangsu Province Postdoctoral Research Funding Project under Grant No. 1701020C, the 2017 Six Talent Peaks Endorsement Project of Jiangsu under Grant No. XYDXX-078, and the Fundamental Research Funds for the Central Universities under Grant No. 2013B01814.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Feng Ye (1, 2)
  • Peng Zhang (3)
  • Cheng Hu (1)
  • Songjie Zhu (1)
  • Ling Li (1)
  1. College of Computer and Information, Hohai University, Nanjing, China
  2. Postdoctoral Centre, Nanjing Longyuan Micro-Electronic Company, Nanjing, China
  3. Jiangsu Province Water Resources Department, Nanjing, China
