Neural Computing and Applications

Volume 29, Issue 7, pp 577–593

A novel hybrid data-driven model for multi-input single-output system simulation

  • Guangyuan Kan
  • Xiaoyan He
  • Jiren Li
  • Liuqian Ding
  • Dawei Zhang
  • Tianjie Lei
  • Yang Hong
  • Ke Liang
  • Depeng Zuo
  • Zhenxin Bao
  • Mengjie Zhang
Original Article


Artificial neural network (ANN)-based data-driven models are effective and robust tools for simulating multi-input single-output (MISO) systems. However, several problems degrade ANN performance: topology design is difficult, parameter training is hard, and simulation accuracy must be balanced against generalization capability. To overcome these problems, this paper proposes a novel hybrid data-driven model named KEK. The KEK model couples the K-means method for input clustering, an ensemble of back-propagation (BP) ANNs for output estimation, and the K-nearest neighbor (KNN) method for output error estimation. A novel calibration method is also proposed for automatic and global calibration of the KEK model. To compare model performance, the ANN model, the KNN model, and the proposed KEK model were applied to two tasks: simulation of the Peak benchmark function and daily total load forecasting for a real-world electricity system. The test results indicate that the KEK model outperformed the other two models and showed very good simulation accuracy and generalization capability in MISO system simulation tasks.
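The three-part coupling described above can be illustrated with a minimal pure-Python sketch. It is one plausible reading of the abstract, not the authors' implementation: the assumptions are that each K-means cluster gets its own small ensemble of BP networks, that the ensemble output is the mean of its members, and that the KNN step adds the average training residual of the nearest training inputs. All function names (`kmeans`, `train_mlp`, `fit_kek`, `predict_kek`) and hyperparameters are hypothetical, and the paper's calibration method is not reproduced.

```python
import math
import random

def dist2(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(cents, x):
    return min(range(len(cents)), key=lambda j: dist2(cents[j], x))

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd iterations; returns the list of centroids."""
    cents = [list(x) for x in random.Random(seed).sample(X, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in X:
            groups[nearest(cents, x)].append(x)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                cents[j] = [sum(c) / len(g) for c in zip(*g)]
    return cents

def mlp_forward(net, x):
    W, v = net  # W: hidden weights (+bias last), v: output weights (+bias last)
    h = [math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, x))) for w in W]
    return h, v[-1] + sum(vj * hj for vj, hj in zip(v, h))

def train_mlp(X, y, hidden=4, epochs=200, lr=0.05, seed=0):
    """One-hidden-layer tanh network trained by stochastic back-propagation."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    v = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    for _ in range(epochs):
        for x, t in zip(X, y):
            h, out = mlp_forward((W, v), x)
            err = out - t  # gradient of 0.5 * (out - t)^2 w.r.t. out
            for j in range(hidden):
                grad_h = err * v[j] * (1 - h[j] ** 2)
                v[j] -= lr * err * h[j]
                for i in range(n_in):
                    W[j][i] -= lr * grad_h * x[i]
                W[j][-1] -= lr * grad_h
            v[-1] -= lr * err
    return W, v

def ann_estimate(model, x):
    """Mean output of the ensemble attached to the cluster nearest to x."""
    members = model["ens"][nearest(model["cents"], x)]
    return sum(mlp_forward(m, x)[1] for m in members) / len(members)

def fit_kek(X, y, k=2, n_members=3, n_neighbors=3):
    cents = kmeans(X, k)
    labels = [nearest(cents, x) for x in X]
    ens = []
    for j in range(k):
        Xj = [x for x, l in zip(X, labels) if l == j] or X  # fallback if empty
        yj = [t for t, l in zip(y, labels) if l == j] or y
        ens.append([train_mlp(Xj, yj, seed=s) for s in range(n_members)])
    model = {"cents": cents, "ens": ens, "X": X, "k": n_neighbors}
    # Training residuals are what the KNN step later averages (assumption).
    model["resid"] = [t - ann_estimate(model, x) for x, t in zip(X, y)]
    return model

def predict_kek(model, x):
    order = sorted(range(len(model["X"])), key=lambda i: dist2(model["X"][i], x))
    idx = order[:model["k"]]
    knn_error = sum(model["resid"][i] for i in idx) / len(idx)
    return ann_estimate(model, x) + knn_error

# Toy two-input, single-output data on a 10 x 10 grid (illustrative only).
X = [[a / 10, b / 10] for a in range(10) for b in range(10)]
y = [math.sin(3 * a) + b * b for a, b in X]
model = fit_kek(X, y)
print(predict_kek(model, [0.35, 0.65]))
```

In this sketch the clustering lets each ensemble specialize on a region of the input space, while the KNN correction recovers systematic error the networks leave behind; whether the original model combines the parts in exactly this way is not stated in the abstract.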


MISO system; Data-driven model; K-means method; Ensemble BPANN; KNN method; KEK model



This research was funded by the IWHR Research and Development Support Program (JZ0145B052016); the IWHR Scientific Research Projects of Outstanding Young Scientists, “Research and application on the fast global optimization method for the Xinanjiang model parameters based on the high performance heterogeneous computing” (No. KY1605); the Specific Research of China Institute of Water Resources and Hydropower Research (Grant No. Fangji 1240); the Third Sub-Project: Flood Forecasting, Controlling and Flood Prevention Aided Software Development – Flood Control Early Warning Communication System and Flood Forecasting, Controlling and Flood Prevention Aided Software Development for Poyang Lake Area of Jiangxi Province (0628-136006104242, JZ0205A432013, SLXMB200902); Study of distributed flood risk forecast model and technology based on multi-source data integration and hydrometeorological coupling system (2013CB036400); the IWHR application project of multi-source precipitation fusion and soil moisture remote sensing assimilation; the NNSF of China, Numerical Simulation Technology of Flash Flood based on Godunov Scheme and Its Mechanism Study by Experiment (No. 51509263); and the China Postdoctoral Science Foundation (Grant No. 2016M591214). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.



Copyright information

© The Natural Computing Applications Forum 2016

Authors and Affiliations

  • Guangyuan Kan (1, 2)
  • Xiaoyan He (1)
  • Jiren Li (1)
  • Liuqian Ding (1)
  • Dawei Zhang (1)
  • Tianjie Lei (1)
  • Yang Hong (2, 3)
  • Ke Liang (4)
  • Depeng Zuo (5)
  • Zhenxin Bao (6)
  • Mengjie Zhang (7)
  1. State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, Research Center on Flood and Drought Disaster Reduction of the Ministry of Water Resources, China Institute of Water Resources and Hydropower Research, Beijing, People’s Republic of China
  2. State Key Laboratory of Hydroscience and Engineering, Department of Hydraulic Engineering, Tsinghua University, Beijing, People’s Republic of China
  3. Department of Civil Engineering and Environmental Science, University of Oklahoma, Norman, USA
  4. College of Hydrology and Water Resources, Hohai University, Nanjing, People’s Republic of China
  5. College of Water Sciences, Beijing Normal University, Beijing, People’s Republic of China
  6. Nanjing Hydraulic Research Institute, Nanjing, People’s Republic of China
  7. State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, Department of Water Resources (DWR), China Institute of Water Resources and Hydropower Research, Beijing, People’s Republic of China
