pp 1–15

Effective learning model of user classification based on ensemble learning algorithms

  • Qunsheng Ruan
  • Qingfeng Wu (corresponding author)
  • Yingdong Wang
  • Xiling Liu
  • Fengyu Miao


To help the electric-power industry understand its users accurately, this paper proposes a hybrid learning model, based on ensemble learning algorithms, for recognizing users who are sensitive to electricity charges. Working from the big-data set provided by the CCF competition sponsor in China, and using tools and algorithms such as JieBa and SFFS, we extract key features from the data set and construct a portrait of users who pay close attention to their electricity charges. We then investigate machine learning algorithms and the strategy-selection model associated with them, and show at the theoretical level that a hybrid learning model combining several ensemble learning algorithms can substantially improve classification accuracy. The paper describes the implementation of the hybrid learning model in detail. Finally, the hybrid learning model, named Stacking, is realized and yields better performance than state-of-the-art competitors. The experimental results indicate that Stacking achieves both high precision and high recall, at 0.8 and 0.85 respectively, and an F1 score of 0.823.
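The stacking approach described above — several ensemble learners whose out-of-fold predictions feed a meta-learner — can be sketched with scikit-learn. This is a minimal illustration, not the authors' exact pipeline: the base learners (random forest, gradient boosting), the logistic-regression meta-learner, and the synthetic imbalanced data standing in for the electricity-user features are all assumptions.

```python
# Minimal stacking sketch: two ensemble base learners feed a meta-learner.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data: ~20% "charge-sensitive" users (illustrative only).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gbdt", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # meta-learner trains on out-of-fold predictions, avoiding leakage
)
stack.fit(X_tr, y_tr)
pred = stack.predict(X_te)
print(f"precision={precision_score(y_te, pred):.3f} "
      f"recall={recall_score(y_te, pred):.3f} "
      f"f1={f1_score(y_te, pred):.3f}")
```

Note that the reported F1 score is consistent with the reported precision and recall: 2 x 0.8 x 0.85 / (0.8 + 0.85) ≈ 0.824.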


Keywords: User classification · User portrait · Machine learning algorithms · Hybrid learning model





This work has been supported in part by the Key Project of the National Key R&D Program (No. 2017YFC17003303), the National Natural Science Foundation of China (No. 61402387), the Science and Technology Guiding Project of Fujian Province of China (Nos. 2015H0037, 2016H0035), the Natural Science Foundation of Fujian Province, China (Grant Nos. 2017J01773, 2018J01555), the Educational Middle and Youth Foundation of Fujian Province, China (Grant No. JAT160537), and the research program of the normal university (Grant Nos. 2016Z06, 2016Z03). The authors would like to thank the editors and reviewers for their valuable comments and suggestions.



Copyright information

© Springer-Verlag GmbH Austria, part of Springer Nature 2018

Authors and Affiliations

  • Qunsheng Ruan (1)
  • Qingfeng Wu (1, corresponding author)
  • Yingdong Wang (1)
  • Xiling Liu (2)
  • Fengyu Miao (2)

  1. Software School, Xiamen University, Xiamen, China
  2. School of Information, Mechanical and Electrical Engineering, Ningde Normal University, Ningde, China
