Applied Intelligence, Volume 41, Issue 1, pp 30–41

Ensemble learning from multiple information sources via label propagation and consensus

Article

Abstract

Many applications must learn from multiple information sources, some labeled and some unlabeled, whose information may be jointly beneficial but cannot be integrated into a single information source for learning. In this paper, we propose an ensemble learning method for such labeled and unlabeled sources. We first present two label propagation methods that infer the labels of training objects in unlabeled sources by making full use of class label information from labeled sources and internal structure information from unlabeled sources; we refer to these processes as global consensus and local consensus, respectively. We then predict the labels of testing objects using an ensemble learning model over the multiple information sources. Experimental results show that our method outperforms two baseline methods. Moreover, our method scales better to large information sources and is more robust to noisy data in labeled sources.
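To make the two ingredients named above more concrete, the following is a minimal sketch of generic graph-based label propagation followed by a majority-vote ensemble. It is not the algorithm proposed in this paper: the update rule F <- alpha * S F + (1 - alpha) * Y is a standard textbook form of label spreading, and the function names, the parameters `alpha` and `n_iter`, and the toy graph are all assumptions introduced here for illustration only.

```python
from collections import Counter
from math import sqrt


def propagate_labels(weights, seeds, n_classes, alpha=0.9, n_iter=50):
    """Spread seed labels over a similarity graph (illustrative sketch).

    weights : symmetric n x n list of non-negative similarities
    seeds   : dict {node index: class index} for nodes with a known label
    Returns one predicted class index per node.
    """
    n = len(weights)
    degree = [sum(row) or 1.0 for row in weights]  # guard isolated nodes
    # Symmetrically normalised similarity S = D^(-1/2) W D^(-1/2).
    s = [[weights[i][j] / sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    # Y holds one-hot scores for seed nodes; unlabeled rows stay all zero.
    y = [[0.0] * n_classes for _ in range(n)]
    for node, cls in seeds.items():
        y[node][cls] = 1.0
    f = [row[:] for row in y]
    for _ in range(n_iter):
        # F <- alpha * S F + (1 - alpha) * Y
        f = [[alpha * sum(s[i][k] * f[k][c] for k in range(n))
              + (1 - alpha) * y[i][c]
              for c in range(n_classes)]
             for i in range(n)]
    return [max(range(n_classes), key=lambda c: f[i][c]) for i in range(n)]


def majority_vote(predictions):
    """Combine per-source label lists by a plain majority vote per object."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by a weak edge 2-3.
toy_weights = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
# Only nodes 0 and 5 carry labels; the rest are inferred from graph structure.
toy_labels = propagate_labels(toy_weights, seeds={0: 0, 5: 1}, n_classes=2)
```

In this toy setting each triangle inherits the label of its single seed node, and `majority_vote` would then merge the label lists produced for several sources into one prediction per object. How the paper itself weights sources or builds its graphs is not stated in the abstract, so those choices are left open here.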

Keywords

Multiple information sources; Ensemble learning; Label propagation; Consensus

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, P.R. China
  2. Department of Computer Science and Engineering, Minnan Normal University, Zhangzhou, P.R. China
  3. Department of Computer Science, University of Vermont, Burlington, USA
