Online transfer learning with multiple decision trees

Wen, Yimin; Qin, Yixiu; Qin, Keke; Lu, Xiaoxia; Liu, Pingshan

doi:10.1007/s13042-019-00998-3

Original Article
Published: 16 August 2019

Volume 10, pages 2941–2962, (2019)

International Journal of Machine Learning and Cybernetics

Yimin Wen ORCID: orcid.org/0000-0001-5017-3987^1,2,3,
Yixiu Qin³,
Keke Qin³,
Xiaoxia Lu⁴ &
…
Pingshan Liu¹

Abstract

Online learning techniques are widely used in fields where instances arrive one by one. In the early stage of a data stream, however, online learning models cannot achieve good classification accuracy because they have not yet collected enough instances to learn from. For example, the well-known online learning algorithm very fast decision tree (VFDT) must wait until the Hoeffding bound is satisfied before it splits a node, which leads to poor classification accuracy at the beginning of a data stream. VFDT may therefore be unsuitable for real applications that demand fast and accurate online detection. The problem becomes more serious when classifying data streams with concept drift. This paper uses transfer learning to compensate for this shortcoming of VFDT. To this end, a new decision tree method named VFDT-D is first proposed, which caches instances in its leaf nodes to handle numerical attributes and to fit into a framework of online transfer learning (OTL). A measure that considers tree path, classification accuracy, and classification confidence is then proposed to evaluate the local similarity between source and target domain classifiers. Finally, a multiple-source online transfer learning algorithm named DMOTL is proposed, which takes VFDT-D as its base classifier and uses the proposed local similarity measure to select the optimal source domain classifier to assist transfer learning. Extensive experiments on several synthetic and real-world datasets demonstrate the advantage of the proposed algorithm.
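
To illustrate why waiting for the Hoeffding bound delays splitting early in a stream, the following minimal sketch computes the standard bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), and the usual split test on the gap between the two best attributes. It is not the authors' implementation of VFDT or VFDT-D; the function names and the example values (a range R of 1, delta of 1e-7, and a gain gap of 0.05) are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R) is the range of the split heuristic, delta is the
    allowed probability of choosing the wrong attribute, and n is the
    number of instances seen at the leaf.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """Split only when the gap between the two best attributes exceeds epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


def instances_needed(gain_gap: float, value_range: float, delta: float) -> int:
    """Smallest n for which epsilon drops strictly below gain_gap."""
    exact = value_range ** 2 * math.log(1.0 / delta) / (2.0 * gain_gap ** 2)
    return math.floor(exact) + 1


if __name__ == "__main__":
    # Hypothetical values: two attributes whose gains differ by 0.05.
    n = instances_needed(gain_gap=0.05, value_range=1.0, delta=1e-7)
    print("instances needed before the leaf may split:", n)
    print("split allowed after 100 instances:",
          can_split(0.30, 0.25, 1.0, 1e-7, 100))
```

Under these assumed values, a leaf has to see a little over 3,000 instances before it may split, and until then it predicts from a single node. This is the early-stream accuracy gap that the abstract proposes to close with transfer from source domain classifiers.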

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (61866007, 61762029, 61662014, 61763007), the Natural Science Foundation of Guangxi District (2018GXNSFDA138006), the Guangxi Key Laboratory of Trusted Software (KX201721), the Collaborative Innovation Center of Cloud Computing and Big Data (YD16E12), and the Image Intelligent Processing Project of the Key Laboratory Fund (GIIP201505).

Author information

Authors and Affiliations

Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, China
Yimin Wen & Pingshan Liu
Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, China
Yimin Wen
School of Computer Science and Information Safety, Guilin University of Electronic Technology, Guilin, China
Yimin Wen, Yixiu Qin & Keke Qin
Information Engineering College, Guangzhou Huaxia Vocational College, Guangzhou, China
Xiaoxia Lu

Corresponding author

Correspondence to Yimin Wen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (RAR 1027 kb)

Supplementary material 2 (RAR 50 kb)

About this article

Cite this article

Wen, Y., Qin, Y., Qin, K. et al. Online transfer learning with multiple decision trees. Int. J. Mach. Learn. & Cyber. 10, 2941–2962 (2019). https://doi.org/10.1007/s13042-019-00998-3

Received: 08 May 2019
Accepted: 05 August 2019
Published: 16 August 2019
Issue Date: October 2019
DOI: https://doi.org/10.1007/s13042-019-00998-3
