Transfer learning for concept drifting data streams in heterogeneous environments

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Learning in non-stationary environments remains challenging because the underlying probability distributions are dynamic and unknown. The problem becomes even more difficult when labeled data are scarce in the domain of interest, which makes labeled data from a related but different domain highly valuable. This paper addresses streaming-data classification and introduces a heterogeneous unsupervised domain adaptation method. To handle the uncertainty caused by the distribution discrepancy and concept drift, the proposed method prioritizes the target-domain samples with the highest uncertainty, as these indicate changes in the data distribution. It employs fuzzy-based feature-level adaptation and optimizes its parameters through accelerated optimization. Additionally, it applies instance selection in the source domain to identify qualified samples, further enhancing classification and adaptation. Three different settings of the proposed method are configured, and five state-of-the-art methods are selected as competitors. Covering different types of concept drift, experiments on four benchmark datasets demonstrate the superiority of the proposed method in terms of accuracy and computational time. The Wilcoxon statistical test confirms a significant difference between the evaluation results of the proposed method and those of the competing methods.

Notes

  1. http://qwone.com/~jason/20Newsgroups/.

  2. https://jmcauley.ucsd.edu/data/amazon/.

  3. https://people.eecs.berkeley.edu/jhoffman/domainadapt/#datasets_code.

  4. http://hemanthdv.org/OfficeHome-Dataset/.

References

  1. Yang T, Yu X, Ma N, Zhao Y, Li H (2021) A novel domain adaptive deep recurrent network for multivariate time series prediction. Eng Appl Artif Intell 106:104498. https://doi.org/10.1016/j.engappai.2021.104498

  2. Ge P, Ren C-X, Xu X-L, Yan H (2023) Unsupervised domain adaptation via deep conditional adaptation network. Pattern Recogn 134:109088. https://doi.org/10.1016/j.patcog.2022.109088

  3. Qu S, Zou T, Rohrbein F, Lu C, Chen G, Tao D, Jiang C (2023) Upcycling models under domain and category shift. In: Paper presented at the 2023 IEEE/CVF conference on computer vision and pattern recognition (CVPR)

  4. Khan S, Asim M, Khan S, Musyafa A, Wu Q (2023) Unsupervised domain adaptation using fuzzy rules and stochastic hierarchical convolutional neural networks. Comput Electr Eng 105:108547. https://doi.org/10.1016/j.compeleceng.2022.108547

  5. Sun J, Dai Y, Zhao K, Jia Z (2021) Second order Takagi–Sugeno fuzzy model with domain adaptation for nonlinear regression. Inf Sci 570:34–51. https://doi.org/10.1016/j.ins.2021.04.024

  6. Zuo H, Lu J, Zhang G, Pedrycz W (2019) Fuzzy rule-based domain adaptation in homogeneous and heterogeneous spaces. IEEE Trans Fuzzy Syst 27(2):348–361. https://doi.org/10.1109/TFUZZ.2018.2853720

  7. Liu F, Zhang G, Lu J (2021) Multisource heterogeneous unsupervised domain adaptation via fuzzy relation neural networks. IEEE Trans Fuzzy Syst 29(11):3308–3322. https://doi.org/10.1109/TFUZZ.2020.3018191

  8. Li Y, Sun H, Yan W (2022) Domain adaptive twin support vector machine learning using privileged information. Neurocomputing 469:13–27

  9. Krawczyk B, Minku LL, Gama J, Stefanowski J, Woźniak M (2017) Ensemble learning for data stream analysis: a survey. Inform Fus 37:132–156

  10. Suárez-Cetrulo AL, Quintana D, Cervantes A (2023) A survey on machine learning for recurring concept drifting data streams. Expert Syst Appl 213:118934. https://doi.org/10.1016/j.eswa.2022.118934

  11. Karimian M, Beigy H (2023) Concept drift handling: a domain adaptation perspective. Expert Syst Appl 224:119946. https://doi.org/10.1016/j.eswa.2023.119946

  12. Zhang Y, Davison BD (2021) Domain adaptation for object recognition using subspace sampling demons. Multimed Tools Appl 80(15):23255–23274

  13. Li B, Wang Y, Zhang S, Li D, Keutzer K, Darrell T, Zhao H (2021) Learning invariant representations and risks for semi-supervised domain adaptation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1104–1113

  14. Huang H, Liu Q (2021) Domain structure-based transfer learning for cross-domain word representation. Inform Fus 76:145–156. https://doi.org/10.1016/j.inffus.2021.05.013

  15. Dhaini M, Berar M, Honeine P, Van Exem A (2023) Unsupervised domain adaptation for regression using dictionary learning. Knowl-Based Syst 267:110439. https://doi.org/10.1016/j.knosys.2023.110439

  16. Zhang Z, Chen H, Li S, An Z, Wang J (2020) A novel geodesic flow kernel based domain adaptation approach for intelligent fault diagnosis under varying working condition. Neurocomputing 376:54–64. https://doi.org/10.1016/j.neucom.2019.09.081

  17. Sanodiya RK, Mathew J, Aditya R, Jacob A, Nayanar B (2021) Kernelized unified domain adaptation on geometrical manifolds. Expert Syst Appl 167:114078

  18. Brahma D, Rai P (2023) A probabilistic framework for lifelong test-time adaptation. In: Paper presented at the 2023 IEEE/CVF conference on computer vision and pattern recognition (CVPR)

  19. Liu H, Shao M, Ding Z, Fu Y (2018) Structure-preserved unsupervised domain adaptation. IEEE Trans Knowl Data Eng 31(4):799–812

  20. Yan H, Ding Y, Li P, Wang Q, Xu Y, Zuo W (2017) Mind the class weight bias: weighted maximum mean discrepancy for unsupervised domain adaptation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2272–2281

  21. Li S, Ma W, Zhang J, Liu CH, Liang J, Wang G (2021) Meta-reweighted regularization for unsupervised domain adaptation. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2021.3114536

  22. Xie H, Liu B, Xiao Y (2021) Transfer learning-based one-class dictionary learning for recommendation data stream. Inf Sci 547:526–538. https://doi.org/10.1016/j.ins.2020.08.091

  23. Zhang S-s, Liu J-w, Zuo X (2021) Adaptive online incremental learning for evolving data streams. Appl Soft Comput 105:107255. https://doi.org/10.1016/j.asoc.2021.107255

  24. Xu Q, Wei X, Bai R, Li S, Meng Z (2023) Integration of deep adaptation transfer learning and online sequential extreme learning machine for cross-person and cross-position activity recognition. Expert Syst Appl 212:118807. https://doi.org/10.1016/j.eswa.2022.118807

  25. Li J, Chen E, Ding Z, Zhu L, Lu K, Shen HT (2020) Maximum density divergence for domain adaptation. IEEE Trans Pattern Anal Mach Intell 43:3918–3930

  26. Wang W, Wang H, Zhang Z, Zhang C, Gao Y (2019) Semi-supervised domain adaptation via Fredholm integral based kernel methods. Pattern Recogn 85:185–197

  27. Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359. https://doi.org/10.1109/TKDE.2009.191

  28. Chandra S, Haque A, Khan L, Aggarwal C (2016) An adaptive framework for multistream classification. In: Proceedings of the 25th ACM international on conference on information and knowledge management, pp 1181–1190

  29. Huang J, Gretton A, Borgwardt K, Schölkopf B, Smola A (2006) Correcting sample selection bias by unlabeled data. Adv Neural Inf Process Syst 19:601–608

  30. Haque A, Wang Z, Chandra S, Dong B, Khan L, Hamlen KW (2017) Fusion: an online method for multistream classification. In: Proceedings of the 2017 ACM on conference on information and knowledge management, pp 919–928

  31. Gama J, Medas P, Castillo G, Rodrigues P (2004) Learning with drift detection. In: Brazilian symposium on artificial intelligence, Springer, pp 286–295

  32. Du H, Minku LL, Zhou H (2019) Multi-source transfer learning for non-stationary environments. 2019 International Joint Conference on Neural Networks (IJCNN), IEEE, pp 1–8

  33. Pratama M, de Carvalho M, Xie R, Lughofer E, Lu J (2019) ATL: Autonomous knowledge transfer from many streaming processes. In: Proceedings of the 28th ACM international conference on information and knowledge management, pp 269–278

  34. Zhao P, Hoi SC, Wang J, Li B (2014) Online transfer learning. Artif Intell 216:76–102

  35. Hong C, Zeng Z, Xie R, Zhuang W, Wang X (2018) Domain adaptation with low-rank alignment for weakly supervised hand pose recovery. Signal Process 142:223–230. https://doi.org/10.1016/j.sigpro.2017.07.032

  36. Chen S, Hong Z, Harandi M, Yang X (2022) Domain neural adaptation. IEEE Trans Neural Netw Learn Syst 34:8630–8641

  37. Meng M, Chen Q, Wu J (2021) Structure preservation adversarial network for visual domain adaptation. Inf Sci 579:266–280. https://doi.org/10.1016/j.ins.2021.07.085

  38. Zhang C, Zhao Q, Wang Y (2020) Transferable attention networks for adversarial domain adaptation. Inf Sci 539:422–433. https://doi.org/10.1016/j.ins.2020.06.016

  39. Yao Q, Qian Q, Qin Y, Guo L, Wu F (2022) Adversarial domain adaptation network with pseudo-siamese feature extractors for cross-bearing fault transfer diagnosis. Eng Appl Artif Intell 113:104932. https://doi.org/10.1016/j.engappai.2022.104932

  40. Wang C, Chen D, Chen J, Lai X, He T (2021) Deep regression adaptation networks with model-based transfer learning for dynamic load identification in the frequency domain. Eng Appl Artif Intell 102:104244. https://doi.org/10.1016/j.engappai.2021.104244

  41. Idrees MM, Minku LL, Stahl F, Badii A (2020) A heterogeneous online learning ensemble for non-stationary environments. Knowl-Based Syst 188:104983

  42. Li Q, Xiong Q, Ji S, Yu Y, Wu C, Gao M (2021) Incremental semi-supervised extreme learning machine for mixed data stream classification. Expert Syst Appl 185:115591

  43. Lu J, Liu A, Dong F, Gu F, Gama J, Zhang G (2019) Learning under concept drift: a review. IEEE Trans Knowl Data Eng 31(12):2346–2363. https://doi.org/10.1109/TKDE.2018.2876857

  44. Guo H, Li H, Sun N, Ren Q, Zhang A, Wang W (2023) Concept drift detection and accelerated convergence of online learning. Knowl Inf Syst 65(3):1005–1043. https://doi.org/10.1007/s10115-022-01790-6

  45. Goldenberg I, Webb GI (2019) Survey of distance measures for quantifying concept drift and shift in numeric data. Knowl Inf Syst 60(2):591–615. https://doi.org/10.1007/s10115-018-1257-z

  46. Han M, Zhang X, Chen Z, Wu H, Li M (2023) Dynamic ensemble selection classification algorithm based on window over imbalanced drift data stream. Knowl Inf Syst 65(3):1105–1128. https://doi.org/10.1007/s10115-022-01791-5

  47. Gu Q, Dai Q, Yu H, Ye R (2021) Integrating multi-source transfer learning, active learning and metric learning paradigms for time series prediction. Appl Soft Comput 109:107583. https://doi.org/10.1016/j.asoc.2021.107583

  48. Halstead B, Koh YS, Riddle P, Pechenizkiy M, Bifet A (2023) Combining diverse meta-features to accurately identify recurring concept drift in data streams. ACM Trans Knowl Discov Data 17(8):107. https://doi.org/10.1145/3587098

  49. Jafseer KT, Shailesh S, Sreekumar A (2023) Modeling concept drift detection as machine learning model using overlapping window and Kolmogorov–Smirnov test. In: Doriya R, Soni B, Shukla A, Gao X-Z (eds) Machine Learning, Image Processing, Network Security and Data Sciences: Select Proceedings of 3rd International Conference on MIND 2021. Springer Nature Singapore, Singapore, pp 113–129. https://doi.org/10.1007/978-981-19-5868-7_10

  50. Liu W, Zhu C, Ding Z, Zhang H, Liu Q (2023) Multiclass imbalanced and concept drift network traffic classification framework based on online active learning. Eng Appl Artif Intell 117:105607. https://doi.org/10.1016/j.engappai.2022.105607

  51. Talapula DK, Kumar A, Ravulakollu KK, Kumar M (2023) A hybrid deep learning classifier and optimized key windowing approach for drift detection and adaption. Decision Anal J 6:100178. https://doi.org/10.1016/j.dajour.2023.100178

  52. Keogh E, Chu S, Hart D, Pazzani M (2001) An online algorithm for segmenting time series. In: Proceedings 2001 IEEE international conference on data mining, IEEE, pp 289–296

  53. Chowdhury MFR, Selouani SA, O’Shaughnessy D (2012) Bayesian on-line spectral change point detection: a soft computing approach for on-line ASR. Int J Speech Technol 15(1):5–23

  54. Alippi C, Boracchi G, Carrera D, Roveri M (2015) Change detection in multivariate datastreams: Likelihood and detectability loss. arXiv preprint arXiv:1510.04850

  55. Mallikarjunaswamy S, Sharmila N, Siddesh GK, Nataraj KR, Komala M (2022) A novel architecture for cluster based false data injection attack detection and location identification in smart grid. Advances in Thermofluids and Renewable Energy. Springer, pp 599–611

  56. Nordli Ø, Przybylak R, Ogilvie AEJ, Isaksen K (2014) Long-term temperature trends and variability on Spitsbergen: the extended Svalbard Airport temperature series, 1898–2012. Polar Res 33(1):21349

  57. Borchani H, Martínez AM, Masegosa AR, Langseth H, Nielsen TD, Salmerón A, Fernández A, Madsen AL, Sáez R (2015) Modeling concept drift: A probabilistic graphical model based approach. International symposium on intelligent data analysis, Springer, pp 72–83

  58. Yi M, Zhao D, Liao C, Yin H (2022) MK-SCE: A novel multi-kernel based self-adapt concept drift ensemble learning. In: Proceedings of 2021 Chinese intelligent systems conference, Springer, pp 492–497

  59. Shrivastava N, Bhagat A, Nair R (2022) Graph powered machine learning in smart sensor networks. Smart Sensor Networks. Springer, pp 209–226

  60. Aslani M, Seipel S (2020) A fast instance selection method for support vector machines in building extraction. Appl Soft Comput 97:106716. https://doi.org/10.1016/j.asoc.2020.106716

  61. Liu F, Lu J, Zhang G (2018) Unsupervised heterogeneous domain adaptation via shared fuzzy equivalence relations. IEEE Trans Fuzzy Syst 26(6):3555–3568. https://doi.org/10.1109/TFUZZ.2018.2836364

  62. Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imag Sci 2(1):183–202

  63. Pan SJ, Tsang IW, Kwok JT, Yang Q (2011) Domain adaptation via transfer component analysis. IEEE Trans Neural Netw 22(2):199–210. https://doi.org/10.1109/TNN.2010.2091281

  64. Saenko K, Kulis B, Fritz M, Darrell T (2010) Adapting visual category models to new domains. Springer, Berlin, pp 213–226

  65. Venkateswara H, Eusebio J, Chakraborty S, Panchanathan S (2017) Deep hashing network for unsupervised domain adaptation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5018–5027

  66. Read J (2018) Concept-drifting data streams are time series; the case for continuous adaptation. arXiv preprint arXiv:1810.02266

  67. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781

  68. Arora S, Liang Y, Ma T (2017) A simple but tough-to-beat baseline for sentence embeddings. In: 5th international conference on learning representations, ICLR

  69. Gong B, Shi Y, Sha F, Grauman K (2012) Geodesic flow kernel for unsupervised domain adaptation. In: IEEE conference on computer vision and pattern recognition, IEEE, pp 2066–2073

  70. Sheskin DJ (2007) Handbook of parametric and nonparametric statistical procedures. Chapman & Hall/CRC, Boca Raton

Author information

Authors and Affiliations

Authors

Contributions

MM wrote the main manuscript text, prepared all tables and figures, and collaborated on the methodology and investigation. MR collaborated on the methodology and investigation. AS collaborated on the methodology and investigation. All authors reviewed the manuscript.

Corresponding author

Correspondence to Mohammad Rahmanimanesh.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix I

This section briefly introduces some basic definitions from fuzzy set theory.

Definition 1

A fuzzy set \(\overline{A }\) in a universe of discourse \(X=\left\{x\right\}\) is defined as \(\overline{A }=\left\{x\text{,}\mu \left(x|\overline{A }\right)\right\}\), where \(\mu \left(x|\overline{A }\right)\in [0\text{,}1]\) is the membership function of an element \(x\text{,} \;x\in {\mathbb{R}}^{n}\) in the fuzzy set \(\overline{A }\).

Definition 2

A fuzzy set \(\overline{A }\left({a}_{1}\text{,}{a}_{2}\text{,}\dots \text{,}{a}_{n}\right)\) of \({\mathbb{R}}^{n}\) is called a fuzzy vector at the non-fuzzy point \(A\left({a}_{1}\text{,}{a}_{2}\text{,}\dots \text{,}{a}_{n}\right){\in {\mathbb{R}}}^{n}\) if its membership function \(\mu \) fulfills the following properties.

  1. 1.

    \(\mu \left({x}_{1}\text{,}{x}_{2}\text{,}\dots \text{,}{x}_{n}|\overline{A }({a}_{1}\text{,}{a}_{2}\text{,}\dots \text{,}{a}_{n})\right)\) is upper semi-continuous in \({\varvec{x}}=({x}_{1}\text{,}{x}_{2}\text{,}\dots \text{,}{x}_{n}){\in {\mathbb{R}}}^{n}\).

  2. 2.

    \(\mu \left(({x}_{1}\text{,}{x}_{2}\text{,}\dots \text{,}{x}_{n})|\overline{A }({a}_{1}\text{,}{a}_{2}\text{,}\dots \text{,}{a}_{n})\right)=1\) if and only if \(\left({x}_{1}\text{,}{x}_{2}\text{,}\dots \text{,}{x}_{n}\right)=({a}_{1}\text{,}{a}_{2}\text{,}\dots \text{,}{a}_{n})\).

  3. 3.

    \({\overline{A} }_{\alpha }=\left\{{\varvec{x}}|\mu \left({\varvec{x}}|\overline{A }\left({a}_{1}\text{,}{a}_{2}\text{,}\dots \text{,}{a}_{n}\right)\right)\ge \alpha \text{,} {\varvec{x}}{\in {\mathbb{R}}}^{n}\right\}\) is a compact convex subset of \({\mathbb{R}}^{n}\) for all \(\alpha \) in \((0\text{,}1]\).

Definition 3

The \(\alpha \)-cut of a fuzzy set \(\overline{A }\) in \(X\) is defined as \({\overline{A} }_{\alpha }=\left\{x\in X \,|\, \mu \left(x|\overline{A }\right)\ge \alpha \right\}\) for each \(\alpha \in (0\text{,}1]\).

Definition 4

Given \(N\) fuzzy sets, \({\overline{A} }_{1}\text{,}{\overline{A} }_{2}\text{,}\dots \text{,}{\overline{A} }_{N}\), an operator \(R:\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)\to [0\text{,}1]\) indicates a fuzzy relation if the following properties are fulfilled.

  1. 1.

    Reflexivity property: \(R\left({\overline{A} }_{i}\text{,}{\overline{A} }_{i}\right)=1\text{,} \forall {\overline{A} }_{i}\).

  2. 2.

    Symmetry property: \(R\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)=R\left({\overline{A} }_{j}\text{,}{\overline{A} }_{i}\right)\text{,} \forall {\overline{A} }_{i}\text{,}{\overline{A} }_{j}\)

Clearly, the fuzzy relation \(R\) can be represented by an \(N\times N\) matrix \({R}^{M}=\left({r}_{ij}\right)\) with \({r}_{ij}=R\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)\).

Definition 5

The max–min composition of two fuzzy relations \({R}_{a}^{M}\) and \({R}_{b}^{M}\) is defined as:

$${\left({R}_{a}^{M}\circ {R}_{b}^{M}\right)}_{ij}=\underset{k=1}{\overset{N}{\bigvee }}\left({r}_{ik}^{(a)}\bigwedge {r}_{kj}^{(b)}\right)$$
(I-1)

where \({r}_{ik}^{(a)}\) is the element of \({R}_{a}^{M}\), \({r}_{kj}^{(b)}\) is the element of \({R}_{b}^{M}\), \(\bigvee \) represents the maximum operation, and \(\wedge \) denotes the minimum operation.
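As a minimal numpy sketch of Eq. (I-1) (the function name and toy matrices are illustrative, not from the paper), the composition can be computed by broadcasting element-wise minima over the shared index \(k\) and taking the maximum:

```python
import numpy as np

def max_min_composition(Ra, Rb):
    """Max-min composition of two fuzzy relation matrices, Eq. (I-1):
    (Ra o Rb)_ij = max_k min(Ra_ik, Rb_kj)."""
    # Ra[:, :, None] has axes (i, k, 1); Rb[None, :, :] has axes (1, k, j).
    # The element-wise minimum gives min(Ra_ik, Rb_kj); the max runs over k.
    return np.max(np.minimum(Ra[:, :, None], Rb[None, :, :]), axis=1)

Ra = np.array([[1.0, 0.7],
               [0.7, 1.0]])
Rb = np.array([[1.0, 0.4],
               [0.4, 1.0]])
C = max_min_composition(Ra, Rb)
# For instance, C[0, 1] = max(min(1.0, 0.4), min(0.7, 1.0)) = 0.7
```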

Definition 6

Given \(N\) fuzzy sets, \({\overline{A} }_{1}\text{,}{\overline{A} }_{2}\text{,}\dots \text{,}{\overline{A} }_{N}\), an operator \(R:\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)\to [0\text{,}1]\) indicates a fuzzy equivalence relation if the following properties are fulfilled.

  1. 1.

    Reflexivity property: \(R\left({\overline{A} }_{i}\text{,}{\overline{A} }_{i}\right)=1\text{,} \forall {\overline{A} }_{i}\).

  2. 2.

    Symmetry property: \(R\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)=R\left({\overline{A} }_{j}\text{,}{\overline{A} }_{i}\right)\text{,} \forall {\overline{A} }_{i}\text{,}{\overline{A} }_{j}\)

  3. 3.

    Transitivity property: \({R}^{M}\circ {R}^{M}\le {R}^{M}\) (element-wise).

where \({R}^{M}\) is the matrix of the fuzzy relation \(R\) and \(\circ \) is the max–min operator. A fuzzy equivalence relation can be constructed from any fuzzy relation.

Definition 7

Given \(N\) fuzzy sets, \({\overline{A} }_{1}\text{,}{\overline{A} }_{2}\text{,}\dots \text{,}{\overline{A} }_{N}\), there exists a finite positive integer \(m\) and an operator \({R}_{T}\) satisfying the following conditions:

$$ R_{T}^{M} = R_{\left( m \right)}^{M} = \underbrace {{R^{M} \circ R^{M} \circ \ldots \circ R^{M} }}_{m} $$
$${R}_{T}^{M}={R}_{T}^{M}\circ {R}_{T}^{M}$$

where \({R}^{M}\) is the fuzzy relation matrix \(R\), \(\circ \) is the max–min operator, and \({R}_{T}^{M}\) is the fuzzy relation matrix of \({R}_{T}\). \({R}_{T}\) is the max–min transitive closure of \(R\).
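The transitive closure above can be sketched in numpy by composing \({R}^{M}\) with itself until a fixed point is reached; this terminates in finitely many steps for a finite matrix. Function names and the example relation are ours, not from the paper:

```python
import numpy as np

def max_min_composition(Ra, Rb):
    # (Ra o Rb)_ij = max_k min(Ra_ik, Rb_kj), as in Eq. (I-1)
    return np.max(np.minimum(Ra[:, :, None], Rb[None, :, :]), axis=1)

def transitive_closure(R):
    """Max-min transitive closure R_T: iterate R o R until R_T o R_T = R_T."""
    RT = R
    while True:
        RT_next = max_min_composition(RT, RT)
        if np.allclose(RT_next, RT):   # fixed point reached
            return RT
        RT = RT_next

# A reflexive, symmetric relation that is not yet transitive:
R = np.array([[1.0, 0.8, 0.0],
              [0.8, 1.0, 0.5],
              [0.5, 0.5, 1.0][:1] + [0.5, 1.0][:0] or [0.0, 0.5, 1.0]])
R = np.array([[1.0, 0.8, 0.0],
              [0.8, 1.0, 0.5],
              [0.0, 0.5, 1.0]])
RT = transitive_closure(R)
# The closure links A_1 and A_3 through A_2: RT[0, 2] = min(0.8, 0.5) = 0.5
```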

Appendix II

This section details the computation of the metric \(\mathcal{D}\) introduced in [61].

Consider a fuzzy feature vector \({\overline{A} }_{i}\left({a}_{i1}\text{,}{a}_{i2}\text{,}\dots \text{,}{a}_{in}\right)\in F({\mathbb{R}}^{n})\). For each element \({\overline{a} }_{ij}\in F({\mathbb{R}})\), the membership function is computed by Eq. (II-1).

$${\mu }_{ij}(x|{\overline{a} }_{ij})=\left\{\begin{array}{ll}1-\frac{{\Vert x-{a}_{ij}\Vert }_{1}}{{\rho }_{i}}& \text{if}\; {\Vert x-{a}_{ij}\Vert }_{1}\le {\rho }_{i}\text{,}\, x\in {\mathbb{R}}\\ 0& \text{otherwise}\end{array}\right.$$
(II-1)

where \({a}_{ij}\) indicates the \(j\)th feature value of the \(i\)th sample and \({\rho }_{i}\) denotes the hesitation degree of the \(i\)th sample, assuming a triangular membership function. The membership \({\mu }_{ij}({\varvec{x}}|{\overline{A} }_{i})\) of a vector \({\varvec{x}}=({x}_{1}\text{,}{ x}_{2}\text{,}\dots \text{, }{x}_{n})\) in the fuzzy vector is then obtained by Eq. (II-2).

$${\mu }_{ij}({\varvec{x}}|{\overline{A} }_{i})=\left\{\begin{array}{ll}1-\frac{{\Vert {\varvec{x}}-{A}_{i}\Vert }_{1}}{{n\rho }_{i}}& \text{if}\; \left|{x}_{j}-{a}_{ij}\right|\le {\rho }_{i}\;\forall j\text{,}\, {\varvec{x}}\in {\mathbb{R}}^{n}\\ 0& \text{otherwise}\end{array}\right.$$
(II-2)
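The two membership functions admit a direct implementation. The following sketch assumes a common hesitation degree \({\rho }_{i}\) across the features of sample \(i\), as in the equations above; function names are illustrative:

```python
import numpy as np

def mu_scalar(x, a_ij, rho_i):
    """Triangular membership of a scalar x in the fuzzy number a_ij (Eq. II-1)."""
    dist = abs(x - a_ij)
    return 1.0 - dist / rho_i if dist <= rho_i else 0.0

def mu_vector(x, A_i, rho_i):
    """Membership of a vector x in the fuzzy vector A_i (Eq. II-2):
    zero as soon as any coordinate leaves its triangular support."""
    x, A_i = np.asarray(x, float), np.asarray(A_i, float)
    if np.any(np.abs(x - A_i) > rho_i):
        return 0.0
    n = len(A_i)
    return 1.0 - np.sum(np.abs(x - A_i)) / (n * rho_i)

# Membership peaks at the centre and decays linearly with the l1 distance:
m1 = mu_scalar(0.5, 1.0, 1.0)        # 1 - 0.5/1 = 0.5
m2 = mu_vector([1.5, 1.0], [1.0, 1.0], 1.0)  # 1 - 0.5/(2*1) = 0.75
```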

To define the fuzzy relation between two heterogeneous domains (source and target), the following metric is defined to measure the distance between the fuzzy vectors:

$$\mathcal{D}\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)=\frac{1}{n}\underset{0}{\overset{1}{\int }}sup\left\{{\mathcal{D}}_{\lambda }\left(u\text{,}v\right): {\mathcal{D}}_{\lambda }\left(u\text{,}v\right)\in\Omega \left(\lambda \right)\right\}d\lambda ; u\in {\overline{A} }_{i}\left(\lambda \right)\text{,} v\in {\overline{A} }_{j}(\lambda )$$
(II-3)

where \(\lambda \) is the membership value, \({\mathcal{D}}_{\lambda }\left(u\text{,}v\right)\) indicates the distance between points \(u\) and \(v\) in \({\mathbb{R}}^{n}\) with the given \(\lambda \). \(\Omega \left(\lambda \right)\) is computed by Eq. (II-4).

$$\Omega \left(\lambda \right)=\left\{d(u\text{,}{\overline{A} }_{j}(\lambda ))\right\}\cup \left\{d(v\text{,}{\overline{A} }_{i}(\lambda ))\right\}$$
$$d(u\text{,}{\overline{A} }_{j}(\lambda ))=min\left\{d\left(u\text{,}v\right)\text{,} v\in {\overline{A} }_{j}(\lambda )\right\}$$
$$d(v\text{,}{\overline{A} }_{i}(\lambda ))=min\left\{d\left(v\text{,}u\right)\text{,} u\in {\overline{A} }_{i}(\lambda )\right\}$$
(II-4)

where \(d\left(v\text{,}u\right)\) is the \({l}_{1}\)-norm distance between two \(n\)-dimensional vectors \(u\) and \(v\). Note that the supremum operator (sup) in Eq. (II-3) captures the longest distance from the fuzzy vector of one domain to the fuzzy set of the other domain. Eq. (II-3) can be rewritten as Eq. (II-5).

$$\mathcal{D}\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)=\frac{1}{n}\underset{0}{\overset{1}{\int }}d\left({A}_{i}\text{,}{A}_{j}\right)+\frac{1}{2}\left|d\left(u\text{,}{\overline{A} }_{j}\left(\lambda \right)\right)+d(v\text{,}{\overline{A} }_{i}(\lambda ))\right|d\lambda $$
(II-5)

Using Eqs. (II-1) and (II-2), the above equation is de-fuzzified as follows:

$$\mathcal{D}\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)=\frac{1}{n}d\left({A}_{i}\text{,}{A}_{j}\right)+\frac{1}{2}\underset{0}{\overset{1}{\int }}\left|\left(1-\lambda \right){\rho }_{i}-\left(1-\lambda \right){\rho }_{j}\right|d\lambda $$
$$=\frac{1}{n}d\left({A}_{i}\text{,}{A}_{j}\right)+\frac{1}{2}{\Vert {\rho }_{i}-{\rho }_{j}\Vert }_{1}\underset{0}{\overset{1}{\int }}\left(1-\lambda \right)d\lambda $$
$$=\frac{1}{n}d\left({A}_{i}\text{,}{A}_{j}\right)+\frac{1}{4}{\Vert {\rho }_{i}-{\rho }_{j}\Vert }_{1}$$
(II-6)

Eq. (II-6) cannot be used directly for computing the fuzzy relation because it does not satisfy two properties of a fuzzy relation: (1) symmetry, which requires \(\mathcal{D}\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)=\mathcal{D}\left({\overline{A} }_{j}\text{,}{\overline{A} }_{i}\right)\text{, }\forall {\overline{A} }_{i}\text{, }{\overline{A} }_{j}\), and (2) reflexivity, which requires \(\mathcal{D}\left({\overline{A} }_{i}\text{,}{\overline{A} }_{i}\right)=1\text{,} \forall {\overline{A} }_{i}\). Thus, the following mapping is employed.

$${R}_{\mathcal{D}}\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)={e}^{-\frac{\mathcal{D}\left({\overline{A} }_{i}\text{,}{\overline{A} }_{j}\right)}{2{\sigma }^{2}}}$$
(II-7)
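Eqs. (II-6) and (II-7) can be sketched directly in numpy. The sketch assumes \(n\)-dimensional centre vectors, scalar hesitation degrees, and a bandwidth \(\sigma \); names are ours, not from the paper:

```python
import numpy as np

def fuzzy_distance(A_i, A_j, rho_i, rho_j):
    """De-fuzzified distance between two fuzzy vectors (Eq. II-6):
    l1 distance between the centres, scaled by 1/n, plus a hesitation term."""
    A_i, A_j = np.asarray(A_i, float), np.asarray(A_j, float)
    n = len(A_i)
    return np.sum(np.abs(A_i - A_j)) / n + abs(rho_i - rho_j) / 4.0

def fuzzy_relation(A_i, A_j, rho_i, rho_j, sigma=1.0):
    """Map the distance to a relation value in (0, 1] (Eq. II-7), so that
    identical fuzzy vectors obtain relation 1 (restoring reflexivity)."""
    D = fuzzy_distance(A_i, A_j, rho_i, rho_j)
    return np.exp(-D / (2.0 * sigma ** 2))

d_same = fuzzy_distance([1.0, 2.0], [1.0, 2.0], 0.5, 0.5)   # 0.0
r_same = fuzzy_relation([1.0, 2.0], [1.0, 2.0], 0.5, 0.5)   # exp(0) = 1.0
d_diff = fuzzy_distance([0.0, 0.0], [2.0, 2.0], 1.0, 0.5)   # 4/2 + 0.5/4 = 2.125
```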

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Moradi, M., Rahmanimanesh, M. & Shahzadi, A. Transfer learning for concept drifting data streams in heterogeneous environments. Knowl Inf Syst 66, 2799–2857 (2024). https://doi.org/10.1007/s10115-023-02043-w

