Abstract
Large language models (LLMs) increasingly serve as the backbone for classifying text associated with distinct domains and, simultaneously, with several labels (classes). When the domain shifts, e.g., a classifier of movie reviews moves from IMDb to Rotten Tomatoes, adapting such an LLM-based multi-label classifier is challenging due to the incomplete label set at the target domain and the daunting training overhead. Existing domain adaptation methods address either image multi-label classifiers or text binary classifiers. In this paper, we design DALLMi, Domain Adaptation Large Language Model interpolator, a first-of-its-kind semi-supervised domain adaptation method for LLM-based text models, specifically BERT. The core of DALLMi is the novel variational loss and MixUp regularization, which jointly leverage the limited positively labeled texts, the large quantity of unlabeled texts, and, importantly, their interpolations from the BERT word embeddings. DALLMi also introduces a label-balanced sampling strategy to overcome the imbalance between labeled and unlabeled data. We evaluate DALLMi against partially supervised and unsupervised approaches on three datasets under different scenarios of label availability for the target domain. Our results show that DALLMi achieves 19.9% and 52.2% higher mAP than the unsupervised and partially supervised approaches, respectively.
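To make the interpolation idea concrete, below is a minimal PyTorch sketch of MixUp applied at the word-embedding level of a multi-label classifier, together with a stand-in for a label-balanced sampler. The random tensors in place of BERT embeddings, the `mixup_embeddings` helper, the mean-pooling head, and the sampler weighting scheme are all illustrative assumptions, not DALLMi's implementation; the paper's actual variational loss and sampling strategy are defined in the full text.

```python
# Illustrative MixUp at the word-embedding level for a multi-label text
# classifier. Random tensors stand in for BERT word embeddings; the helper
# names and the sampler weighting are hypothetical, not DALLMi's code.
import torch
import torch.nn as nn
from torch.utils.data import WeightedRandomSampler

def mixup_embeddings(emb_a, emb_b, labels_a, labels_b, alpha=0.4):
    """Interpolate two batches of word embeddings and their label vectors."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed_emb = lam * emb_a + (1.0 - lam) * emb_b            # (B, T, H)
    mixed_labels = lam * labels_a + (1.0 - lam) * labels_b   # (B, C) soft targets
    return mixed_emb, mixed_labels

class MultiLabelHead(nn.Module):
    """One logit per label on top of a pooled sequence embedding."""
    def __init__(self, hidden=768, num_labels=5):
        super().__init__()
        self.fc = nn.Linear(hidden, num_labels)

    def forward(self, emb):            # emb: (B, T, H)
        pooled = emb.mean(dim=1)       # mean pooling as a simple stand-in
        return self.fc(pooled)         # raw logits, one per label

# Toy batch: random tensors in place of BERT embeddings of target-domain text.
B, T, H, C = 4, 16, 768, 5
emb_a, emb_b = torch.randn(B, T, H), torch.randn(B, T, H)
y_a = torch.randint(0, 2, (B, C)).float()   # partially observed positive labels
y_b = torch.randint(0, 2, (B, C)).float()

head = MultiLabelHead(H, C)
mixed_emb, mixed_y = mixup_embeddings(emb_a, emb_b, y_a, y_b)
loss = nn.BCEWithLogitsLoss()(head(mixed_emb), mixed_y)  # binary CE per label
loss.backward()

# Label-balanced sampling stand-in: upweight rows carrying rare positive
# labels so every label is represented comparably across batches.
freq = y_a.sum(dim=0).clamp(min=1)          # positives observed per label
weights = (y_a / freq).sum(dim=1) + 1e-3    # rarer labels -> larger row weight
sampler = WeightedRandomSampler(weights, num_samples=B)
```

Interpolating at the embedding level, in the spirit of Manifold Mixup, lets unlabeled target-domain text contribute training signal even when only a fraction of its labels are observed.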
Notes
- 1. Each label represents a possible class.
Acknowledgements
This work has been supported by the Spoke “FutureHPC & BigData” of the ICSC - Centro Nazionale di Ricerca in “High Performance Computing, Big Data and Quantum Computing”, funded by EU - NextGenerationEU and the EuPilot project funded by EuroHPC JU under G.A. 101034126.