Abstract
This chapter provides a brief introduction to transfer learning, its history, and its importance. Because collecting and labeling data in a new domain is challenging, transfer learning plays a vital role in building reusable models. After explaining the fundamentals of transfer learning, we present common strategies, followed by prominent pre-trained models in computer vision and natural language processing, including VGG-16, Inception, ULMFiT, and BERT. Finally, applications and limitations of transfer learning are discussed.
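To make the idea concrete, the following is a minimal sketch (not taken from the chapter) of reusing a pre-trained VGG-16 in PyTorch for a new classification task: the convolutional layers are frozen as a fixed feature extractor and only a replaced classification head is fine-tuned. The number of target classes and the optimizer settings are placeholder assumptions.

# Minimal transfer-learning sketch with a pre-trained VGG-16 (PyTorch/torchvision).
# Assumes torchvision is installed; num_classes is a hypothetical target-task size.
import torch
import torchvision

num_classes = 5  # placeholder: number of classes in the new target task

# Load VGG-16 with ImageNet weights and freeze the convolutional feature extractor.
model = torchvision.models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final fully connected layer so the output matches the new task.
model.classifier[6] = torch.nn.Linear(4096, num_classes)

# Only the unfrozen (new head) parameters are fine-tuned on the target-domain data.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

Whether to freeze all pre-trained layers or fine-tune some of them depends on how similar the target domain is to the source domain and how much labeled target data is available, which is one of the strategy choices the chapter discusses.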
References
Blitzer, J., Dredze, M., & Pereira, F. (2007). Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th annual meeting of the association of computational linguistics (pp. 440–447). United States Association for Computational Linguistics.
Bozinovski, S. (1981). Teaching space: A representation concept for adaptive pattern classification. COINS Technical Report, University of Massachusetts at Amherst.
Bozinovski, S. (2020). Reminder of the first paper on transfer learning in neural networks, 1976. Informatica, 44(3). Retrieved 28 April 2021 from https://doi.org/10.31449/inf.v44i3.2828
Caruana, R. (1997). Multitask learning. Machine Learning, 28(1), 41–75.
Dai, W., Xue, G.-R., Yang, Q., & Yu, Y. (2007a). Co-clustering based classification for out-of-domain documents. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 210–219). Association for Computing Machinery. Retrieved 29 April 2021 from https://doi.org/10.1145/1281192.1281218
Dai, W., Xue, G.-R., Yang, Q., & Yu, Y. (2007b). Transferring naive Bayes classifiers for text classification. In AAAI (Vol. 7, pp. 540–545).
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248–255). IEEE.
Deng, J., Zhang, Z., Marchi, E., & Schuller, B. (2013). Sparse autoencoder-based feature transfer learning for speech emotion recognition. In 2013 Humaine association conference on affective computing and intelligent interaction (pp. 511–516). IEEE.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018, October 11). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv [cs.CL]. Retrieved from http://arxiv.org/abs/1810.04805
Eaton, E., desJardins, M., & Lane, T. (n.d.). Modeling transfer relationships between learning tasks for improved inductive transfer. In Machine learning and knowledge discovery in databases. Springer. Retrieved from https://doi.org/10.1007/978-3-540-87479-9_39
Farhadi, A., Forsyth, D., & White, R. (2007). Transfer learning in sign language. In 2007 IEEE conference on computer vision and pattern recognition (pp. 1–8). IEEE.
Ge, L., Gao, J., Ngo, H., Li, K., & Zhang, A. (2014). On handling negative transfer and imbalanced distributions in multiple source transfer learning. Statistical Analysis and Data Mining, 7(4), 254–271.
Howard, J., & Ruder, S. (2018, January 18). Universal language model fine-tuning for text classification. arXiv [cs.CL]. Retrieved from http://arxiv.org/abs/1801.06146
Kan, M., Wu, J., Shan, S., & Chen, X. (2014). Domain adaptation for face recognition: Targetize source domain bridged by common subspace. International Journal of Computer Vision, 109(1-2), 94–109.
Kannan, A., Kurach, K., Ravi, S., Kaufmann, T., Tomkins, A., Miklos, B., et al. (2016). Smart reply: Automated response suggestion for email. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 955–964). Association for Computing Machinery. Retrieved 28 April 2021 from https://doi.org/10.1145/2939672.2939801
Ling, X., Xue, G.-R., Dai, W., Jiang, Y., Yang, Q., & Yu, Y. (2008). Can Chinese web pages be classified with English data source? In Proceedings of the 17th international conference on World Wide Web (pp. 969–978). Association for Computing Machinery. Retrieved 29 April 2021 from https://doi.org/10.1145/1367497.1367628
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013, October 16). Distributed representations of words and phrases and their compositionality. arXiv [cs.CL]. Retrieved from http://arxiv.org/abs/1310.4546
Milton-Barker, A. (2019). Inception V3 deep convolutional architecture for classifying acute myeloid/lymphoblastic leukemia. Accessed on: February 17.
Mou, L., Meng, Z., Yan, R., Li, G., Xu, Y., Zhang, L., & Jin, Z. (2016, March 19). How transferable are neural networks in NLP applications? arXiv [cs.CL]. Retrieved from http://arxiv.org/abs/1603.06111
Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359.
Pratt, L. Y. (1993). Discriminability-based transfer between neural networks. Advances in Neural Information Processing Systems, 204–211.
Raina, R., Ng, A. Y., & Koller, D. (2006). Constructing informative priors using transfer learning. In Proceedings of the 23rd international conference on Machine learning (pp. 713–720). Association for Computing Machinery. Retrieved 29 April 2021 from https://doi.org/10.1145/1143844.1143934
Raina, R., Battle, A., Lee, H., Packer, B., & Ng, A. Y. (2007). Self-taught learning: Transfer learning from unlabeled data. In Proceedings of the 24th international conference on Machine learning (pp. 759–766). Association for Computing Machinery. Retrieved 28 April 2021 from https://doi.org/10.1145/1273496.1273592
Rosenstein, M. T., Marx, Z., Kaelbling, L. P., & Dietterich, T. G. (2005). To transfer or not to transfer. In NIPS 2005 workshop on transfer learning (Vol. 898, pp. 1–4).
Ruan, S., Wobbrock, J. O., Liou, K., Ng, A., & Landay, J. (2016). Speech is 3x faster than typing for English and Mandarin text entry on mobile devices. arXiv preprint arXiv:1608.07323. Retrieved from https://hci.stanford.edu/research/speech/paper/speech_paper.pdf
Simonyan, K., & Zisserman, A. (2014, September 4). Very deep convolutional networks for large-scale image recognition. arXiv [cs.CV]. Retrieved from http://arxiv.org/abs/1409.1556
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., et al. (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1–9).
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2818–2826).
Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. (2017). Inception-v4, Inception-ResNet and the impact of residual connections on learning. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1) Retrieved 28 April 2021 from https://ojs.aaai.org/index.php/AAAI/article/view/11231
Thrun, S., & Pratt, L. (2012). Learning to learn. Springer.
Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., et al. (2016, September 26). Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv [cs.CL]. Retrieved from http://arxiv.org/abs/1609.08144
Xie, M., Jean, N., Burke, M., Lobell, D., & Ermon, S. (2016). Transfer learning from deep features for remote sensing and poverty mapping. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). Retrieved 30 April 2021 from https://ojs.aaai.org/index.php/AAAI/article/view/9906
Yosinski, J., Clune, J., Bengio, Y., & Lipson, H. (2014, November 6). How transferable are features in deep neural networks? arXiv [cs.LG]. Retrieved from http://arxiv.org/abs/1411.1792
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Rafiq, R.B., Albert, M.V. (2022). Transfer Learning: Leveraging Trained Models on Novel Tasks. In: Albert, M.V., Lin, L., Spector, M.J., Dunn, L.S. (eds) Bridging Human Intelligence and Artificial Intelligence. Educational Communications and Technology: Issues and Innovations. Springer, Cham. https://doi.org/10.1007/978-3-030-84729-6_4
DOI: https://doi.org/10.1007/978-3-030-84729-6_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84728-9
Online ISBN: 978-3-030-84729-6
eBook Packages: Education (R0)