Abstract
Similarity functions are widely used in many machine learning or pattern recognition tasks. We consider here a recent framework for binary classification, proposed by Balcan et al., allowing to learn in a potentially non geometrical space based on good similarity functions. This framework is a generalization of the notion of kernels used in support vector machines in the sense that allows one to use similarity functions that do not need to be positive semi-definite nor symmetric. The similarities are then used to define an explicit projection space where a linear classifier with good generalization properties can be learned. In this paper, we propose to study experimentally the usefulness of similarity based projection spaces for transfer learning issues. More precisely, we consider the problem of domain adaptation where the distributions generating learning data and test data are somewhat different. We stand in the case where no information on the test labels is available. We show that a simple renormalization of a good similarity function taking into account the test data allows us to learn classifiers more performing on the target distribution for difficult adaptation problems. Moreover, this normalization always helps to improve the model when we try to regularize the similarity based projection space in order to move closer the two distributions. We provide experiments on a toy problem and on a real image annotation task.
This work was supported in part by the french project VideoSense ANR-09-CORD-026 of the ANR in part by the IST Programme of the European Community, under the PASCAL2 Network of Excellence, IST-2007-216886. This publication only reflects the authors’ views.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Ayache, S., Quénot, G., Gensel, J.: Image and video indexing using networks of operators. Journal on Image and Video Processing, 1:1–1:13 (2007)
Balcan, M.F., Blum, A., Srebro, N.: Improved guarantees for learning via similarity functions. In: Proceedings of COLT, pp. 287–298 (2008)
Balcan, M.F., Blum, A., Srebro, N.: A theory of learning with similarity functions. Machine Learning Journal 72(1-2), 89–112 (2008)
Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., Vaughan, J.: A theory of learning from different domains. Machine Learning Journal 79(1-2), 151–175 (2010)
Ben-David, S., Blitzer, J., Crammer, K., Pereira, F.: Analysis of representations for domain adaptation. In: Proceedings of NIPS 2006, pp. 137–144 (2006)
Ben-David, S., Lu, T., Luu, T., Pal, D.: Impossibility theorems for domain adaptation. JMLR W&CP 9, 129–136 (2010)
Bernard, M., Boyer, L., Habrard, A., Sebban, M.: Learning probabilistic models of tree edit distance. Pattern Recognition 41(8), 2611–2629 (2008)
Bickel, S., Brückner, M., Scheffer, T.: Discriminative learning for differing training and test distributions. In: Proceeding of ICML, pp. 81–88 (2007)
Blitzer, J., McDonald, R., Pereira, F.: Domain adaptation with structural correspondence learning. In: Proceedings of EMNLP, pp. 120–128 (2006)
Bruzzone, L., Marconcini, M.: Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE Trans. Pattern Anal. Mach. Intell. 32(5), 770–787 (2010)
Daumé III, H.: Frustratingly easy domain adaptation. In: Proceedings of the Association for Computational Linguistics, ACL (2007)
Davis, J., Kulis, B., Jain, P., Sra, S., Dhillon, I.: Information-theoretic metric learning. In: Proceedings of ICML, pp. 209–216 (2007)
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC 2007) Results (2007), http://www.pascal-network.org/challenges/VOC/voc2007/workshop/
Gao, X., Xiao, B., Tao, D., Li, X.: A survey of graph edit distance. Pattern Analysis & Applications 13(1), 113–129 (2010)
Goldberger, J., Roweis, S., Hinton, G., Salakhutdinov, R.: Neighbourhood components analysis. In: Proceedings of NIPS, vol. 17, pp. 513–520 (2004)
Haasdonk, B.: Feature space interpretation of svms with indefinite kernels. IEEE Trans. Pattern Anal. Mach. Intell. 27(4), 482–492 (2005)
Huang, J., Smola, A., Gretton, A., Borgwardt, K., Schölkopf, B.: Correcting sample selection bias by unlabeled data. In: Proceedings of NIPS, pp. 601–608 (2006)
Jiang, J.: A literature survey on domain adaptation of statistical classifiers. Tech. rep., Computer Science Department at University of Illinois at Urbana-Champaign (2008), http://sifaka.cs.uiuc.edu/jiang4/domain_adaptation/da_survey.pdf
Jiang, J., Zhai, C.: Instance weighting for domain adaptation in NLP. In: Proceedings of ACL (2007)
Mansour, Y., Mohri, M., Rostamizadeh, A.: Domain adaptation: Learning bounds and algorithms. In: Proceedings of COLT, pp. 19–30 (2009)
Pan, S., Tsang, I., Kwok, J., Yang, Q.: Domain adaptation via transfer component analysis. In: Proceedings of IJCAI, pp. 1187–1192 (2009)
Pan, S., Yang, Q.: A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 22(10), 1345–1359 (2010)
Quionero-Candela, J., Sugiyama, M., Schwaighofer, A., Lawrence, N.: Dataset Shift in Machine Learning. The MIT Press, Cambridge (2009)
Ristad, E., Yianilos, P.: Learning string-edit distance. IEEE Trans. on Pattern Analysis and Machine Intelligence 20(5), 522–532 (1998)
Smeaton, A., Over, P., Kraaij, W.: High-Level Feature Detection from Video in TRECVid: a 5-Year Retrospective of Achievements. In: Multimedia Content Analysis, Theory and Applications, pp. 151–174. Springer, Berlin (2009)
Sugiyama, M., Nakajima, S., Kashima, H., von Bünau, P., Kawanabe, M.: Direct importance estimation with model selection and its application to covariate shift adaptation. In: Proceedings of NIPS (2007)
Weinberger, K., Saul, L.: Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research (JMLR) 10, 207–244 (2009)
Xu, H., Mannor, S.: Robustness and generalization. In: Proceedings of COLT, pp. 503–515 (2010)
Zhong, E., Fan, W., Yang, Q., Verscheure, O., Ren, J.: Cross validation framework to choose amongst models and datasets for transfer learning. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010, Part III. LNCS, vol. 6323, pp. 547–562. Springer, Heidelberg (2010)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Morvant, E., Habrard, A., Ayache, S. (2011). On the Usefulness of Similarity Based Projection Spaces for Transfer Learning. In: Pelillo, M., Hancock, E.R. (eds) Similarity-Based Pattern Recognition. SIMBAD 2011. Lecture Notes in Computer Science, vol 7005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24471-1_1
Download citation
DOI: https://doi.org/10.1007/978-3-642-24471-1_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24470-4
Online ISBN: 978-3-642-24471-1
eBook Packages: Computer ScienceComputer Science (R0)