Modeling Transfer Relationships Between Learning Tasks for Improved Inductive Transfer

  • Eric Eaton
  • Marie desJardins
  • Terran Lane
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5211)


In this paper, we propose a novel graph-based method for knowledge transfer. We model the transfer relationships between source tasks by embedding the set of learned source models in a graph using transferability as the metric. Transfer to a new problem proceeds by mapping the problem into the graph, then learning a function on this graph that automatically determines the parameters to transfer to the new learning task. This method is analogous to inductive transfer along a manifold that captures the transfer relationships between the tasks. We demonstrate improved transfer performance using this method against existing approaches in several real-world domains.
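The idea of learning a function on a graph of tasks can be illustrated with a small, generic sketch. It is not the authors' algorithm: it assumes a symmetric, hand-made weight matrix `W` standing in for transferability between four hypothetical source tasks, and it fits a smooth scalar function by Laplacian-regularized least squares, a standard technique chosen here only for illustration. The names `graph_laplacian`, `smooth_on_graph`, `observed`, and `lam` are all hypothetical.

```python
import numpy as np


def graph_laplacian(W):
    """Combinatorial Laplacian L = D - W of a symmetric weight matrix."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def smooth_on_graph(W, observed, lam=0.1):
    """Fit f minimizing sum_{i observed} (f_i - y_i)^2 + lam * f^T L f.

    `observed` maps node index -> measured value. The linear system is
    nonsingular when the graph is connected and at least one node is observed.
    """
    n = len(W)
    L = graph_laplacian(W)
    mask = np.zeros(n)
    y = np.zeros(n)
    for node, value in observed.items():
        mask[node] = 1.0
        y[node] = value
    # Setting the gradient to zero gives (M + lam * L) f = M y, with M = diag(mask).
    return np.linalg.solve(np.diag(mask) + lam * L, mask * y)


# Assumed toy graph: tasks 0-1 and 2-3 are strongly related, with weak links between the pairs.
W = np.array([
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.2, 0.0],
    [0.1, 0.2, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
])

# Assumed measurements: transferring from task 0 helps (1.0), from task 3 does not (0.0).
observed = {0: 1.0, 3: 0.0}
f = smooth_on_graph(W, observed)
print(f)
```

In this toy setting, the fitted value at task 1 ends up higher than at task 2 because task 1 is more strongly connected to the task with the high observed value. The paper's method determines the parameters to transfer rather than a single scalar per task, so this sketch only conveys the smoothness assumption over the graph.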


Transfer Function, Logistic Regression, Parameter Vector, Predictive Accuracy, Training Instance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Eric Eaton (1)
  • Marie desJardins (1)
  • Terran Lane (2)
  1. Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County
  2. Department of Computer Science, University of New Mexico
