Relational retrieval using a combination of path-constrained random walks

Abstract

Scientific literature with rich metadata can be represented as a labeled directed graph. This graph representation enables a number of scientific tasks, such as ad hoc retrieval or named entity recognition (NER), to be formulated as typed proximity queries in the graph. One popular proximity measure is Random Walk with Restart (RWR), and much work has been done on the supervised learning of RWR measures by associating each edge label with a parameter. In this paper, we describe a novel learnable proximity measure that instead uses one weight per edge label sequence: proximity is defined by a weighted combination of simple “path experts”, each corresponding to following a particular sequence of labeled edges. Experiments on eight tasks in two subdomains of biology show that the new learning method significantly outperforms the RWR model (both trained and untrained). We also extend the method to support two additional types of experts that model intrinsic properties of entities: query-independent experts, which generalize the PageRank measure, and popular entity experts, which allow rankings to be adjusted for particular entities that are especially important.
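To make the idea of a weighted combination of path experts concrete, the following is a minimal sketch rather than the method as implemented in the paper. It assumes that each step of a walk moves uniformly at random over the outgoing edges carrying the required label, and that the per-path weights are already given (learning them is not shown). The toy graph, the node names, the edge labels `cites` and `mentions`, and the helper names `path_walk` and `score` are all hypothetical.

```python
from collections import defaultdict


def path_walk(edges, start, labels):
    """Distribution over nodes reached from `start` by following `labels` in order.

    Assumption: at each step the walker picks uniformly among the outgoing
    edges that carry the required label; probability mass at nodes with no
    such edge is dropped.
    """
    dist = {start: 1.0}
    for label in labels:
        nxt = defaultdict(float)
        for node, p in dist.items():
            targets = edges.get((node, label), [])
            for target in targets:
                nxt[target] += p / len(targets)
        dist = dict(nxt)
    return dist


def score(edges, query, weighted_paths):
    """Proximity of every reachable node to `query`: a weighted sum of path experts."""
    total = defaultdict(float)
    for labels, weight in weighted_paths:
        for node, p in path_walk(edges, query, labels).items():
            total[node] += weight * p
    return dict(total)


# Hypothetical toy graph: (source node, edge label) -> list of target nodes.
edges = {
    ("paper1", "cites"): ["paper2", "paper3"],
    ("paper1", "mentions"): ["geneA"],
    ("paper2", "mentions"): ["geneA"],
    ("paper3", "mentions"): ["geneB"],
}

# One weight per edge label sequence (values chosen by hand for illustration).
paths = [
    (("mentions",), 1.0),
    (("cites", "mentions"), 0.5),
]

scores = score(edges, "paper1", paths)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy example the gene mentioned directly by the query paper and by one of its cited papers collects mass from both path experts, so it ranks above the gene reached only through a citation. Under a model with one parameter per edge label, the two label sequences could not be weighted independently of each other.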

Author information

Corresponding author

Correspondence to Ni Lao.

Additional information

Editors: José L Balcázar, Francesco Bonchi, Aristides Gionis, and Michèle Sebag.

About this article

Cite this article

Lao, N., Cohen, W.W. Relational retrieval using a combination of path-constrained random walks. Mach Learn 81, 53–67 (2010). https://doi.org/10.1007/s10994-010-5205-8

Keywords

  • Entity relation graph
  • Random walk
  • Learning to rank
  • Relational model
  • Filtering and recommending