Machine Learning, Volume 81, Issue 1, pp 53–67

Relational retrieval using a combination of path-constrained random walks

Abstract

Scientific literature with rich metadata can be represented as a labeled directed graph. This graph representation enables a number of scientific tasks, such as ad hoc retrieval or named entity recognition (NER), to be formulated as typed proximity queries in the graph. One popular proximity measure is Random Walk with Restart (RWR), and much work has been done on the supervised learning of RWR measures by associating each edge label with a parameter. In this paper, we describe a novel learnable proximity measure which instead uses one weight per edge label sequence: proximity is defined by a weighted combination of simple “path experts”, each corresponding to following a particular sequence of labeled edges. Experiments on eight tasks in two subdomains of biology show that the new learning method significantly outperforms the RWR model (both trained and untrained). We also extend the method to support two additional types of experts to model intrinsic properties of entities: query-independent experts, which generalize the PageRank measure, and popular entity experts, which allow rankings to be adjusted for particular entities that are especially important.
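
The idea of scoring by a weighted combination of path experts can be illustrated with a minimal sketch. It is not the paper's implementation: the toy graph, the edge labels, the uniform choice among same-label edges, and the hand-set weights are all assumptions made for illustration (in the paper the weights are learned from data).

```python
from collections import defaultdict

# Toy labeled directed graph: GRAPH[node][label] -> list of target nodes.
# Node names and edge labels are hypothetical, chosen only for this example.
GRAPH = {
    "q": {"mentions": ["geneA", "geneB"]},
    "geneA": {"mentioned_in": ["paper1", "paper2"]},
    "geneB": {"mentioned_in": ["paper2"]},
    "paper1": {"cites": ["paper2"]},
    "paper2": {},
}


def path_walk(graph, start, path):
    """Distribution over nodes reached from `start` by following `path`.

    `path` is a sequence of edge labels. At each step the walker moves
    uniformly at random along edges carrying the required label
    (an assumption of this sketch). Probability mass at a node with no
    such edge is dropped.
    """
    dist = {start: 1.0}
    for label in path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            targets = graph.get(node, {}).get(label, [])
            for target in targets:
                nxt[target] += prob / len(targets)
        dist = dict(nxt)
    return dist


def combined_score(graph, start, weighted_paths):
    """Weighted sum of the path experts' distributions.

    `weighted_paths` is a list of (label_sequence, weight) pairs,
    one weight per edge label sequence.
    """
    total = defaultdict(float)
    for path, weight in weighted_paths:
        for node, prob in path_walk(graph, start, path).items():
            total[node] += weight * prob
    return dict(total)


# Two path experts with hand-set (not learned) weights.
EXPERTS = [
    (("mentions", "mentioned_in"), 1.0),
    (("mentions", "mentioned_in", "cites"), 2.0),
]

scores = combined_score(GRAPH, "q", EXPERTS)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Each expert contributes the probability of reaching a node along its own label sequence, so a node reachable through several highly weighted paths ranks higher than one reachable through a single low-weight path.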

Keywords

Entity relation graph · Random walk · Learning to rank · Relational model · Filtering and recommending

Copyright information

© The Author(s) 2010

Authors and Affiliations

  1. Carnegie Mellon University, Pittsburgh, USA
