Skip to main content

Position-Aligned Translation Model for Citation Recommendation

  • Conference paper
Book cover String Processing and Information Retrieval (SPIRE 2012)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7608))

Included in the following conference series:

Abstract

The goal of a citation recommendation system is to suggest some references for a snippet in an article or a book, and this is very useful for both authors and the readers. The citation recommendation problem can be cast as an information retrieval problem, in which the query is the snippet from an article, and the relevant documents are the cited articles. In reality, the citation snippet and the cited articles may be described in different terms, and this makes the citation recommendation task difficult. Translation model is very useful in bridging the vocabulary gap between queries and documents in information retrieval. It can be trained on a collection of query and document pairs, which are assumed to be parallel. However, such training data contains much noise: a relevant document usually contains some relevant parts along with irrelevant ones. In particular, the citation snippet may only mention only some parts of the cited article’s content. To cope with this problem, in this paper, we propose a method to train translation models on such noisy data, called position-aligned translation model. This model tries to align the query to the most relevant parts of the document, so that the estimated translation probabilities could rely more on them. We test this model in a citation recommendation task for scientific papers. Our experiments show that the proposed method can significantly improve the previous retrieval methods based on translation models.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Xu, J., Croft, W.B.: Query expansion using local and global document analysis. In: SIGIR 1996, pp. 4–11 (1996)

    Google Scholar 

  2. Karimzadehgan, M., Zhai, C.: Estimation of statistical translation models based on mutual information for ad hoc information retrieval. In: Proceeding of SIGIR 2010, pp. 323–330 (2010)

    Google Scholar 

  3. Berger, A., Lafferty, J.: Information retrieval as statistical translation. In: Proceedings of SIGIR 1999, pp. 222–229 (1999)

    Google Scholar 

  4. Nie, J.Y., Simard, M., Isabelle, P., Durand, R.: Cross-language information retrieval based on parallel texts and automatic mining of parallel texts from the Web. In: Proceedings SIGIR 1999, pp. 74–81 (1999)

    Google Scholar 

  5. Lavrenko, V., Choquette, M., Croft, W.B.: Cross-lingual relevance models. In: Proceedings of SIGIR 2002, pp. 175–182 (2002)

    Google Scholar 

  6. Xue, X., Jeon, J., Croft, W.B.: Retrieval models for question and answer archives. In: Proceeding of SIGIR 2008, pp. 475–482 (2008)

    Google Scholar 

  7. Murdock, V., Croft, W.B.: A translation model for sentence retrieval. In: Proceedings of HLT 2005, pp. 684–691. Association for Computational Linguistics, Stroudsburg (2005)

    Chapter  Google Scholar 

  8. Gao, J., He, X., Nie, J.Y.: Clickthrough-based translation models for web search: from word models to phrase models. In: Proceedings of CIKM 2010, pp. 1139–1148 (2010)

    Google Scholar 

  9. Metzler, D., Bernstein, Y., Croft, W.B., Moffat, A., Zobel, J.: Similarity measures for tracking information flow. In: CIKM 2005, pp. 517–524 (2005)

    Google Scholar 

  10. Lu, Y., He, J., Shan, D., Yan, H.: Recommending citations with translation model. In: Proceedings of CIKM 2011, pp. 2017–2020 (2011)

    Google Scholar 

  11. Fung, P., Cheung, P.: Mining very-non-parallel corpora: Parallel sentence and lexicon extraction via bootstrapping and em. In: Proceedings of EMNLP 2004, pp. 57–63 (2004)

    Google Scholar 

  12. Zhao, B., Vogel, S.: Adaptive parallel sentences mining from web bilingual news collection. In: Proceedings of ICDM 2002, p. 745 (2002)

    Google Scholar 

  13. Lv, Y., Zhai, C.: Positional language models for information retrieval. In: Proceedings of SIGIR 2009, pp. 299–306 (2009)

    Google Scholar 

  14. Wang, M., Si, L.: Discriminative probabilistic models for passage based retrieval. In: Proceedings of SIGIR 2008, pp. 419–426. ACM, New York (2008)

    Chapter  Google Scholar 

  15. Hearst, M.A., Plaunt, C.: Subtopic structuring for full-length document access. In: Proceedings of SIGIR 2003, pp. 59–68 (1993)

    Google Scholar 

  16. Bestgen, Y.: Improving text segmentation using latent semantic analysis: A reanalysis of Choi, Wiemer-hastings, and Moore (2001); Comput. Linguist. 32, 5–12 (2006)

    Google Scholar 

  17. Misra, H., Yvon, F., Cappé, O., Jose, J.: Text segmentation: A topic modeling perspective. Inf. Process. Manage. 47, 528–544 (2011)

    Article  Google Scholar 

  18. Callan, J.P.: Passage-level evidence in document retrieval. In: Proceedings SIGIR 1994, pp. 302–310 (1994)

    Google Scholar 

  19. Zobel, J., Moffat, A., Wilkinson, R., Sacks-Davis, R.: Efficient retrieval of partial documents. Inf. Process. Manage. 31, 361–377 (1995)

    Article  Google Scholar 

  20. He, Q., Pei, J., Kifer, D., Mitra, P., Giles, L.: Context-aware citation recommendation. In: Proceedings of WWW 2010, pp. 421–430 (2010)

    Google Scholar 

  21. McNee, S.M., Albert, I., Cosley, D., Gopalkrishnan, P., Lam, S.K., Rashid, A.M., Konstan, J.A., Riedl, J.: On the recommending of citations for research papers. In: Proceedings of CSCW 2002, pp. 116–125 (2002)

    Google Scholar 

  22. Zhou, D., Zhu, S., Yu, K., Song, X., Tseng, B.L., Zha, H., Giles, C.L.: Learning multiple graphs for document recommendations. In: Proceeding of WWW 2008, pp. 141–150 (2008)

    Google Scholar 

  23. Nascimento, C., Laender, A.H., da Silva, A.S., Gonçalves, M.A.: A source independent framework for research paper recommendation. In: Proceedings of JCDL 2011, pp. 297–306 (2011)

    Google Scholar 

  24. Kodakateri Pudhiyaveetil, A., Gauch, S., Luong, H., Eno, J.: Conceptual recommender system for citeseerx. In: Proceedings of RecSys 2009, pp. 241–244 (2009)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

He, J., Nie, JY., Lu, Y., Zhao, W.X. (2012). Position-Aligned Translation Model for Citation Recommendation. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds) String Processing and Information Retrieval. SPIRE 2012. Lecture Notes in Computer Science, vol 7608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34109-0_27

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-34109-0_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34108-3

  • Online ISBN: 978-3-642-34109-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics