Identification of Navigation Lead Candidates Using Citation and Co-Citation Analysis

  • Robert Moro
  • Mate Vangel
  • Maria Bielikova
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9587)


Query refinement is an integral part of search, especially in exploratory search scenarios, which assume that users start with ill-defined information needs that change over time. To support exploratory search and navigation, we have proposed an approach to exploratory navigation in digital libraries using navigation leads. In this paper, we focus specifically on identifying navigation lead candidates using keyword extraction. For this purpose, we utilize citation sentences as well as co-citations. We hypothesize that they can improve the quality of the extracted keywords, both by finding new keywords that would not otherwise be discovered and by promoting important keywords through increased relevance. We have quantitatively evaluated our method in the domain of digital libraries using experts’ judgement of the relevance of the extracted keywords. Based on our results, we conclude that using citations and co-citations improves the extraction of the most relevant terms over the TF-IDF baseline.
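To make the idea of enriching keyword extraction with citation sentences more concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a plain TF-IDF scorer and simply appends citation sentences to the text of the cited paper before scoring. The toy corpus, the citation sentences, and the helper names `tokenize`, `tf_idf`, and `top_keywords` are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, split on whitespace, keep purely alphabetic tokens.
    return [t for t in text.lower().split() if t.isalpha()]


def tf_idf(docs):
    """Return one {term: score} dict per document (plain TF-IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks) or 1
        scores.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


def top_keywords(scores, k=3):
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]


# Hypothetical toy corpus (assumption, not data from the paper).
papers = [
    "we present a parser for reference strings",
    "we present a study of search behaviour",
    "we present a corpus of papers",
]
# Hypothetical citation sentences in which other papers cite papers[0].
citations_of_first = [
    "the parser uses conditional random fields",
    "a conditional random fields parser",
]

# Baseline: TF-IDF over the papers' own text only.
baseline = tf_idf(papers)

# Enriched: citation sentences are appended to the cited paper's text.
enriched_docs = [papers[0] + " " + " ".join(citations_of_first)] + papers[1:]
enriched = tf_idf(enriched_docs)

print(top_keywords(baseline[0]))
print(top_keywords(enriched[0]))
```

In this toy setting, the citation sentences contribute a term ("conditional") that the paper's own text never mentions and raise the rank of a term it already contains ("parser"), which mirrors the two effects hypothesized above. How the paper actually weights citation sentences and co-citations is not specified in this abstract.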


Navigation leads · Keyword extraction · Domain modeling · Citation analysis · Co-citations · Digital libraries



This work was partially supported by the Cultural and Educational Grant Agency of the Slovak Republic, grant No. KEGA 009STU-4/2014, the Scientific Grant Agency of the Slovak Republic, grant No. VG 1/0646/15, and by the Slovak Research and Development Agency under the contract No. APVV-0208-10.



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava, Bratislava, Slovakia
