Query-Oriented Keyphrase Extraction

  • Minghui Qiu
  • Yaliang Li
  • Jing Jiang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7675)

Abstract

People often issue informational queries to search engines to learn more about entities or events. While a Wikipedia-like summary would be an ideal answer to such queries, not all queries have a corresponding Wikipedia entry. In this work we study query-oriented keyphrase extraction, which can be used to assist search result summarization. We propose a general keyphrase extraction method for this task that considers both phraseness and informativeness. We discuss three criteria for phraseness and four ways to compute informativeness scores. Using a large Wikipedia corpus and 40 queries, our empirical evaluation shows that combining a named-entity-based phraseness criterion with a language-model-based informativeness score gives the best performance on our task. This method also outperforms two state-of-the-art baseline methods.
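
The two ingredients named in the abstract, a phraseness filter and a language-model-based informativeness score, can be illustrated with a minimal sketch. This is not the authors' implementation: the capitalization heuristic below is only a crude stand-in for a named-entity-based phraseness criterion, and the informativeness score is a pointwise KL-divergence between a smoothed foreground and background unigram model, chosen here as one plausible language-model-based score. All function names and the toy texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased alphabetic tokens; digits and punctuation are dropped.
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def candidate_phrases(text):
    # Assumed phraseness criterion: maximal runs of capitalized words.
    # A real system would use a named entity recognizer instead.
    pattern = r"\b[A-Z][a-z]+\b(?: [A-Z][a-z]+\b)*"
    return {m.group(0) for m in re.finditer(pattern, text)}


def unigram_lm(tokens, vocab, alpha=1.0):
    # Additively smoothed unigram language model over a shared vocabulary.
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def informativeness(phrase, fg, bg):
    # Pointwise KL contribution: p_fg(phrase) * log(p_fg(phrase) / p_bg(phrase)),
    # with phrase probability taken as the product of unigram probabilities.
    words = [w.lower() for w in phrase.split()]
    log_fg = sum(math.log(fg[w]) for w in words)
    log_bg = sum(math.log(bg[w]) for w in words)
    return math.exp(log_fg) * (log_fg - log_bg)


def rank_keyphrases(foreground_text, background_text):
    # Foreground: text retrieved for the query. Background: a general corpus.
    fg_tokens = tokenize(foreground_text)
    bg_tokens = tokenize(background_text)
    vocab = set(fg_tokens) | set(bg_tokens)
    fg = unigram_lm(fg_tokens, vocab)
    bg = unigram_lm(bg_tokens, vocab)
    scored = [(p, informativeness(p, fg, bg))
              for p in candidate_phrases(foreground_text)]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


foreground = ("Marina Bay Sands is a resort in Singapore. "
              "Marina Bay Sands opened in 2010. The resort faces Marina Bay.")
background = ("The city is large. The weather in the city is warm. "
              "The people like the weather.")
ranked = rank_keyphrases(foreground, background)
for phrase, score in ranked:
    print(f"{phrase}: {score:.4f}")
```

On this toy input, phrases that are frequent in the query-related text but rare in the background receive positive scores, while a common word such as "The" is pushed to the bottom with a negative score. Because phrase probability is a product of unigram probabilities, this particular score favors shorter phrases; a length normalization would be one way to counter that.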

Keywords

Keyphrase extraction, phraseness, informativeness, language model

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Minghui Qiu (1)
  • Yaliang Li (1)
  • Jing Jiang (1)
  1. School of Information Systems, Singapore Management University, Singapore
