Generating Semantic Aspects for Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11503)

Abstract

Large document collections can be hard to explore if the user expresses her information need in only a few keywords. The ambiguous intents behind such short queries often lead to long query sessions and many query reformulations. To alleviate this problem, we propose the novel concept of semantic aspects (e.g., \(\langle \{\textsf{michael-phelps}\}, \{\textsf{athens, beijing, london}\}, [2004, 2016] \rangle\) for an ambiguous query) and present the xFactor algorithm, which generates them from annotations in documents. Semantic aspects lift document contents into a meaningful structured representation, allowing the user to sift through many documents without reading their contents. The aspects are created by analyzing semantic annotations in documents: temporal, geographic, and named entity annotations. We evaluate our approach on a novel testbed of over 5,000 aspects on Web-scale document collections amounting to more than 450 million documents. Our results show that the xFactor algorithm finds relevant aspects for highly ambiguous queries.
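To make the shape of a semantic aspect concrete, the following is a minimal sketch of the triple from the abstract (a set of entities, a set of locations, and a time interval) together with a toy check of which annotated documents it covers. This is not the xFactor algorithm. The class names, the `matches` rule, and the three example documents are all hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class SemanticAspect:
    """Hypothetical container: entities, locations, and a year interval."""
    entities: FrozenSet[str]
    locations: FrozenSet[str]
    interval: Tuple[int, int]  # inclusive (begin_year, end_year)


@dataclass(frozen=True)
class AnnotatedDocument:
    """Hypothetical document reduced to its semantic annotations."""
    doc_id: str
    entities: FrozenSet[str]
    locations: FrozenSet[str]
    years: FrozenSet[int]


def matches(aspect: SemanticAspect, doc: AnnotatedDocument) -> bool:
    """Assumed matching rule: the document shares at least one entity and one
    location with the aspect and mentions a year inside the aspect's interval."""
    begin, end = aspect.interval
    return (
        bool(aspect.entities & doc.entities)
        and bool(aspect.locations & doc.locations)
        and any(begin <= year <= end for year in doc.years)
    )


# The example aspect from the abstract.
aspect = SemanticAspect(
    entities=frozenset({"michael-phelps"}),
    locations=frozenset({"athens", "beijing", "london"}),
    interval=(2004, 2016),
)

# Made-up documents, used only to exercise the matching rule.
docs = [
    AnnotatedDocument("d1", frozenset({"michael-phelps"}),
                      frozenset({"beijing"}), frozenset({2008})),
    AnnotatedDocument("d2", frozenset({"michael-phelps"}),
                      frozenset({"baltimore"}), frozenset({2008})),
    AnnotatedDocument("d3", frozenset({"maria-sharapova"}),
                      frozenset({"london"}), frozenset({2012})),
]

covered = [doc.doc_id for doc in docs if matches(aspect, doc)]
print(covered)  # ['d1']
```

Grouping documents under a structured triple like this is one way a user could skim a result set by aspect rather than by reading each document. How aspects are actually mined and ranked is described in the paper itself.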

Author information

Corresponding author

Correspondence to Dhruv Gupta.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Gupta, D., Berberich, K., Strötgen, J., Zeinalipour-Yazti, D. (2019). Generating Semantic Aspects for Queries. In: Hitzler, P., et al. The Semantic Web. ESWC 2019. Lecture Notes in Computer Science, vol 11503. Springer, Cham. https://doi.org/10.1007/978-3-030-21348-0_11

  • DOI: https://doi.org/10.1007/978-3-030-21348-0_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-21347-3

  • Online ISBN: 978-3-030-21348-0

  • eBook Packages: Computer Science, Computer Science (R0)
