An effective suggestion method for keyword search of databases


This paper solves the problem of providing high-quality suggestions for user keyword queries over databases. With the assumption that the returned suggestions are independent, existing query suggestion methods over databases score candidate suggestions individually and return the top-k best of them. However, the top-k suggestions have high redundancy with respect to the topics. To provide informative suggestions, the returned k suggestions are expected to be diverse, i.e., maximizing the relevance to the user query and the diversity with respect to topics that the user might be interested in simultaneously. In this paper, an objective function considering both factors is defined for evaluating a suggestion set. We show that maximizing the objective function is a submodular function maximization problem subject to n matroid constraints, which is an NP-hard problem. An greedy approximate algorithm with an approximation ratio O(\(\frac {1}{1+n}\)) is also proposed. Experimental results show that our suggestion outperforms other methods on providing relevant and diverse suggestions.

This is a preview of subscription content, access via your institution.

Figure 1
Figure 2
Figure 3
Figure 4


  1. 1.

    For the extreme case where a certain topic is highly supported by all candidate suggestions, we can simply discard this common topic, or set the budget as k for this topic.

  2. 2.

    Actually the suggestion set \(\mathcal {H}^{*}\) is a common basis in \(\bigcap _{i=1}^{n}\mathcal {I}_{i}\) since we require \(|\mathcal {H}^{*}|\) = k.


  1. 1.

    Agrawal, R., Gollapudi, S., Halverson, A., Ieong, S.: Diversifying Search Results. In: WSDM, pp 5–14 (2009)

  2. 2.

    Bhalotia, G., Hulgeri, A., Nakhe, C., Chakrabarti, S., Sudarshan, S.: Keyword Searching and Browsing in Databases Using Banks. In: ICDE, pp 431–440 (2002)

  3. 3.

    Boldi, P., Bonchi, F., Castillo, C., Vigna, S.: From Dango to Japanese Cakes: Query Reformulation Models and Patterns. In: Web Intelligence, pp 183–190 (2009)

  4. 4.

    Carbonell, J.G., Goldstein, J.: The use of Mmr, Diversity-Based Reranking for Reordering Documents and Producing Summaries. In: SIGIR, pp 335–336 (1998)

  5. 5.

    Carterette, B., Chandar, P.: Probabilistic Models of Ranking Novel Documents for Faceted Topic Retrieval. In: CIKM, pp 1287–1296 (2009)

  6. 6.

    Chen, H., Karger, D.R.: Less is More: Probabilistic Models for Retrieving Fewer Relevant Documents. In: SIGIR, pp 429–436 (2006)

  7. 7.

    Cucerzan, S., White, R.W.: Query Suggestion Based on User Landing Pages. In: SIGIR, pp 875–876 (2007)

  8. 8.

    Ding, B., Yu, J.X., Wang, S., Qin, L., Zhang, X., Lin, X.: Finding Top-K Min-Cost Connected Trees in Databases. In: ICDE, pp 836–845 (2007)

  9. 9.

    Fisher, M., Nemhauser, G., Wolsey, L.: An analysis of approximations for maximizing submodular set functions ii. Math. Prog. Study 8, 73–87 (1978)

    MathSciNet  Article  MATH  Google Scholar 

  10. 10.

    He, H., Wang, H., Yang, J., Yu, P.S.: Blinks: Ranked Keyword Searches on Graphs. In: SIGMOD Conference, pp 305–316 (2007)

  11. 11.

    Hoffart, J., Suchanek, F.M., Berberich, K., Weikum, G.: Yago2: a spatially and temporally enhanced knowledge base from wikipedia. Artif. Intell. 194, 28–61 (2013)

    MathSciNet  Article  MATH  Google Scholar 

  12. 12.

    Jones, R., Rey, B., Madani, O., Greiner, W.: Generating Query Substitutions. In: WWW, pp 387–396 (2006)

  13. 13.

    Kacholia, V., Pandit, S., Chakrabarti, S., Sudarshan, S., Desai, R., Karambelkar, H.: Bidirectional Expansion for Keyword Search on Graph Databases. In: VLDB, pp 505–516 (2005)

  14. 14.

    Kato, M.P., Sakai, T., Tanaka, K.: Structured Query Suggestion for Specialization and Parallel Movement: Effect on Search Behaviors. In: WWW, pp 389–398 (2012)

  15. 15.

    Lenat, D.B.: Cyc: a large-scale investment in knowledge infrastructure. Commun. ACM 38(11), 32–38 (1995)

    Google Scholar 

  16. 16.

    Li, G., Ooi, B.C., Feng, J., Wang, J., Zhou, L.: Ease: an Effective 3-In-1 Keyword Search Method for Unstructured, Semi-Structured and Structured Data. In: SIGMOD Conference, pp 903–914 (2008)

  17. 17.

    Lu, Y., Wang, W., Li, J., Liu, C.: Xclean: Providing Valid Spelling Suggestions for Xml Keyword Queries. In: ICDE, pp 661–672 (2011)

  18. 18.

    Pu, K.Q., Yu, X.: Keyword query cleaning. PVLDB 1(1), 909–920 (2008)

    Google Scholar 

  19. 19.

    Radlinski, F., Dumais, S.T.: Improving Personalized Web Search Using Result Diversification. In: SIGIR, pp 691–692 (2006)

  20. 20.

    Resnik, P.: Using Information Content to Evaluate Semantic Similarity in a Taxonomy. In: IJCAI, pp 448–453 (1995)

  21. 21.

    Zhai, C., Cohen, W.W., Lafferty, J.D.: Beyond Independent Relevance: Methods and Evaluation Metrics for Subtopic Retrieval. In: SIGIR, pp 10–17 (2003)

Download references

Author information



Corresponding author

Correspondence to Hai Huang.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Huang, H., Chen, Z., Liu, C. et al. An effective suggestion method for keyword search of databases. World Wide Web 20, 729–747 (2017).

Download citation


  • Query suggestion
  • Query reformulation and keyword recommendation