Book search using social information, user profiles and query expansion with Pseudo Relevance Feedback

Applied Intelligence

Abstract

Book search has gained enormous popularity worldwide. Users now routinely search for items and products online, and those who know little about a product turn to social information and user profiles. Social information falls into structured information (e.g., ratings and tags) and unstructured information (e.g., reviews and annotations). Consequently, how to offer the best recommendations or suggestions of items to end users has become a hot topic among researchers. Retrieving and recommending relevant documents to users is a key issue in many domains, e.g., songs, accessories, movies, and books. In this paper, taking social books as an example, we propose a novel Pseudo Relevance Feedback (PRF) framework for searching and retrieving relevant documents using social information and user profiles. In particular, we redesign a typical distribution-based term selection strategy and a transformation-based term selection strategy. Terms are selected and weighted so as to avoid the word mismatch problem and to improve the retrieval of relevant documents. Finally, we develop a search system in which a Learning-to-Rank technique adaptively combines the results obtained from the various PRF strategies with user profiles and social information. The proposed methodology is extensively evaluated on the INEX/CLEF Social Book Search (SBS) Track datasets to verify its effectiveness and robustness. Our method achieves the best performance (nDCG@10) on the SBS Track (Suggestion Task) datasets from all three years compared with other state-of-the-art methods.
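For readers unfamiliar with the reported metric, nDCG@10 can be sketched as follows. This is a minimal illustration and not the track's official evaluation code: it assumes graded relevance values used directly as gains with a log2 rank discount, and the function names `dcg_at_k` and `ndcg_at_k` and the parameter `judged_gains` are hypothetical. The official SBS evaluation may map relevance judgements to gains differently.

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k positions.

    Assumption: each gain is used as-is and discounted by log2(rank + 1),
    with ranks starting at 1.
    """
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains, judged_gains=None, k=10):
    """Normalised DCG at cutoff k.

    ranked_gains: relevance values of the returned books, in ranked order.
    judged_gains: relevance values of all judged books for the topic
                  (hypothetical parameter); if omitted, the ideal ranking
                  is built from ranked_gains only.
    """
    pool = ranked_gains if judged_gains is None else judged_gains
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant books judged for this topic
    return dcg_at_k(ranked_gains, k) / ideal


if __name__ == "__main__":
    # A run that places the only relevant book at rank 2.
    print(ndcg_at_k([0, 1, 0]))
```

A score of 1.0 means the top ten results are ordered as well as the judgements allow; placing relevant books lower in the list reduces the score through the rank discount.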

Notes

  1. SBS definition is proposed by organizers in this site: https://inex.mmci.uni-saarland.de/tracks/books

  2. INEX, Initiative for the Evaluation of XML Retrieval. http://inex.mmci.unisaarland.de/data/documentcollection.jsp

  3. http://social-book-search.humanities.uva.nl/data/suggestion

  4. https://www.lemurproject.org/indri.php

  5. INEX, Initiative for the Evaluation of XML Retrieval. http://inex.mmci.unisaarland.de/data/documentcollection.jsp

  6. https://social-book-search.humanities.uva.nl/data

  7. http://social-book-search.humanities.uva.nl/results15

  8. http://www.librarything.com/topic/1220

  9. http://www.librarything.com/topic/37917

  10. http://www.postgresql.org

  11. We use the learning-to-rank tool, RankLib, which is available at http://people.cs.umass.edu/vdang/ranklib.html

Author information

Corresponding author

Correspondence to Ritesh Kumar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kumar, R., Bhanodai, G. & Pamula, R. Book search using social information, user profiles and query expansion with Pseudo Relevance Feedback. Appl Intell 49, 2178–2200 (2019). https://doi.org/10.1007/s10489-018-1383-z

  • DOI: https://doi.org/10.1007/s10489-018-1383-z
