On the analysis and evaluation of information retrieval models for social book search

Published in Multimedia Tools and Applications

Abstract

Social Book Search (SBS) studies how the Social Web affects book retrieval. This impact is studied in two steps. In the first step, called the baseline run, a search index containing bibliographic descriptions (professional metadata) and user-generated content (social metadata) is searched against the queries, and the results are ranked with a retrieval model. In the second step, called re-ranking, the baseline results are re-ordered using social metadata to determine whether retrieval effectiveness improves. Such an improvement can only be justified, however, if the baseline run is made as strong as possible by accounting for the contributions of the query, the index, and the retrieval model. Although existing studies have explored the roles of query formulation and document representation in depth, only a few have considered the contribution of the retrieval model, and those experimented with just a handful of models. This article fills that gap. It identifies the best retrieval model in the SBS context by experimenting with twenty-five retrieval models, using the Terrier IR platform, on the Amazon/LibraryThing dataset, which provides topic sets, relevance judgments, and a book corpus of 2.8 million records. The findings show that these retrieval models behave differently as the query and document representations change. DirichletLM and InL2 are the best-performing models for the majority of retrieval runs. The previous best-performing SBS studies would have produced better results if they had tested multiple retrieval models when selecting their baseline runs. The findings confirm that the retrieval model plays a vital role in developing stronger baseline runs.
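The comparison of candidate baseline runs described above can be prototyped with standard tooling. The sketch below is purely illustrative and assumes PyTerrier (the Python interface to Terrier) rather than the authors' original Terrier setup; the index, topic, and qrels paths are hypothetical placeholders. It runs several weighting models as candidate baselines against the same topics and relevance judgments, which is the comparison this article performs at a much larger scale.

import pyterrier as pt

if not pt.started():
    pt.init()

# Assumed: a Terrier index already built from the Amazon/LibraryThing corpus.
index = pt.IndexFactory.of("./alt_index/data.properties")

# Topics and relevance judgments in standard TREC formats (illustrative paths).
topics = pt.io.read_topics("./sbs2016_topics.xml", format="trecxml")
qrels = pt.io.read_qrels("./sbs2016_qrels.txt")

# Candidate baseline runs with different weighting models; DirichletLM and
# InL2 are the two models the article reports as strongest overall.
models = ["BM25", "DirichletLM", "InL2", "PL2", "TF_IDF"]
runs = [pt.BatchRetrieve(index, wmodel=m, num_results=1000) for m in models]

# Evaluate every run against the same qrels to pick the strongest baseline.
print(pt.Experiment(runs, topics, qrels,
                    eval_metrics=["map", "ndcg_cut_10", "recip_rank"],
                    names=models))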


Notes

  1. See details on the INEX (Initiative for the Evaluation of XML Retrieval) and CLEF (Conference and Labs of the Evaluation Forum) Social Book Search Tracks/Labs, available at social-book-search.humanities.uva.nl/#/overview

  2. http://terrier.org/docs/current/javadoc/index.html

  3. http://social-book-search.humanities.uva.nl/#/data/suggestion

  4. http://terrier.org/issues/projects/TR/issues/TR-563?filter=allissues

  5. https://www.activestate.com/products/perl/downloads/

  6. https://groups.google.com/g/social-book-search

  7. http://social-book-search.humanities.uva.nl/data/scripts/deduplicate_simple.pl


Acknowledgments

We are thankful to Dr. Jaap Kamps from the University of Amsterdam and Dr. Marijn Koolen from the Royal Netherlands Academy of Arts and Sciences for providing us with the Amazon/LibraryThing dataset under the license agreement with the Initiative for the Evaluation of XML Retrieval (INEX). We also acknowledge their valuable discussions regarding the INEX/CLEF Social Book Search Tracks/Labs.

Author information


Corresponding author

Correspondence to Irfan Ullah.

Ethics declarations

Competing interests

The authors declare that they have no competing interests and no conflict of interest with other people or organizations that could inappropriately influence or bias the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1. Details of the Selected Retrieval Models

Table 4 shows the symbols used in Table 5, along with their descriptions and usage in the respective retrieval models. Table 5 lists the retrieval models implemented in the Terrier IR platform that were used in this study's experiments; the commonly cited forms of the two best-performing models are reproduced after the table captions below.

Table 4 Symbols used in Table 5, their explanations, and use in the respective retrieval models
Table 5 A theoretical comparison of retrieval models implemented in Terrier IR platform
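
For orientation, the two models that perform best in most runs can be written in their commonly cited forms. These are the standard formulations from the language-modeling literature (Zhai and Lafferty) and Amati's Divergence From Randomness framework, not a transcription of Table 5, so notation may differ slightly from Terrier's implementation. Here tf_{t,d} is the frequency of term t in document d, qtf_t its frequency in the query, |d| and \bar{l} the document and average document lengths, N the number of documents, n_t the document frequency of t, P(t|C) the collection language model, and \mu and c are free parameters.

% Dirichlet-smoothed query likelihood (DirichletLM):
\[
\mathrm{score}(q,d) \;=\; \sum_{t \in q} qtf_t \,\log \frac{tf_{t,d} + \mu\, P(t \mid C)}{|d| + \mu}
\]

% InL2: inverse document frequency (In) model with Laplace after-effect (L)
% and term-frequency normalisation 2:
\[
\mathrm{score}(q,d) \;=\; \sum_{t \in q} qtf_t \,\frac{tfn_{t,d}}{tfn_{t,d} + 1}\,
\log_2 \frac{N + 1}{n_t + 0.5},
\qquad
tfn_{t,d} = tf_{t,d}\,\log_2\!\left(1 + \frac{c\,\bar{l}}{|d|}\right)
\]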

Appendix 2. Details of the Baseline Runs

Tables 6 and 7 report the search performance (relevance scores) of the retrieval models against the 2016 topic fields and relevance judgements at 1000 and 5000 results per query, respectively; an illustrative snippet for re-running a model at both depths follows the table captions.

Table 6 Search performance (relevance score) of the retrieval models against the 2016 topic fields & relevance judgements (1000 results per query)
Table 7 Search performance (relevance score) of the retrieval models against the 2016 topic fields & relevance judgements (5000 results per query)
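
As an illustration of the two retrieval depths reported in Tables 6 and 7, the snippet below re-evaluates one weighting model at 1000 and 5000 results per query. It reuses the hypothetical PyTerrier setup sketched after the Abstract, with the same assumed index, topic, and qrels paths.

import pyterrier as pt

if not pt.started():
    pt.init()

index = pt.IndexFactory.of("./alt_index/data.properties")   # assumed index location
topics = pt.io.read_topics("./sbs2016_topics.xml", format="trecxml")
qrels = pt.io.read_qrels("./sbs2016_qrels.txt")

# Re-run the same weighting model at both retrieval depths and compare scores.
for depth in (1000, 5000):
    run = pt.BatchRetrieve(index, wmodel="DirichletLM", num_results=depth)
    print(pt.Experiment([run], topics, qrels,
                        eval_metrics=["map", "ndcg_cut_10"],
                        names=[f"DirichletLM@{depth}"]))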

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ullah, I., Khusro, S. On the analysis and evaluation of information retrieval models for social book search. Multimed Tools Appl 82, 6431–6478 (2023). https://doi.org/10.1007/s11042-022-13417-7

