Evaluation of News Search Engines Based On Information Retrieval Models

  • Original Research
Operations Research Forum

Abstract

News search engines are search services dedicated to users’ news intake. Given a relevant query, a news search engine returns a single results page of news articles aggregated from thousands of online news sources on the World Wide Web. The availability and widespread use of major news search engines such as Bing News, Google News and Newslookup call for an evaluation of their retrieval effectiveness. In this paper, three core retrieval models, namely the vector space model, Okapi BM25 and latent semantic indexing, are each used separately to evaluate the relevance effectiveness of news search engines. Further, a Monte Carlo cross-entropy based rank aggregation technique is used to obtain a more comprehensive relevance effectiveness evaluation by aggregating the three individual rankings. Experimental results indicate that Google News performs better than the other two search engines.
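As a rough illustration of one of the retrieval models named in the abstract, the following minimal Python sketch scores a few short texts against a query with Okapi BM25. It is not the authors' implementation: the whitespace tokenizer, the parameter values `k1 = 1.2` and `b = 0.75`, the smoothed IDF variant, and the sample texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; no stemming or stop-word removal)."""
    return text.lower().split()


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Return one Okapi BM25 score per document for the given query.

    k1 and b are commonly used default values, assumed here for illustration.
    Assumes at least one non-empty document.
    """
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical result snippets, invented for this sketch.
snippets = [
    "election results announced today",
    "football match ends in draw",
    "election campaign enters final week of election rallies",
]
scores = bm25_scores("election results", snippets)
ranking = sorted(range(len(snippets)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Under these assumptions, one could average such scores over the results each engine returns for a query to obtain one ranking of the engines per model. How the paper itself turns model scores into engine rankings is not stated in the abstract.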

Data Availability

The datasets supporting the conclusions of this article are included within the article.

Acknowledgements

The authors express their appreciation and gratitude to their respective institutions, namely Aligarh Muslim University, India, and Sultan Qaboos University, Sultanate of Oman, for providing the research facilities, technical support and research environment that enabled them to complete this research.

Author information

Corresponding author

Correspondence to Afaq Ahmad.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Research Developments in Theory and Algorithms of Numerical Analysis and Optimization: Proceedings from the Fifth Conference on Numerical Analysis and Optimization (NAOV 2020)

About this article

Cite this article

Bokhari, M.U., Adhami, M.K. & Ahmad, A. Evaluation of News Search Engines Based On Information Retrieval Models. SN Oper. Res. Forum 2, 41 (2021). https://doi.org/10.1007/s43069-021-00081-0

  • DOI: https://doi.org/10.1007/s43069-021-00081-0
