An approach of a quantum-inspired document ranking algorithm by using feature selection methodology

  • Original Research
  • Published in: International Journal of Information Technology

Abstract

Ranking is the main goal of an information retrieval (IR) system. Several methodologies have been adopted that integrate computing with advanced applied systems. However, traditional techniques have low stability, and machine learning techniques suffer from feature selection (FS) problems on exponentially growing data. This study hypothesizes that combining quadratic unconstrained binary optimization (QUBO) with quantum annealing (QA) enhances FS during the ranking process. QA is applied to reduce noisy and redundant data in state-of-the-art IR datasets. This study used the LETOR dataset, which comes in two versions: LETOR 3.0 (30,000 scientific documents) and LETOR 4.0 (more than 25 million documents). QA exhibits the requisite quantum behavior in the findings using a processing unit known as the quantum processing unit (QPU). The QPU successfully addresses FS problems written as QUBO problems. The FS problem was formulated as a QUBO using three quadratic models. A comparison was also made between the linear and QPU solvers employed in the ranking procedure. Quantum-based solutions outperform traditional methods in addressing ranking problems. Compared with the standard algorithms (LTR and LambdaMART), our proposed QA-based technique yields normalized discounted cumulative gain values of 0.39 and 0.80, respectively. QA can address real-world problems by offering practical solutions. QUBO and the QA process improved feature selection for the ranking process.
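To make the idea of writing feature selection as a QUBO concrete, the following is a minimal sketch under stated assumptions, not the formulation used in the article. The abstract does not specify its three quadratic models, so the correlation-based objective here (reward relevance to the label on the diagonal, penalize redundancy between feature pairs off the diagonal, and add a quadratic penalty that favors exactly `num_features` selected features) is an illustrative choice. The function names, the toy data, and the `penalty` value are all hypothetical, and an exhaustive classical search stands in for the QPU.

```python
import itertools
import numpy as np


def build_qubo(X, y, num_features, penalty=2.0):
    """Build an upper-triangular QUBO matrix Q for feature selection.

    Assumed (illustrative) objective, minimized over binary x:
      - diagonal:     -|corr(feature_i, y)|       rewards relevant features
      - off-diagonal: |corr(feature_i, feature_j)| penalizes redundant pairs
      - penalty * (sum(x) - num_features)^2       favors the target count
    """
    n = X.shape[1]
    corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    relevance = np.abs(corr[:n, n])
    redundancy = np.abs(corr[:n, :n])

    Q = np.triu(redundancy, k=1)           # pairwise redundancy terms
    Q[np.diag_indices(n)] = -relevance     # per-feature relevance terms

    # Expand penalty * (sum(x) - m)^2 using x_i^2 = x_i for binary x;
    # the constant term penalty * m^2 is dropped (it does not change the argmin).
    Q[np.diag_indices(n)] += penalty * (1 - 2 * num_features)
    Q[np.triu_indices(n, k=1)] += 2 * penalty
    return Q


def qubo_energy(Q, x):
    """Energy x^T Q x of a binary assignment x."""
    return float(x @ Q @ x)


def solve_brute_force(Q):
    """Classical stand-in for an annealer: try all 2^n assignments."""
    n = Q.shape[0]
    candidates = (np.array(bits) for bits in itertools.product([0, 1], repeat=n))
    return min(candidates, key=lambda x: qubo_energy(Q, x))


def make_toy_data(seed=0, samples=2000):
    """Hypothetical data: f0 and f2 are relevant, f1 is a noisy copy of f0,
    and f3 is pure noise."""
    rng = np.random.default_rng(seed)
    a, b, n1, n2 = rng.standard_normal((4, samples))
    X = np.column_stack([a, a + n1, b, n2])
    y = a + b
    return X, y


X, y = make_toy_data()
Q = build_qubo(X, y, num_features=2)
x = solve_brute_force(Q)
print("selected feature indices:", np.flatnonzero(x))
```

Exhaustive search is only feasible for a handful of features. The point of the QUBO form is that the same matrix `Q` could instead be handed to an annealing-based or other heuristic solver, which is the kind of comparison between solvers the abstract describes.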

Author information

Corresponding author

Correspondence to Rupam Bhagawati.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bhagawati, R., Subramanian, T. An approach of a quantum-inspired document ranking algorithm by using feature selection methodology. Int. j. inf. tecnol. 15, 4041–4053 (2023). https://doi.org/10.1007/s41870-023-01543-w

  • DOI: https://doi.org/10.1007/s41870-023-01543-w