Advances in information retrieval collection on the European conference on information retrieval 2023

Kamps, Jaap; Goeuriot, Lorraine; Crestani, Fabio

doi:10.1007/s10791-024-09442-9

Editorial
Open access
Published: 23 May 2024

Volume 27, article number 9 (2024)

Jaap Kamps¹,
Lorraine Goeuriot² &
Fabio Crestani³

Abstract

This paper introduces the Collection on ECIR 2023. The 45th European Conference on Information Retrieval (ECIR 2023) was held in Dublin, Ireland, during April 2–6, 2023. The conference was the largest ECIR ever, and brought together hundreds of researchers from Europe and abroad. The authors of a selection of papers shortlisted for the best paper awards were asked to submit expanded versions, which appear in this Discover Computing (formerly the Information Retrieval Journal) Collection on ECIR 2023. The first is an analytic paper on incorporating first-stage retrieval status values as input in neural cross-encoder re-rankers. The second presents new models and new data for a new task of temporal natural language inference. The third presents a weak supervision approach to video retrieval that overcomes the need for large-scale human-labeled training data. Together, these papers showcase the breadth and diversity of current research on information retrieval.

1 Introduction

The 45th European Conference on Information Retrieval (ECIR 2023) was held in Dublin, Ireland, during April 2–6, 2023. The conference was the largest ECIR ever, and brought together hundreds of researchers from Europe and abroad. For those who like numbers: first, ECIR received 489 submissions in total (including 228 full and 153 short papers), excluding further workshop submissions. Second, the technical program committee consisted of 624 reviewers in total (including 27 chairs, 124 SPC members, and 473 reviewers), with many serving on multiple tracks. Third, the proceedings necessitated a third volume, and contain a total of 175 papers in 2151 pages [10,11,12].

For this Collection on ECIR 2023, we asked the authors of selected ECIR 2023 full papers that were shortlisted for the best paper awards to submit an extended version of their paper. This led to three papers that are published in this Collection of Discover Computing. The extended papers contain at least 30% new content. Examples of extensions are enhancements that improve the techniques described in the ECIR 2023 paper; as well as tests on additional data-sets that reveal behaviors that differ from the originally published claims and that provide further insights into the methods being described. Among the papers in this Collection are extensions of two papers that received an award at ECIR 2023.

2 Papers in the collection

The editors of this Collection invited five of the full papers short-listed for the ECIR Best Paper Award to submit an expanded version of their paper for this Collection of Discover Computing on ECIR 2023. Three of these expanded ECIR papers were accepted for this Collection after peer review and further revisions.

2.1 First stage retrieval scores in cross-encoders

The first paper in this volume is based on Askari, Abolghasemi, Pasi, Kraaij, and Verberne [2]:

Askari A, Abolghasemi A, Pasi G, et al (2023) Injecting the BM25 score as text improves BERT-based re-rankers. In: [10], pp 66–83, doi: 10.1007/978-3-031-28244-7_5, URL https://doi.org/10.1007/978-3-031-28244-7_5.

This paper presents a novel approach for combining first-stage lexical retrieval models and Transformer-based re-rankers, by including the relevance score of the lexical model as a token in the middle of the input of the cross-encoder re-ranker. This idea was motivated by the finding that BERT models can capture numeric information. Evaluation on the MSMARCO Passage collection and the TREC DL collections shows that the proposed method significantly improves over all cross-encoder re-rankers as well as the common interpolation methods.
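As a rough illustration of the idea summarized above, the following minimal sketch places a lexical score between the query and the passage in the text fed to a cross-encoder, and also shows a linear interpolation of the kind used as a baseline. The function names, the min-max conversion to an integer, the scale of 100, and the literal `[CLS]`/`[SEP]` strings are assumptions made for this example; they are not taken from the paper, and a real tokenizer would normally add the special tokens itself.

```python
def minmax_to_int(score, lo, hi, scale=100):
    """Map a raw score to an integer in [0, scale] (assumed normalisation)."""
    if hi <= lo:
        return 0
    clipped = min(max(score, lo), hi)
    return round(scale * (clipped - lo) / (hi - lo))


def build_cross_encoder_input(query, passage, score_token):
    """Put the score token in the middle of the input, between query and passage."""
    return f"[CLS] {query} [SEP] {score_token} [SEP] {passage} [SEP]"


def interpolate(lexical, neural, alpha=0.5):
    """Linear interpolation of two scores, shown only as a point of comparison."""
    return alpha * lexical + (1 - alpha) * neural


# Hypothetical query, passage, and score range, used only for illustration.
token = minmax_to_int(15.0, lo=0.0, hi=20.0)
text = build_cross_encoder_input(
    "what is bm25", "BM25 is a ranking function.", token
)
print(text)
```

In this sketch the score becomes part of the text the re-ranker reads, whereas interpolation combines the two scores only after the re-ranker has produced its output.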

The extended version of the paper in the Collection is [3]:

Askari A, Abolghasemi A, Pasi G, et al (2024) Injecting the score of the first-stage retriever as text improves BERT-based re-rankers. Discover Computing 27(4). doi: 10.1007/s10791-024-09435-8, URL https://doi.org/10.1007/s10791-024-09435-8.

2.2 Temporal natural language inference

The second paper in this volume is based on Hosokawa, Jatowt, and Sugiyama [7]:

Hosokawa T, Jatowt A, Sugiyama K (2023) Temporal natural language inference: Evidence-based evaluation of temporal text validity. In: [10], pp 441–458, doi: 10.1007/978-3-031-28244-7_28, URL https://doi.org/10.1007/978-3-031-28244-7_28.

Their paper investigates Temporal Natural Language Inference, a task inspired by traditional natural language reasoning that aims to determine the temporal validity of text content. The authors first construct their own data-set for this task and train several machine learning models, and then propose an effective method for learning information from an external knowledge base that gives hints on temporal commonsense knowledge. Using the prepared data-set, they introduce a new machine learning model that incorporates the information from the knowledge base and demonstrate that this model outperforms state-of-the-art approaches in the proposed task.

The extended version of the paper in the Collection is [8]:

Hosokawa T, Jatowt A, Sugiyama K (2024) Text validity reassessment: Commonsense reasoning about information obsoleteness. Discover Computing 27(4). doi: 10.1007/s10791-024-09433-w, URL https://doi.org/10.1007/s10791-024-09433-w.

2.3 Weakly supervised video retrieval

The third paper in this volume is based on Madasu, Aflalo, Stan, Tseng, Bertasius, and Lal [14]:

Madasu A, Aflalo E, Stan GBM, et al (2023b) Improving video retrieval using multilingual knowledge transfer. In: [10], pp 669–684, doi: 10.1007/978-3-031-28244-7_42, URL https://doi.org/10.1007/978-3-031-28244-7_42.

This paper investigates video retrieval, which has seen tremendous progress with the development of vision-language models but still requires labeled data, which takes a huge manual effort to produce. The paper uses state-of-the-art machine translation models to construct pseudo ground-truth multilingual video-text pairs, and learns a multilingual video-text representation in a common embedding space based on pre-trained multilingual models. Experimental results demonstrate that this approach achieves state-of-the-art results for English video retrieval data-sets, and superior performance on a multilingual video retrieval benchmark.

Due to very expedient processing by the Journal, this paper was already published in a regular issue [13]:

Madasu A, Aflalo E, Stan GBM, et al (2023a) Mumur: Multilingual multimodal universal retrieval. Inf Retr J 26(1):5. doi: 10.1007/s10791-023-09422-5, URL https://doi.org/10.1007/s10791-023-09422-5.

This paper has been added now to the Collection on ECIR 2023 of Discover Computing.

3 IRJ papers presented at ECIR 2023

In addition to ECIR papers now becoming part of Discover Computing, three original IRJ 2022 papers were presented at ECIR 2023 in Dublin.

3.1 Impact of shallow pooling on evaluation

The first IRJ 2022 paper presented at ECIR 2023 was [1]. In contrast with traditional information retrieval test collections, such as those developed by TREC, the MS MARCO data-sets employ substantially more queries (thousands versus dozens) but with substantially fewer known relevant items per query (often just one). To understand the implications for the leader-board, the authors pooled the top document from available runs near the top of the passage ranking leader-board for over 500 queries. They employed crowd-sourced workers to make preference judgments over these pools and re-evaluated the runs. Their results support their concern that current MS MARCO data-sets may no longer be able to recognize genuine improvements in rankers, and suggest that shallow pooling and judging of additional higher ranked documents could be beneficial.
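The pooling step described above can be sketched in a few lines. This is a minimal illustration, assuming each run is stored as a mapping from query identifier to a ranked list of document identifiers; the data layout, the function name `shallow_pool`, and the toy runs are hypothetical and not taken from the paper.

```python
def shallow_pool(runs, depth=1):
    """Collect, per query, the union of the top-`depth` documents of all runs."""
    pools = {}
    for ranking_by_query in runs.values():
        for query_id, ranking in ranking_by_query.items():
            pools.setdefault(query_id, set()).update(ranking[:depth])
    return pools


# Two toy runs over two queries (illustrative identifiers only).
runs = {
    "run_a": {"q1": ["d3", "d1"], "q2": ["d7", "d2"]},
    "run_b": {"q1": ["d1", "d3"], "q2": ["d7", "d9"]},
}
pools = shallow_pool(runs, depth=1)
print(pools)
```

With `depth=1`, a query on which all runs agree yields a pool of one document, while disagreement between runs yields a larger pool for assessors to compare.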

3.2 Exact matching in pre-trained language models

The second IRJ 2022 paper presented at ECIR 2023 was [4]. Pre-trained language models such as BERT have been found to be effective soft matching models. However, alongside semantic matching, exact matching is still an essential signal for assessing the relevance of a document to an information-seeking query. The authors explored strategies for integrating exact matching signals using marker tokens to highlight exact term-matches between the query and the document. This simple marking approach significantly improves over the common vanilla baseline. Results show that traditional information retrieval cues such as exact matching are still valuable for large pre-trained contextualized models such as BERT.
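To make the marking idea concrete, here is a minimal sketch that wraps terms shared by the query and the document in marker tokens. The marker format `[eN] ... [/eN]`, the simple word-level matching, and the function name are assumptions for illustration only; the paper studies several marking strategies whose details are not reproduced here.

```python
import re


def mark_exact_matches(query, document):
    """Wrap terms occurring in both texts with numbered marker tokens."""
    # Give each distinct query term an id in order of first appearance.
    ids = {}
    for term in re.findall(r"\w+", query.lower()):
        ids.setdefault(term, len(ids) + 1)

    doc_terms = set(re.findall(r"\w+", document.lower()))
    shared = {term for term in ids if term in doc_terms}

    def mark(text):
        def wrap(match):
            term = match.group(0).lower()
            if term in shared:
                i = ids[term]
                return f"[e{i}] {match.group(0)} [/e{i}]"
            return match.group(0)

        return re.sub(r"\w+", wrap, text)

    return mark(query), mark(document)


marked_query, marked_doc = mark_exact_matches(
    "neural ranking", "Ranking with neural models"
)
print(marked_query)
print(marked_doc)
```

This sketch does no stemming or stop-word handling, so it marks only case-insensitive exact word matches.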

3.3 Implicit item relations in session-based recommendation

The third IRJ 2022 paper presented at ECIR 2023 was [5]. Previous work on session-based recommendation has considered sequences of items that users have interacted with sequentially, which fails to capture other relationships between items that go beyond the inspection order. They propose Star Graph Neural Networks with Highway Networks (SGNN-HN) for session-based recommendation. The proposed model applies a star graph neural network to model the complex transition relationship between items in an ongoing session. The results show that this can outperform the state-of-the-art models in terms of Recall and MRR for session-based recommendation.
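For readers less familiar with the two metrics mentioned above, the following minimal sketch computes Recall@K and MRR@K under the common session-based setting in which each test case has a single target item. The cutoff, the function name, and the toy data are assumptions for illustration; the paper's exact evaluation protocol is not reproduced here.

```python
def recall_and_mrr_at_k(ranked_lists, targets, k=20):
    """Return (Recall@k, MRR@k) for one target item per ranked list."""
    if not targets:
        return 0.0, 0.0
    hits = 0
    reciprocal_rank_sum = 0.0
    for ranking, target in zip(ranked_lists, targets):
        top_k = ranking[:k]
        if target in top_k:
            hits += 1
            # Ranks are 1-based; targets outside the top k contribute 0.
            reciprocal_rank_sum += 1.0 / (top_k.index(target) + 1)
    n = len(targets)
    return hits / n, reciprocal_rank_sum / n


# Two toy sessions: the first target is ranked second, the second is missing.
recall, mrr = recall_and_mrr_at_k(
    [["a", "b", "c"], ["x", "y", "z"]], ["b", "q"], k=3
)
print(recall, mrr)
```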

4 Other notable ECIR 2023 papers

4.1 Full papers

Three ECIR full papers received recognition at the conference. In fact, we have already discussed two of them, which are now included in expanded form as part of this Collection.

First, the ECIR 2023 Best Paper Award was presented to [7], on temporal natural language inference. We discussed this paper above in Sect. 2.2.

Second, the ECIR 2023 Best Student Paper Award was presented to [14], on weakly supervised video retrieval. We discussed this paper above in Sect. 2.3.

Third, an Honorable Mention for the Best Paper Award was presented to [9]. Their paper investigates Question Paraphrasing Identification (QPI), a task of determining whether a pair of interrogative sentences (i.e., questions) are paraphrases of each other. The paper proposes an intention-aware neural model for QPI. Question words (e.g., “when”) and blocks (e.g., “what time”) are extracted as features for revealing intentions, used to regulate pairwise question encoding explicitly and implicitly, within Conditional Variational AutoEncoder (CVAE) and multi-task VAE frameworks. This model outperforms the state-of-the-art QPI models on benchmark corpora QQP, LCQMC and BQ for both English and Chinese QPI tasks.

4.2 Short papers

In addition to the ECIR full papers, two ECIR short papers also received recognition.

First, the ECIR 2023 Best Short Paper Award was presented to [6]. Their paper investigates “Doc2Query,” the process of expanding the content of a document before indexing using a sequence-to-sequence model. However, these models are known to be prone to “hallucinating” content that is not present in the source text. This paper explores techniques for filtering out these harmful queries prior to indexing. The paper finds that using a relevance model to remove poor-quality queries can improve the retrieval effectiveness of Doc2Query.
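The filtering step described above can be sketched as follows. This is a minimal illustration in which a scoring function rates each generated query against its source document and only queries at or above a threshold are kept for indexing. The term-overlap scorer is a toy stand-in for a learned relevance model, and the threshold, function names, and example texts are all assumptions made for this example.

```python
def overlap_score(query, document):
    """Toy stand-in for a relevance model: share of query terms in the document."""
    query_terms = query.lower().split()
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return sum(term in doc_terms for term in query_terms) / len(query_terms)


def filter_generated_queries(document, queries, score_fn, threshold):
    """Keep only generated queries that the scorer rates at or above threshold."""
    return [q for q in queries if score_fn(q, document) >= threshold]


doc = "the eiffel tower is in paris"
generated = ["where is the eiffel tower", "eiffel tower height in meters"]
kept = filter_generated_queries(doc, generated, overlap_score, threshold=0.7)

# Only the surviving queries are appended to the document before indexing.
expanded_doc = doc + " " + " ".join(kept)
print(kept)
```

In practice the scorer would be a trained relevance model, and the cutoff would be tuned rather than fixed by hand.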

Second, the ECIR 2023 Best Student Short Paper Award was presented to [15]. Their paper investigates conversational search, which has evolved as a new information retrieval paradigm marking a shift from traditional search systems towards interactive dialogues with intelligent search agents. The authors conduct a laboratory study to investigate open-ended search behavior for navigation through unknown information landscapes. The paper identifies core dialogue acts and their interrelations that enable users to discover domain knowledge, and also derives design suggestions for conversational search systems.

5 Conclusion

This is the third Collection resulting from the memorandum of agreement between the BCS IRSG, which organizes ECIR, and the Springer Discover Computing (formerly the Information Retrieval Journal) editorial board. We are proud to see three excellent papers as a result of this cooperation. Many thanks go out to the anonymous reviewers who helped the authors to substantially improve their work.

In addition to ECIR papers now published in Discover Computing, there were three original IRJ papers that were presented at ECIR 2023 in Dublin. We feel that this cross-fertilization between Discover Computing and the European Conference on Information Retrieval is of mutual benefit, and helps advance research in information retrieval.

References

1. Arabzadeh N, Vtyurina A, Yan X, et al. Shallow pooling for sparse labels. Inf Retr J. 2022;25(4):365–85. https://doi.org/10.1007/s10791-022-09411-0.
2. Askari A, Abolghasemi A, Pasi G, et al. Injecting the BM25 score as text improves BERT-based re-rankers. Berlin: Springer; 2023. p. 66–83.
3. Askari A, Abolghasemi A, Pasi G, et al. Injecting the score of the first-stage retriever as text improves BERT-based re-rankers. Discov Comput. 2024. https://doi.org/10.1007/s10791-024-09435-8.
4. Boualili L, Moreno JG, Boughanem M. Highlighting exact matching via marking strategies for ad hoc document ranking with pretrained contextualized language models. Inf Retr J. 2022;25(4):414–60. https://doi.org/10.1007/s10791-022-09414-x.
5. Cai F, Pan Z, Song C, et al. Exploring latent connections in graph neural networks for session-based recommendation. Inf Retr J. 2022;25(3):329–63. https://doi.org/10.1007/s10791-022-09412-z.
6. Gospodinov M, MacAvaney S, Macdonald C. Doc2query-: when less is more. Berlin: Springer; 2023. p. 414–22.
7. Hosokawa T, Jatowt A, Sugiyama K. Temporal natural language inference: evidence-based evaluation of temporal text validity. Berlin: Springer; 2023. p. 441–58.
8. Hosokawa T, Jatowt A, Sugiyama K. Text validity reassessment: commonsense reasoning about information obsoleteness. Discov Comput. 2024. https://doi.org/10.1007/s10791-024-09433-w.
9. Jin Z, Hong Y, Peng R, et al. Intention-aware neural networks for question paraphrase identification. Berlin: Springer; 2023. p. 474–88.
10. Kamps J, Goeuriot L, Crestani F, et al (eds). Advances in Information Retrieval - 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part I. Lecture Notes in Computer Science, vol 13980. Springer; 2023. https://doi.org/10.1007/978-3-031-28244-7.
11. Kamps J, Goeuriot L, Crestani F, et al (eds). Advances in Information Retrieval - 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part II. Lecture Notes in Computer Science, vol 13981. Springer; 2023. https://doi.org/10.1007/978-3-031-28238-6.
12. Kamps J, Goeuriot L, Crestani F, et al (eds). Advances in Information Retrieval - 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part III. Lecture Notes in Computer Science, vol 13982. Springer; 2023. https://doi.org/10.1007/978-3-031-28241-6.
13. Madasu A, Aflalo E, Stan GBM, et al. Mumur: multilingual multimodal universal retrieval. Inf Retr J. 2023;26(1):5. https://doi.org/10.1007/s10791-023-09422-5.
14. Madasu A, Aflalo E, Stan GBM, et al. Improving video retrieval using multilingual knowledge transfer. Berlin: Springer; 2023. p. 669–84.
15. Schneider P, Afzal A, Vladika J, et al. Investigating conversational search behavior for domain exploration. Berlin: Springer; 2023. p. 608–16.

Acknowledgements

We thank the IRJ/DC editors in chief, in particular Vanessa Murdock for her help with this Collection and Leif Azzopardi for his help in selecting the IRJ papers presented at ECIR 2023. We are grateful to the ECIR 2023 Best Paper committee, in particular chair Suzan Verberne (University of Leiden) and members Christin Seifert (University of Duisburg-Essen), Martin Halvey (University of Strathclyde), and Carsten Eickhoff (University of Tübingen). Finally, we thank and acknowledge Springer for sponsoring the best-paper award handed out at ECIR 2023. Realizing the Collection in Discover Computing (formerly the Information Retrieval Journal) took an incredible amount of effort over a year, nothing short of a roller-coaster ride due to the changes in the journal. The transition of the journal to its new and broader scope, which still includes Information Retrieval as a core part, happened during the process of publishing this Collection, now as part of Discover Computing. The editors of the Collection want to express their sincere thanks to the incoming staff of Discover Computing. In particular, Associate Editor Ayesha Eduljee of Springer Nature, India helped realize this Collection. Commissioning Editor Daisy Guo was also instrumental in realizing this inaugural Collection of Discover Computing.

Author information

Authors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Jaap Kamps
Université Grenoble Alpes, Grenoble, France
Lorraine Goeuriot
Università della Svizzera Italiana, Lugano, Switzerland
Fabio Crestani

Corresponding author

Correspondence to Jaap Kamps.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kamps, J., Goeuriot, L. & Crestani, F. Advances in information retrieval collection on the European conference on information retrieval 2023. Discov Computing 27, 9 (2024). https://doi.org/10.1007/s10791-024-09442-9

Published: 23 May 2024
DOI: https://doi.org/10.1007/s10791-024-09442-9
