1 Introduction

The 45th European Conference on Information Retrieval (ECIR 2023) was held in Dublin, Ireland, during April 2–6, 2023. The conference was the largest ECIR ever, and brought together hundreds of researchers from Europe and abroad. For those who like numbers: first, ECIR received 489 submissions in total (including 228 full and 153 short papers), excluding further workshop submissions. Second, the technical program committee consisted of 624 reviewers in total (including 27 chairs, 124 SPC members, and 473 reviewers), with many serving on multiple tracks. Third, the proceedings necessitated a third volume, and contain a total of 175 papers in 2151 pages [10,11,12].

For this Collection on ECIR 2023, we asked the authors of selected ECIR 2023 full papers that were shortlisted for the best paper awards to submit an extended version of their paper. This led to three papers that are published in this Collection of Discover Computing. The extended papers contain at least 30% new content. Examples of extensions are enhancements that improve the techniques described in the ECIR 2023 paper, as well as tests on additional data-sets that reveal behaviors differing from the originally published claims and that provide further insights into the methods being described. Among the papers in this Collection are extensions of two papers that received an award at ECIR 2023.

2 Papers in the collection

The editors of this Collection invited five of the full papers short-listed for the ECIR Best Paper Award to submit an expanded version of their paper for this Collection of Discover Computing on ECIR 2023. Three of these expanded ECIR papers were accepted for this Collection after peer review and further revisions.

2.1 First stage retrieval scores in cross-encoders

The first paper in this volume is based on Askari, Abolghasemi, Pasi, Kraaij, and Verberne [2]:

This paper presents a novel approach for combining first-stage lexical retrieval models and Transformer-based re-rankers, by including the relevance score of the lexical model as a token in the middle of the input of the cross-encoder re-ranker. This idea was motivated by the finding that BERT models can capture numeric information. Evaluation on the MSMARCO Passage collection and the TREC DL collections shows that the proposed method significantly improves over all cross-encoder re-rankers as well as the common interpolation methods.
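As a rough illustration of this idea, the sketch below places a first-stage score, written as text, between the query and the passage in the input of a cross-encoder. The min-max scaling to an integer, the helper names, and the literal [CLS]/[SEP] layout are assumptions made for this example and are not taken from the paper.

```python
from typing import List


def normalize_scores(scores: List[float], scale: int = 100) -> List[int]:
    """Min-max normalize scores to integers in [0, scale] (an assumed scheme)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates scored the same; map them to the top of the range.
        return [scale for _ in scores]
    return [round((s - lo) / (hi - lo) * scale) for s in scores]


def build_input(query: str, score: int, passage: str) -> str:
    """Place the score as text between the query and the passage."""
    return f"[CLS] {query} [SEP] {score} [SEP] {passage} [SEP]"


# Hypothetical first-stage (e.g. BM25) scores for three candidate passages.
bm25_scores = [12.5, 7.5, 2.5]
passages = ["passage a", "passage b", "passage c"]
injected = [
    build_input("example query", s, p)
    for s, p in zip(normalize_scores(bm25_scores), passages)
]
```

In practice a tokenizer would add the special tokens itself; they are spelled out here only to make the position of the score visible.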

The extended version of the paper in the Collection is [3]:

  • Askari A, Abolghasemi A, Pasi G, et al (2024) Injecting the score of the first-stage retriever as text improves BERT-based re-rankers. Discover Computing 27(4). doi: 10.1007/s10791-024-09435-8, URL https://doi.org/10.1007/s10791-024-09435-8.

2.2 Temporal natural language inference

The second paper in this volume is based on Hosokawa, Jatowt, and Sugiyama [7]:

  • Hosokawa T, Jatowt A, Sugiyama K (2023) Temporal natural language inference: Evidence-based evaluation of temporal text validity. In: [10], pp 441–458, doi: 10.1007/978-3-031-28244-7_28, URL https://doi.org/10.1007/978-3-031-28244-7_28

Their paper investigates Temporal Natural Language Inference, which is inspired by traditional natural language reasoning and aims to determine the temporal validity of text content. The authors first construct their own data-set for this task and train several machine learning models, and then propose an effective method for learning information from an external knowledge base that gives hints on temporal commonsense knowledge. Using the prepared data-set, they introduce a new machine learning model that incorporates the information from the knowledge base and demonstrate that this model outperforms state-of-the-art approaches on the proposed task.

The extended version of the paper in the Collection is [8]:

  • Hosokawa T, Jatowt A, Sugiyama K (2024) Text validity reassessment: Commonsense reasoning about information obsoleteness. Discover Computing 27(4). doi: 10.1007/s10791-024-09433-w, URL https://doi.org/10.1007/s10791-024-09433-w.

2.3 Weakly supervised video retrieval

The third paper in this volume is based on Madasu, Aflalo, Stan, Tseng, Bertasius, and Lal [14]:

This paper investigates video retrieval, which has seen tremendous progress with the development of vision-language models but still requires labeled data, which takes a huge manual effort to produce. The paper uses state-of-the-art machine translation models to construct pseudo ground-truth multilingual video-text pairs, and learns a multilingual video-text representation in a common embedding space based on pre-trained multilingual models. Experimental results demonstrate that this approach achieves state-of-the-art results for English video retrieval data-sets, and superior performance on a multilingual video retrieval benchmark.

Due to very expedient processing by the Journal, this paper was already published in a regular issue [13].

This paper has been added now to the Collection on ECIR 2023 of Discover Computing.

3 IRJ papers presented at ECIR 2023

In addition to ECIR papers now becoming part of Discover Computing, three original IRJ 2022 papers were presented at ECIR 2023 in Dublin.

3.1 Impact of shallow pooling on evaluation

The first IRJ 2022 paper presented at ECIR 2023 was [1]. In contrast with traditional information retrieval test collections, such as those developed by TREC, the MS MARCO data-sets employ substantially more queries (thousands versus dozens) but with substantially fewer known relevant items per query (often just one). To understand the implications for the leader-board, the authors pooled the top document from available runs near the top of the passage ranking leader-board for over 500 queries. They employed crowd-sourced workers to make preference judgments over these pools and re-evaluated the runs. Their results support their concerns that current MS MARCO data-sets may no longer be able to recognize genuine improvements in rankers, and that shallow pooling and judging of additional higher ranked documents could be beneficial.

3.2 Exact matching in pre-trained language models

The second IRJ 2022 paper presented at ECIR 2023 was [4]. Pre-trained language models such as BERT were found to be effective soft matching models. However, aside from semantic matching, exact matching is still an essential signal for assessing the relevance of a document to an information-seeking query. The authors explored strategies for integrating exact matching signals using marker tokens to highlight exact term-matches between the query and the document. This simple marking approach significantly improves over the common vanilla baseline. Results show that traditional information retrieval cues such as exact matching are still valuable for large pre-trained contextualized models such as BERT.
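To make the marking idea concrete, the sketch below wraps every term that occurs in both the query and the document with marker tokens before the pair would be passed to a re-ranker. The marker strings "[e]" and "[/e]", the simple word tokenization, and the function name are assumptions for illustration only; the paper explores several marking strategies that are not reproduced here.

```python
import re
from typing import List, Tuple


def _terms(text: str) -> List[str]:
    """Lowercased word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"\w+", text.lower())


def mark_exact_matches(query: str, document: str,
                       open_tok: str = "[e]",
                       close_tok: str = "[/e]") -> Tuple[str, str]:
    """Wrap terms shared by query and document with marker tokens."""
    shared = set(_terms(query)) & set(_terms(document))

    def mark(text: str) -> str:
        def wrap(match: "re.Match[str]") -> str:
            word = match.group(0)
            # Keep the original casing; only shared terms get markers.
            if word.lower() in shared:
                return f"{open_tok} {word} {close_tok}"
            return word
        return re.sub(r"\w+", wrap, text)

    return mark(query), mark(document)


marked_query, marked_doc = mark_exact_matches(
    "bert ranking", "Ranking passages with BERT models"
)
```

A realistic variant would probably skip stop words so that only content terms are highlighted.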

3.3 Implicit item relations in session-based recommendation

The third IRJ 2022 paper presented at ECIR 2023 was [5]. Previous work on session-based recommendation has considered sequences of items that users have interacted with sequentially, which fails to capture other relationships between items that go beyond the inspection order. The authors propose Star Graph Neural Networks with Highway Networks (SGNN-HN) for session-based recommendation. The proposed model applies a star graph neural network to model the complex transition relationship between items in an ongoing session. The results show that this can outperform the state-of-the-art models in terms of Recall and MRR for session-based recommendation.
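For readers less familiar with these two metrics, the sketch below computes Recall@K and MRR@K under the assumption that each session has exactly one target item (the next item to be predicted). The function names and the toy data are hypothetical, and the cut-off used in the paper is not restated here.

```python
from typing import Sequence


def recall_at_k(ranked: Sequence[Sequence[str]],
                targets: Sequence[str], k: int) -> float:
    """Fraction of sessions whose target item appears in the top k."""
    hits = sum(1 for items, target in zip(ranked, targets)
               if target in items[:k])
    return hits / len(targets)


def mrr_at_k(ranked: Sequence[Sequence[str]],
             targets: Sequence[str], k: int) -> float:
    """Mean reciprocal rank of the target, counting 0 if it is not in the top k."""
    total = 0.0
    for items, target in zip(ranked, targets):
        top = list(items[:k])
        if target in top:
            total += 1.0 / (top.index(target) + 1)
    return total / len(targets)


# Two toy sessions: the first target is ranked second, the second is missed.
ranked_lists = [["a", "b", "c"], ["x", "y", "z"]]
target_items = ["b", "q"]
recall = recall_at_k(ranked_lists, target_items, k=2)
mrr = mrr_at_k(ranked_lists, target_items, k=2)
```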

4 Other notable ECIR 2023 papers

4.1 Full papers

Three ECIR full papers received recognition at the conference. We have already discussed two of them, which are now included in expanded form as part of this Collection.

First, the ECIR 2023 Best Paper Award was presented to [7], on temporal natural language inference. We discussed this paper above in Sect. 2.2.

Second, the ECIR 2023 Best Student Paper Award was presented to [14], on weakly supervised video retrieval. We discussed this paper above in Sect. 2.3.

Third, an Honorable Mention for the Best Paper Award was presented to [9]. Their paper investigates Question Paraphrasing Identification (QPI), a task of determining whether a pair of interrogative sentences (i.e., questions) are paraphrases of each other. The paper proposes an intention-aware neural model for QPI. Question words (e.g., “when”) and blocks (e.g., “what time”) are extracted as features for revealing intentions, used to regulate pairwise question encoding explicitly and implicitly, within Conditional Variational AutoEncoder (CVAE) and multi-task VAE frameworks. This model outperforms the state-of-the-art QPI models on benchmark corpora QQP, LCQMC and BQ for both English and Chinese QPI tasks.

4.2 Short papers

In addition to the ECIR full papers, two ECIR short papers also received recognition.

First, the ECIR 2023 Best Short Paper Award was presented to [6]. Their paper investigates “Doc2Query,” the process of expanding the content of a document before indexing using a sequence-to-sequence model. However, these models are known to be prone to “hallucinating” content that is not present in the source text. This paper explores techniques for filtering out these harmful queries prior to indexing. The paper finds that using a relevance model to remove poor-quality queries can improve the retrieval effectiveness of Doc2Query.
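The filtering step can be sketched as follows: each generated query is scored against its source document and only queries at or above a threshold are kept for indexing. The term-overlap scorer below is a toy stand-in chosen so the example runs without a model; it is not the relevance model used in the paper, and the function names and threshold are assumptions.

```python
from typing import Callable, List


def overlap_scorer(query: str, document: str) -> float:
    """Toy stand-in scorer: fraction of query terms found in the document."""
    query_terms = query.lower().split()
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return sum(term in doc_terms for term in query_terms) / len(query_terms)


def filter_queries(document: str, queries: List[str],
                   scorer: Callable[[str, str], float],
                   threshold: float) -> List[str]:
    """Keep only generated queries that score at or above the threshold."""
    return [q for q in queries if scorer(q, document) >= threshold]


doc = "dublin hosted the ecir conference in 2023"
generated = ["where was ecir hosted", "who won the world cup"]
kept = filter_queries(doc, generated, overlap_scorer, threshold=0.5)
# Only the kept queries would be appended to the document before indexing.
expanded_doc = doc + " " + " ".join(kept)
```

Swapping `overlap_scorer` for a learned relevance model leaves the rest of the sketch unchanged, which is why the scorer is passed in as a parameter.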

Second, the ECIR 2023 Best Student Short Paper Award was presented to [15]. Their paper investigates conversational search, which has evolved as a new information retrieval paradigm marking a shift from traditional search systems towards interactive dialogues with intelligent search agents. The authors conduct a laboratory study to investigate open-ended search behavior for navigation through unknown information landscapes. The paper identifies core dialogue acts and their interrelations that enable users to discover domain knowledge, and also derives design suggestions for conversational search systems.

5 Conclusion

This is the third Collection resulting from the memorandum of agreement between the BCS IRSG, which organizes ECIR, and the editorial board of Springer's Discover Computing (formerly the Information Retrieval Journal). We are proud to see three excellent papers as a result of this cooperation. Many thanks go out to the anonymous reviewers who helped the authors to substantially improve their work.

In addition to ECIR papers now published in Discover Computing, there were three original IRJ papers that were presented at ECIR 2023 in Dublin. We feel that this cross-fertilization between Discover Computing and the European Conference on Information Retrieval is of mutual benefit, and helps advance research in information retrieval.