1 Introduction

Pretrained Language Models (PLMs), such as BERT (Devlin et al., 2019), ELECTRA (Clark et al., 2020) and T5 (Raffel et al., 2020), have become the core components for building highly effective ranking models. The success of PLMs is largely owed to heavy pre-training on language modeling objectives on the one hand, and to learning deeply-contextualized representations of input sequences using the transformer architecture (Vaswani et al., 2017) on the other. Thanks to the fine-tuning strategy and the availability of large publicly-released training datasets, applying a PLM to document ranking is straightforward. Nogueira and Cho (2019) were the first to propose a simple application of BERT to text ranking by fine-tuning on the large public MS MARCO (Nguyen et al., 2016) dataset. In this work, BERT was deployed as a relevance classifier trained to estimate the probability that each document is “relevant” w.r.t. a given query.

Table 1 Extracts from top ranked passages by Vanilla BERT for the query: “causes of left ventricular hypertrophy”

Compared to the first wave of neural ranking models including DRMM (Guo et al., 2016), DUET (Mitra et al., 2017), and KNRM (Xiong et al., 2017), referred to as pre-BERT models, BERT and its variants do not appear to require any specialized neural architectural components to capture different aspects of relevance between a query and a document (Lin et al., 2020). The same architecture based on homogeneous transformer layers is employed regardless of the downstream task. Qiao et al. (2019) studied the behaviour of BERT for ranking and revealed that it focuses more on document terms that directly match the query. Compared to pre-BERT models such as Conv-KNRM (Dai et al., 2018) that prefer terms related to the query, BERT’s pretraining on surrounding contexts favors text sequence pairs that are closer in their semantic meaning (Qiao et al., 2019). Qiao et al. (2019) conclude that BERT can be considered an interaction-based sequence-to-sequence soft matching model that owes its effectiveness to the transformer’s cross-match attention. While soft semantic matching is, undeniably, a valuable signal for relevance that alleviates the vocabulary mismatch problem, a ranking model needs proper handling of exact matching cues as well (Guo et al., 2016; Mitra et al., 2017; Luan et al., 2020). Let us take, as an example, the query “Causes of left ventricular hypertrophy” from the MS MARCO passage ranking task. Table 1 reports extracts from the top passages retrieved by BERT. We can see that all top-ranked passages are related to “right ventricular hypertrophy” due to the soft matching between “left” and “right”. This example is a reminder of the importance of exact matching for relevance ranking. Boualili et al. (2020) suggest that a PLM like BERT can benefit from explicit exact matching signals for passage ranking. The authors propose MarkedBERT, a model that uses marker tokens to convey exact matches between the query and document terms in the input sequence. Special tokens, i.e., \([e_i]\) and \([/e_i]\), were added to the textual input sequence of BERT to indicate the start and the end, respectively, of terms that match exactly with the i-th term of the query. For example:

Query: Causes of \([e_2]\)left\([/e_2]\) \([e_3]\)ventricular\([/e_3]\) \([e_4]\)hypertrophy\([/e_4]\)

Passage: \([e_2]\)Left\([/e_2]\) \([e_3]\)ventricular\([/e_3]\) \([e_4]\)hypertrophy\([/e_4]\) can occur...

Exact term-matching integration via marking has proven to induce significant gains on the MS MARCO passage ranking task over “Vanilla” BERT (monoBERT) (Nogueira and Cho, 2019). Analysis of the attention shows that marker tokens bring more focus to the exact matches, allowing more relevant documents to be ranked higher. Table 2 shows extracts from the top-ranked passages returned by MarkedBERT for the query “Causes of left ventricular hypertrophy”, where more passages are related to “left ventricular hypertrophy” without introducing an explicit bias, since passage 47203, ranked first by BERT, is still ranked high (second) by MarkedBERT.

Table 2 Extracts from top ranked passages by MarkedBERT for the query: “causes of left ventricular hypertrophy”

In this work, we follow the same hypothesis, namely that exact matching cues can enhance PLMs, and extend the previously proposed marking-based approach to ad hoc document ranking. We introduce new simple marking strategies to identify which aspects of exact match marking matter for ad hoc document ranking: Does the model require marking both the query and document segments, or is marking the document enough? Does the model require query-term identification in the marker, or is using the same marker for all query terms enough? And which combination works best? We conduct extensive experiments to determine the contribution of exact match marking for the most widely used PLM, BERT, and for the more recent and effective ELECTRA model on standard ad hoc benchmarks. We empirically demonstrate the effectiveness of explicit exact match marking across different experimental scenarios including in-domain, zero-shot transfer and multi-phase fine-tuning settings. Since our approach aims at injecting an established traditional IR cue into recent pretrained transformers, we also study the effectiveness of our models when interpolating the traditional BM25 scores. We find that the best match scores obtained by BM25 are still valuable since they contribute to the end-to-end effectiveness. Furthermore, the marking-based models require less intervention from BM25 scores to achieve better ranking performance than the vanilla baseline.

Our main contributions can be summarized as follows:

  • We present, to our knowledge, the first work investigating the impact of exact match integration into BERT for long document ranking.

  • We extend the idea of exact match marking by introducing a new simple and unique marker token for highlighting all the exact term-matches without distinction, and by exploring two marking levels: document and pair marking.

  • We conduct extensive experiments to evaluate the effectiveness of our proposed marking strategies on in-domain data using the MS MARCO document ranking benchmark, and zero-shot generalizability to out-of-domain data using the standard TREC ad hoc Robust04 and GOV2 benchmarks.

  • We investigate the impact of short keyword queries vs. long natural language descriptions and propose a hybrid pipeline taking advantage of both the retriever and ranker strengths.

  • We study the contribution of exact match scores from a bag-of-words model to the out-of-domain effectiveness of our models.

  • We study the contribution of multi-phase fine-tuning with additional in-domain fine-tuning to the out-of-domain performance.

  • We evaluate the robustness of our approach by considering different PLMs: BERT and ELECTRA.

  • We compare our best configurations with diverse state-of-the-art approaches.

  • We publish our source code as well as our ready-to-use checkpoints at: https://github.com/BOUALILILila/ExactMatchMarking

2 Background and related work

In this paper, we focus on ad hoc document retrieval (also referred to as document ranking) over corpora comprising either news articles or web pages. Following the standard formulation: given a corpus of documents C, potentially large, the task of a ranking system is to produce a ranked list of k documents from the corpus in response to a user’s information need expressed as a query q.

2.1 Exact matching in pre-BERT models

Deep Learning approaches have steadily grown in popularity since their introduction in IR over a decade ago. Even though Learning to Rank reached its zenith early in the 2010s (Liu, 2009; Li, 2011), its use of discrete hand-crafted features, numbering in the hundreds or more, was a major limitation. The promise of Deep Learning models was precisely to obviate the need for such costly manually-engineered features by relying on neural networks and continuous vector representations. Soon, numerous neural ranking models emerged, such as DRMM (Guo et al., 2016), DUET (Mitra et al., 2017), KNRM (Xiong et al., 2017) and Conv-KNRM (Dai et al., 2018). We do not have sufficient space to thoroughly review early neural ranking models and therefore refer the reader to existing overviews (Mitra et al., 2018; Onal et al., 2018). Aside from the models that were specifically designed for document ranking, models from the NLP community built for semantic similarity share some architectural similarities, and there has been cross-fertilization between NLP and IR (Lin et al., 2020). This interaction led IR researchers to realise that relevance matching and semantic matching (e.g., sentence similarity) are different tasks (Guo et al., 2016). While the former requires proper handling of exact matching signals, the latter requires accurately capturing semantics. Thus, neural ranking models required new architecture designs to handle both semantic and exact matching signals. Mitra et al. (2017) proposed a duet architecture composed of two deep neural networks: a local model that captures exact matching signals and a distributed model for semantic matching. Despite the reported successes of these neural models, there has recently been some skepticism about whether these successes, in the absence of large amounts of data, are merely inflated by comparison to weak baselines. The study conducted over 100 papers by Yang et al. (2019) on the Robust04 dataset showed that most models failed against a strong non-neural baseline (RM3 (Lavrenko and Croft, 2001)).

2.2 PLMs for multi-stage reranking

Recently, the inception of the transformer architecture (Vaswani et al., 2017) instigated a new wave of approaches (Nogueira and Cho, 2019; MacAvaney et al., 2019; Akkalyoncu Yilmaz et al., 2019) that, at last, were able to significantly outperform well-tuned traditional IR baselines such as RM3 (Lavrenko and Croft, 2001). Nogueira and Cho (2019) describe the first successful application of BERT (Devlin et al., 2019), known as monoBERT, to passage reranking, where the ranking task is modeled as a binary classification problem over individual candidate passages. This work marks the beginning of the “BERT revolution”. The results of the TREC Deep Learning Track 2019 (Craswell et al., 2020) clearly demonstrated the effectiveness of BERT-based models and revealed a significant gap with pre-BERT models. Regardless of its effectiveness, BERT has a key limitation for document ranking: it cannot handle input sequences longer than 512 tokens. In order to address this challenge, Yang et al. (2019) apply inference on sentences individually, and then interpolate the original document score (obtained by a traditional ranker) with the weighted top n sentence scores to rerank the documents. Following the same strategy, Birch (Akkalyoncu Yilmaz et al., 2019) reports state-of-the-art effectiveness on the TREC newswire test collections Robust04, Core17 and Core18 using monoBERT fine-tuned exclusively on out-of-domain passage-level datasets (TREC Microblog, MS MARCO and TREC CAR). Their experiments demonstrate that relevance models can be transferred across different domains, which circumvents the lack of passage-level relevance annotations in the target domain. Similarly, Dai and Callan (2019) use passage-level evidence to fine-tune BERT by considering all passages from a relevant document as relevant. For inference, the document is split into overlapping passages and each passage is scored individually. Document scores based either on the score of the first passage, the best passage, or the sum of all passage scores were investigated, and the simple best-passage score was found to be the best approach (BERT-MaxP). This was the first work to highlight BERT’s capacity to exploit linguistically rich descriptions, as opposed to previous keyword search techniques. MacAvaney et al. (2019) propose a new approach (CEDR) that incorporates BERT’s classification token [CLS], which encodes a representation of the full input, into existing pre-BERT neural IR models. The authors show that this joint approach outperforms a vanilla BERT ranker. Instead of aggregating the scores of individual passages as in Birch and BERT-MaxP, Parade (Li et al., 2020) aggregates the passage representations. This yields an end-to-end differentiable model like CEDR but without the use of pre-BERT models. In order to obtain the document representation, several aggregation methods were investigated, and using a small stack of transformer encoders was found to be the best method. Arguing that exact matching is a valuable cue for ranking, Boualili et al. (2020) propose a new adaptation of monoBERT, entitled MarkedBERT, that uses a marking technique to highlight exact match signals in the input sequence. The authors demonstrate the effectiveness of MarkedBERT on the MS MARCO passage ranking task and confirm, through attention analysis, that marker tokens bring focus to exact matching terms.
Beyond BERT, Nogueira et al. (2020) report new state-of-the-art effectiveness on Robust04 using a novel adaptation of the pretrained sequence-to-sequence model T5 (Raffel et al., 2020) to the document ranking task. This generation-based approach proved to be more effective than BERT in the data-poor regime where training data is limited.

2.3 PLMs for sparse and dense retrieval

The commonly adopted monoBERT approach feeds the concatenated query-document text through BERT and uses BERT’s [CLS] output token to produce a relevance score. These PLM rerankers compute full cross-attention between contextualized token representations, and are thus referred to as cross-encoders. However, their cross-attention operations are too expensive for full-collection retrieval. To overcome this challenge, a line of work resorted to augmenting lexical retrieval with PLMs. Nogueira et al. (2019) propose DocT5Query, a document expansion technique for reducing the vocabulary gap between queries and documents. The idea is to train a sequence-to-sequence model (T5 (Raffel et al., 2020)) that, given a document from a corpus, produces queries for which that document might be relevant. Dai and Callan (2019) propose a different framework, DeepCT, for estimating a term’s context-specific importance based on contextual embeddings from BERT. These term importance weights are then mapped to integers so that they can be directly interpreted as term frequencies, replacing term frequencies in a standard bag-of-words inverted index.

Another line of research proposes bi-encoders as an alternative, trading off the higher effectiveness of cross-encoders for improved efficiency by encoding the query and document separately. Single-vector systems encode each query and each document into a single dense vector, and relevance is modeled as a simple measure of vector similarity (Reimers and Gurevych, 2019; Karpukhin et al., 2020; Xiong et al., 2021). MacAvaney et al. (2020) propose PreTTR, a hybrid between bi- and cross-encoders obtained by eliminating cross-attention in some layers of a cross-encoder model. Luan et al. (2020) raise the limited capacity of single-vector representations to support retrieval of long documents and propose ME-BERT, which encodes documents into a set of vectors. Similarly, the poly-encoder (Humeau et al., 2020) encodes queries into a set of vectors. Following the same paradigm, ColBERT (Khattab and Zaharia, 2020) represents both queries and documents with token-level vectors and estimates relevance using a late-interaction mechanism capturing rich interactions between the two sets of vectors. However, encoding documents with all their tokens imposes an order-of-magnitude larger index than all previous models.

For an exhaustive review of all research lines using BERT-like models we refer readers to this recent survey (Lin et al., 2020).

2.4 Understanding BERT’s success

In light of the improvements brought by BERT to a wide range of IR tasks, many researchers have investigated the reasons behind such substantial improvements. Padigela et al. (2019) empirically study a set of hypotheses and show that BM25 is biased towards high query term frequency, which hurts its performance, while BERT retrieves passages with more novel words. However, they find that BERT fails at capturing the query context for long queries. Dai and Callan (2019) demonstrate that, unlike traditional IR models, BERT takes advantage of stop words and punctuation thanks to its capacity to model language structure. Qiao et al. (2019) show that BERT is an interaction-based model (Guo et al., 2016) whose advantage lies in the cross query-document attentions; discarding these cross-sequence interactions leads to performance close to random. They also find that BERT assigns extreme matching scores to query-document pairs, with most pairs receiving a ranking score of either one or zero, showing it is well tuned by pre-training on large corpora. Câmara and Hauff (2020) analyze BERT using diagnostic datasets built from retrieval heuristics (Rennings et al., 2019). Their experiments show that BERT does not fulfil most retrieval heuristics created by IR experts, and they argue that these axioms are not suitable for understanding BERT’s performance. MacAvaney et al. (2020) introduce ABNIRML, a new framework for analysing the behavior of neural IR models. The authors find that neural ranking models have fundamentally different characteristics from prior ranking models, such as a high sensitivity to word order and increasing relevance scores when non-relevant content is added to the document.

Our work falls into the category of cross-encoders, and this paper represents, to the best of our knowledge, the first work detailing a general approach for highlighting exact matching signals to enhance contextualized pretrained language models such as BERT, together with an exhaustive set of experiments on long-document ranking benchmarks. Using a marking technique to emphasize exact term matches in the query-document pair was first proposed in our own previous work, MarkedBERT (Boualili et al., 2020), which represents our initial study. That study was limited to one marking technique on a passage ranking task with a weak training regime, raising the question of its full potential (Lin et al., 2020). Aside from MarkedBERT, marking techniques were mentioned in the descriptions of our TREC-COVID challenge (Voorhees et al., 2021) submissions. The present work is a generalisation of the approach followed in MarkedBERT, in which we give a complete description of our ideas and a comprehensive evaluation on in- and out-of-domain TREC ad hoc benchmarks with a better training regime, making it directly comparable to state-of-the-art models.

3 Augmenting pretrained contextualized language models with exact match signals

In this section, we first describe the general architecture we adopt in this work and then present the marking strategies we propose to explicitly highlight exact term matches in the query-document pairs before feeding them to the BERT model. We consider the traditional formulation of exact matching where two terms \(t_1\) and \(t_2\) match exactly if their stems are identical. We use the Porter algorithm for stemming, and stop words are not considered during marking. By adding explicit indications of exact matching signals to the textual inputs, the models can benefit from this traditional hint and adapt better to the ad hoc task.
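The following is a minimal sketch of this exact-match test: two terms match exactly if their Porter stems are identical, and stop words are never marked. NLTK's PorterStemmer and the small stop-word set are assumed stand-ins for illustration, not the exact resources used in our implementation.

```python
# Sketch of the exact-match definition used throughout this section.
from nltk.stem import PorterStemmer

_stem = PorterStemmer().stem
STOP_WORDS = {"of", "the", "a", "an", "in", "to", "and", "or"}  # illustrative subset

def exact_match(t1: str, t2: str) -> bool:
    """True if t1 and t2 have identical Porter stems (case-insensitive)."""
    return _stem(t1.lower()) == _stem(t2.lower())
```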

Fig. 1 BERT sentence-pair classification architecture (Devlin et al., 2019) used in vanilla BERT/monoBERT (Nogueira and Cho, 2019)

3.1 Model architecture

We adopt the model configuration described by Nogueira and Cho (2019) referred to as monoBERT or vanilla BERT. In this configuration, BERT is applied as a binary relevance classifier for text ranking. The architecture of the model is shown in Fig. 1. Using the same notation as Devlin et al. (2019), the query q is fed as Segment A and the candidate document d as Segment B. The special token [CLS] is prepended to the input sequence, and the special delimiter token [SEP] is placed at the beginning and end of the document segment to build the input sequence S as follows:

$$\begin{aligned} S= [[CLS],Q,[SEP],D,[SEP]] \end{aligned}$$
(1)

where Q and D represent the sequences of tokens obtained after applying the WordPiece tokenizer to the query q and document d texts, respectively.

Once the sequence S is passed through BERT, the final vector representation C of the standard classification token [CLS], which captures the interaction between the query and the document, is used as input to a single-layer neural network that estimates a score R(d, q) quantifying how relevant the candidate document d is to the query q. That is:

$$\begin{aligned} R(d,q) = P(Relevant=1|q,d) \end{aligned}$$
(2)

The details of the fine-tuning and inference process are given in Sect. 4.
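As an illustration of Eqs. (1) and (2), the sketch below shows a monoBERT-style relevance scorer using the Hugging Face Transformers sequence-classification head as a stand-in for the single-layer network over the [CLS] vector; the checkpoint name and truncation policy are assumptions for illustration, not our exact setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def relevance_score(query: str, document: str) -> float:
    """Build S = [[CLS], Q, [SEP], D, [SEP]] and return P(Relevant = 1 | q, d)."""
    inputs = tokenizer(query, document, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()
```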

Table 3 Example of the proposed marking strategies applied to the query Q: “causes of left ventricular hypertrophy”, and the document D: “Left ventricular hypertrophy can occur when some factor ...”

3.2 Exact match marking

We propose different marking strategies that intervene only at the textual input level to augment the input sequence S defined in Eq. (1). Instead of altering the model’s architecture to integrate the desired traditional signal, we prefer to let the model learn how to use the given hints and avoid the risk of introducing a systematic bias towards exact term matching. A marking strategy is defined by two parameters:

  1. Marker-token type: we introduce two types of marker tokens, namely Simple Markers and Precise Markers.

  2. Marking level: we investigate two levels of marking: Document Marking and Pair Marking.

Table 3 illustrates the four marking strategies that can be defined by combining the two marker-token types with the two marking levels. Note that the Pre-Pair marking strategy corresponds to the strategy used in the MarkedBERT model (Boualili et al., 2020).

3.2.1 Marker-token type

We consider two types of marker tokens in order to investigate whether distinguishing query terms is important to model performance.

Simple Markers. Uses a single, unique marker (#) for all query terms, without explicit distinction. Considering a query \(Q= \{q_1,\dots , q_{|Q |}\}\) whose terms \(q_n\) and \(q_m\), with \(1< n< m <|Q |\), occur in the document and thus have to be marked, we obtain the marked query segment \(\tilde{Q}\) as follows:

$$\begin{aligned} \tilde{Q} = \{q_1,\dots ,\# q_n \#,\dots ,\# q_m \#,\dots ,q_{|Q |}\} \end{aligned}$$
(3)

Precise Markers. Uses newly introduced tokens \([e_k]\) and \([/e_k]\), where \(k \in \{1,\dots , |Q |\}\) identifies the query term, to mark the start and the end of each matched term, respectively. This marking technique associates each unique query term \(q_k\) with a unique pair of marker tokens \([e_k]\) and \([/e_k]\) that identifies it and its occurrences. If a term is repeated in the query, all its occurrences are highlighted using the same identifier, i.e., that of its first occurrence. For example, the query Q described in the previous paragraph would be marked with precise markers as follows:

$$\begin{aligned} \tilde{Q} = \{q_1,\dots ,[e_n] q_n [/e_n],\dots ,[e_m] q_m [/e_m],\dots ,q_{|Q |}\} \end{aligned}$$
(4)
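Since the precise markers \([e_k]\) and \([/e_k]\) are unknown to BERT’s vocabulary, they must be registered as new tokens. The sketch below shows one common way to do so with the Hugging Face tokenizer (an assumed implementation detail, not necessarily our exact code); the new embeddings are then learned during fine-tuning.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Register precise markers for queries of up to 20 terms (illustrative bound).
markers = [f"[e{k}]" for k in range(1, 21)] + [f"[/e{k}]" for k in range(1, 21)]
tokenizer.add_tokens(markers, special_tokens=True)
model.resize_token_embeddings(len(tokenizer))  # embeddings for the new tokens are trained
```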

3.2.2 Marking level

In order to better understand whether it is relevant to mark both the query and the document segments or the document segment only, we investigate two marking levels: Document and Pair marking. In the former, the occurrences of query terms in the document are marked in the document segment only, while in the latter, the exact matching terms are marked in both the document and query segments, as shown in Table 3. We use the same notation defined in the model’s architecture, where Q refers to the query segment and D refers to the document segment that constitute the input sequence S.

Document marking. Only the document segment D is augmented with marker tokens indicating the start and the end of each query-term occurrence in the document. Considering a query \(Q= \{q_1,\dots , q_{|Q |}\}\) and a document \(D= \{d_1,\dots , d_{|D |}\}\), if \(\{d_i, d_j\}\) are occurrences of query term \(q_n\) and \(d_l\) is the only occurrence of \(q_m\) in D, with \(1<n<m<|Q |\) and \(1<i<j<l<|D |\), the augmented query and document sequences \(\tilde{Q}\) and \(\tilde{D}\), respectively, are as follows when using the simple markers:

$$\begin{aligned} \tilde{Q}&= \{q_1,\dots ,q_n,\dots ,q_m,\dots ,q_{|Q |}\} \\ \tilde{D}&= \{d_1,\dots ,\# d_i \#,\dots ,\# d_j \#,\dots ,\# d_l \#,\dots ,d_{|D |}\} \end{aligned}$$

and as follows when using the precise markers:

$$\begin{aligned} \tilde{Q}&= \{q_1,\dots ,q_n,\dots ,q_m,\dots ,q_{|Q |}\} \\ \tilde{D}&= \{d_1,\dots ,[e_n] d_i [/e_n],\dots , [e_n] d_j [/e_n],\dots ,[e_m] d_l [/e_m],\dots ,d_{|D |}\} \end{aligned}$$

Pair marking. Both the query and document sequences are augmented with marker tokens indicating the start and the end of each exactly matched term between the query and the document. In our experiments, a query term with no occurrence in the document is not marked. Considering the same example as for the Document marking level, the augmented query and document sequences \(\tilde{Q}\) and \(\tilde{D}\), respectively, are as follows when using the simple markers (a code sketch of all four strategies follows the formal definitions below):

$$\begin{aligned} \tilde{Q}&= \{q_1,\dots ,\# q_n \#,\dots ,\# q_m \#,\dots ,q_{|Q |}\} \\ \tilde{D}&= \{d_1,\dots ,\# d_i \#,\dots ,\# d_j \#,\dots ,\# d_l \#,\dots ,d_{|D |}\} \end{aligned}$$

and as follows when using the precise markers:

$$\begin{aligned} \tilde{Q}&= \{q_1,\dots ,[e_n] q_n [/e_n],\dots ,[e_m] q_m [/e_m],\dots ,q_{|Q |}\} \\ \tilde{D}&= \{d_1,\dots ,[e_n] d_i [/e_n],\dots , [e_n] d_j [/e_n],\dots ,[e_m] d_l [/e_m],\dots ,d_{|D |}\} \end{aligned}$$
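The following is an illustrative implementation of the four marking strategies of Table 3. It operates over whitespace-tokenized text for readability; in the actual models, marking is applied to the raw text before WordPiece tokenization, and the precise-marker indexing scheme shown (1-based, first occurrence wins) is an assumption consistent with the description above rather than our exact code.

```python
from nltk.stem import PorterStemmer

_stem = PorterStemmer().stem
STOP_WORDS = {"of", "the", "a", "an", "in", "to", "and", "or"}  # illustrative subset

def mark(query_terms, doc_terms, marker="simple", level="pair"):
    """Return the marked query and document term sequences (Q~, D~)."""
    # Stem of each non-stop-word query term -> index of its first occurrence.
    q_index = {}
    for k, q in enumerate(query_terms):
        if q.lower() not in STOP_WORDS:
            q_index.setdefault(_stem(q.lower()), k)
    # Document positions whose stem matches a query term.
    doc_to_q = {j: q_index[_stem(d.lower())] for j, d in enumerate(doc_terms)
                if _stem(d.lower()) in q_index}
    matched_q = set(doc_to_q.values())

    def wrap(term, k):
        if marker == "simple":                       # same token (#) for every match
            return f"# {term} #"
        return f"[e{k + 1}] {term} [/e{k + 1}]"      # precise markers identify the term

    marked_d = [wrap(d, doc_to_q[j]) if j in doc_to_q else d
                for j, d in enumerate(doc_terms)]
    if level == "doc":                               # Document marking: query untouched
        return list(query_terms), marked_d
    marked_q = [wrap(q, k) if k in matched_q else q  # Pair marking: both segments marked
                for k, q in enumerate(query_terms)]
    return marked_q, marked_d
```

For instance, calling `mark` on the query and document of Table 3 with `marker="precise"` and `level="pair"` reproduces the Pre-Pair row, up to the exact marker indices.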

4 Experimental setup

This section describes the experimental setup used to study the effectiveness of our models for document ranking. We present the detailed fine-tuning of our models on the large-scale MS MARCO passage dataset, describe the MS MARCO document ranking benchmark used for in-domain evaluations, and describe the standard TREC Robust04 and GOV2 benchmarks used for studying the out-of-domain transfer capabilities of our models. We further describe the inference process and the diverse state-of-the-art baselines we use to comparatively evaluate our approach. We report results using the official metrics of each collection, namely nDCG@10 and MAP@100 for the MS MARCO document ranking collection in the context of the TREC Deep Learning 2019 and 2020 tracks, and nDCG@20 and P@20 for Robust04 and GOV2, thus enabling direct comparisons with previous work.

4.1 Datasets

We conduct experiments on two standard ad hoc benchmarks: Robust04 and GOV2. In addition to these traditional benchmarks, we use the recent TREC Deep Learning (DL) Document Ranking benchmark from the 2019 and 2020 tracks. Robust04 is a newswire collection comprising 500K documents (TREC Disks 4 and 5) and 249 judged topics. Each topic is composed of three fields: the “title” is a short keyword query, the “description” is a longer well-formed natural language sentence that describes the information need, and the “narrative” is a paragraph that provides guidance for relevance assessment. Table 4 provides an example of a TREC Robust04 topic. GOV2 is a web collection crawled from government websites in early 2004, comprising 25M documents and only 149 topics in the same format as Robust04 topics, with title, description and narrative. Documents in the GOV2 corpus are on average much longer than those in the Robust04 corpus; see Table 5. The MS MARCO Document Ranking dataset is a benchmark for web search used in the TREC DL 2019-2020 tracks (Craswell et al., 2020, 2021). The dataset contains more than 3M documents composed of three fields: title, URL and body. Dense NIST judgments are provided for 43 and 45 topics for DL 2019 and 2020, respectively.

Table 5 summarizes some statistics of the evaluation benchmarks.

Table 4 Example of Robust04 search topic: Topic 302
Table 5 Benchmarks statistics

4.2 Baselines

We compare our models against diverse baselines including: Traditional non-neural approaches also known as Lexical Retrieval methods, sparse retrieval approaches, dense retrieval models (bi-encoders), and strong reranking models (cross-encoders).

4.2.1 Lexical retrieval baselines

  • BM25, we use the Anserini (Yang et al., 2017) implementation with default parameters. For description queries, we set \(k_1=0.9\) for Robust04 and \(k_1=2.0\) for GOV2 and \(b=0.6\) for both datasets. This unsupervised model serves both as a baseline and as the first stage retriever in all our experiments.

  • BM25+RM3, a query expansion model based on RM3 (Lavrenko and Croft, 2001) considered as a strong non-neural baseline. We use the Anserini (Yang et al., 2017) implementation with the default parameters. For description queries, we use 20 expansion terms following (Li et al., 2020).

4.2.2 Sparse retrieval baselines

  • DeepCT (Dai and Callan, 2020), we report results on Robust04 and GOV2 obtained using the BOW+DeepCT-Query model (Dai and Callan, 2019), and use the re-weighted MS MARCO documents provided by the authors using the HDCT model (Dai and Callan, 2020), in combination with Anserini’s BM25 with default parameters, for the TREC DL 2019 and 2020 evaluations.

  • DocT5Query (Nogueira et al., 2019), following the paper’s setup, we generate 40 expansion queries per document and use Anserini’s BM25 with default parameters. Due to the large size of the GOV2 collection (see Table 5) and the high computational cost of DocT5Query, we do not report results on this collection.

4.2.3 Dense retrieval baselines

  • DPR (Karpukhin et al., 2020), we use DPR as a retriever with the open-source implementation from the Transformers library (Wolf et al., 2020) and the publicly released DPR checkpoints for the Query and Context encoders.

  • ANCE (Xiong et al., 2021), we use ANCE as a retriever with the Sentence Transformers library (Reimers and Gurevych, 2019) and the publicly released checkpoint.

  • ColBERT (Khattab and Zaharia, 2020), we use ColBERT as a dense retriever using the authors’ released code: after encoding the whole collection, we use the top-1000 documents retrieved using ANN with faiss (Johnson et al., 2017) and rerank them using ColBERT’s late-interaction operation. Considering the size of the GOV2 collection (25M documents) and the large space footprint of ColBERT indexes, we could not produce results on GOV2.

4.2.4 Reranking baselines

  • Vanilla baseline, the vanilla monoBERT model is our main baseline since it represents the core model we augment with explicit exact match cues in our proposed models. The vanilla baseline as well as our models share the same configuration and evaluation setup making it suitable for evaluating the impact of exact match marking.

  • Birch (MS) and Birch (MS-MB) (Akkalyoncu Yilmaz et al., 2019), the notation in parentheses indicates the fine-tuning dataset(s): MS for MS MARCO, and MS-MB refers to the model fine-tuned first on MS MARCO and then further fine-tuned on Microblog (MB) data. We use the results reported by Li et al. (2020), which use BM25 instead of BM25+RM3 as the first-stage retriever.

  • BERT-MaxP (MS) (Dai and Callan, 2019), we report the results obtained with the re-implementation by Li et al. (2020), where the results are improved by using a BERT model fine-tuned on MS MARCO rather than the Bing search log.

  • Parade (Li et al., 2020), we report results obtained using both the BERT and ELECTRA variants from the paper.

  • T5 (Nogueira et al., 2020), the T5 model, also known as monoT5, with 3B parameters holds the state-of-the-art across many ad hoc benchmarks such as Robust04. We report the original results from the paper.

4.3 Training

We use the base version (12 layers, 768 hidden size, 12 heads, 110M parameters) of BERT due to hardware limitations. We fine-tune both our vanilla baseline and our models augmented with the different marking strategies on the large publicly released MS MARCO passage dataset. We use a batch size of 128 and the maximum sequence length (\(128~sequences \times 512~tokens = 65536~tokens/batch\)) for 100k steps on free Google Colab TPUs. We use the Adam optimizer (Kingma and Ba, 2015) with the initial learning rate set to \(3e^{-6}\) and linear decay of the learning rate. The dropout rate is set to 0.1 for all our experiments. We use the open-source implementation of BERT by Hugging Face (Wolf et al., 2020). It is important to note that fine-tuning an augmented model with a marking strategy does not add a computational cost compared to the vanilla model.
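A compact sketch of this optimizer and schedule configuration is shown below (Adam, initial learning rate 3e-6 with linear decay over 100k steps); the TPU training loop and the MS MARCO data pipeline are omitted, and the 0.1 dropout corresponds to the BERT-base default.

```python
import torch
from transformers import AutoModelForSequenceClassification, get_linear_schedule_with_warmup

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-6)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=100_000)
# Each step consumes a batch of 128 (query, passage) pairs of up to 512 tokens:
#   loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```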

4.4 Inference

We use a two-stage ranking pipeline. We retrieve an initial candidate list of the top 1,000 documents per query using BM25. We use the BM25 implementation from the off-the-shelf Anserini open-source IR toolkit (Yang et al., 2017).

The length of BERT’s input sequence cannot exceed 512 tokens because the positional embeddings were trained on sequences of a maximum length of 512 tokens. This limitation prevents us from directly applying our models to long documents. Following the strategy proposed by Dai and Callan (2019), we split each document into overlapping passages that can be handled individually by BERT. For Robust04 and GOV2, passages are generated using a sliding window of 150 words and a stride of 75 words, formally expressed as \(d=\{p_1,..., p_n\}\) where n is the number of passages in the document d. As a trade-off between latency and effectiveness, we only consider a maximum of 30 passages per document. The first and last passages are always picked, while the remaining 28 are randomly chosen. The models fine-tuned exclusively on out-of-domain data are then used to predict the relevance of each passage w.r.t. a query q independently. The best-scoring passage is then taken as a proxy for the document-level relevance:

$$\begin{aligned} R(d,q) = max(R(p_1,q), ..., R(p_n,q)) \end{aligned}$$
(5)

For the queries, we consider both the topic titles, which are preferred by most pre-BERT models including BM25, and the descriptions, which are more similar to MS MARCO’s natural language questions.

For the TREC DL document ranking evaluation, we split each document into overlapping passages with a maximum length of 384 tokens and a stride of 192, following the splitting strategy in Yan et al. (2019). In addition, the title is added to the beginning of every passage when available. As for Robust04 and GOV2, we use the best-scoring passage as a proxy for the whole document’s relevance.
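The sketch below illustrates the inference-time document handling for Robust04 and GOV2 under the settings above (sliding window of 150 words, stride 75, at most 30 passages with the first and last always kept, MaxP aggregation of Eq. (5)); `score_fn` stands for any query-passage scorer, e.g., the relevance classifier sketched in Sect. 3.1, and the sampling details are an illustrative assumption.

```python
import random

def split_passages(doc_words, window=150, stride=75, max_passages=30):
    """Overlapping passages; keep the first and last, sample the rest if too many."""
    passages = [" ".join(doc_words[i:i + window])
                for i in range(0, max(1, len(doc_words) - window + stride), stride)]
    if len(passages) > max_passages:
        sampled = random.sample(passages[1:-1], max_passages - 2)
        passages = [passages[0]] + sampled + [passages[-1]]
    return passages

def maxp_score(query, doc_words, score_fn):
    """Document relevance = best passage score, Eq. (5)."""
    return max(score_fn(query, p) for p in split_passages(doc_words))
```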

5 Results and analysis

We address our research questions in this section. First, we investigate the effectiveness of our proposed exact match marking strategies with a BERT core on in-domain data, i.e., the MS MARCO document ranking benchmark, and their robustness to out-of-domain collections, i.e., Robust04 and GOV2. Then, we study how to improve the domain-transfer capabilities of our models using score interpolation with a bag-of-words model. We further investigate the contribution of additional fine-tuning on limited target-domain data in a multi-phase fine-tuning setting, and how our exact match marking contributes in each phase. Finally, we verify the contribution of our exact match marking on the more effective ELECTRA model, and compare our best configurations to diverse state-of-the-art baselines.

5.1 Performance of the models augmented with exact match marking

We evaluate the contribution of our proposed exact match marking strategies and discuss our research question RQ1: Is exact match marking beneficial to pretrained transformers exemplified by BERT?, by comparing the augmented models with exact match marking to the vanilla baseline. We consider results in the in-domain setting with the MS MARCO Document dataset and in the zero-shot transfer setting on out-of-domain datasets, namely Robust04 and GOV2.

5.1.1 In-domain effectiveness

We re-rank the initial list of candidate documents retrieved by BM25 with RM3 query expansion, using all our models and the vanilla baseline. We report the performance on the TREC DL 2019 and 2020 test sets in Table 6, in terms of the official evaluation metrics: nDCG@10 and MAP@100.

Table 6 Reranking effectiveness on the TREC DL 2019 and DL 2020 Document ranking tasks

Comparison with baselines. Compared to BM25 and the first-stage retriever (BM25+RM3), all BERT-based models perform significantly better. Interestingly, the non-neural methods perform better on the DL 2020 test set, while the BERT-based models perform better on the DL 2019 test set. Adding exact match marking, regardless of the marking strategy, leads to better or at least the same performance as the vanilla baseline (which corresponds to a marking ablation). The Pre-Pair BERT model achieves the overall best performance on the DL 2019 test topics, and also on DL 2020 along with Sim-Pair BERT.

Impact of the marker type and marking level on performance. On TREC DL 2019, the pair marking strategy brings substantial gains in performance when used in combination with the precise marker type: Pre-Pair BERT achieves a \(+3.7\%\) relative gain over the Pre-Doc BERT model. In contrast, it leads to a drop in performance when combined with the simple marker: Sim-Pair BERT has a relative loss of \(-0.9\%\) compared to Sim-Doc BERT. Interestingly, on TREC DL 2020, using the Pair marking level has the same impact regardless of the marker type.

Marking both the query and document segments seems to be more beneficial considering results on both test collections. Using the precise marker type brings further gains in performance on DL 2019.

5.1.2 Out-of-domain effectiveness

We use the models fine-tuned exclusively on MS MARCO passages to rerank the documents retrieved by BM25 in the first stage. We do not train the models on the target collections (Robust04, GOV2); we use all their queries and relevance judgements as a held-out test set. This evaluation is thus an instance of a zero-shot transfer setting.

Table 7 Reranking effectiveness in the zero-shot transfer setting of the different models on Robust04 and GOV2 collections

Table 7 shows the reranking effectiveness of our different models and baselines on the top 1,000 candidate documents retrieved by BM25 from the Robust04 and GOV2 collections, using both the title and description fields of the TREC topics. We recall that titles are short keyword queries preferred by traditional bag-of-words models like BM25, and descriptions are well-written natural language queries similar to MS MARCO’s questions on which the BERT models are fine-tuned. We report results using the commonly used nDCG@20 and P@20 metrics to enable direct comparisons with previous work on these collections.

Comparison with baselines. All BERT-based models achieve substantially better performance on both collections compared to the traditional non-neural baselines, with the only exception of GOV2 titles. We observe a discrepancy in the impact of the exact match marking on GOV2 compared to Robust04. While all our models, except Sim-Doc BERT, significantly outperform the vanilla baseline on Robust04 descriptions or at least achieve similar performance on titles, our models have no significant impact on GOV2. Importantly, in no case does a marking-based model lead to a significant degradation of performance on GOV2. The disparity in the behavior of the models on the two benchmarks is probably due to the nature of the documents involved. While Robust04 comprises well-written news articles, GOV2 documents are web pages that include navigation bars, advertisements, tables and discontinuous text. The zero-shot domain transfer, from the MS MARCO fine-tuned models to Robust04 articles, seems to be more attainable than to GOV2 web pages, even though MS MARCO passages were extracted from the web. We hypothesise that further fine-tuning on domain-specific data may be required to learn better domain-specific text representations. We investigate this in-domain adaptation in Sect. 5.3.

Impact of the marker type and marking level on performance. On Robust04, marking both the query and the document (models based on pair marking) has more impact with the simple marker than with the precise marker. On the description queries, Sim-Pair BERT achieves an nDCG@20 of 0.4931 while Sim-Doc BERT reaches only 0.4166, and they achieve 0.4773 and 0.4447, respectively, on title queries. The marking level has a lower impact on the models using the precise markers (Pre-Doc BERT and Pre-Pair BERT), especially on descriptions. On the other hand, results on the GOV2 collection are rather mixed.

Marking both the query and the document segments with a simple marker (#) appears to be the best setting: Sim-Pair BERT has the best ranking accuracy among the four strategies tested, with clear margins on the Robust04 collection, especially on descriptions. We therefore continue our analysis using the Sim-Pair BERT strategy; the full results using all the marking strategies can be found in Appendix 1.

Table 8 Recall of BM25 on the Robust04 and GOV2 collections for both title and description queries

Title versus description queries. Since we are in a reranking configuration, it is important to note that the first-stage retriever BM25, like most pre-BERT ranking models, prefers short keyword queries to longer natural language descriptions (Dai and Callan, 2019; Nogueira et al., 2020). Table 8 shows the recall at rank 1,000 of BM25 for both title and description queries, where we notice a substantial difference in recall affecting the quality of the candidate documents that the reranking models receive. Despite this disadvantageous initialization, the reranking models manage to reduce the gap between title and description runs. The improvement rate over BM25 is much higher for description queries than for title queries on both collections, especially on GOV2, where vanilla BERT has a change rate of \(-5.0\%\) over BM25 on titles while it achieves a gain of over \(+10\%\) on descriptions. The descriptions, which are longer natural language queries carrying richer information that cannot be fully harnessed by the traditional bag-of-words method, are leveraged more effectively in the reranking stage. This ability of BERT was already noted in previous work (Dai and Callan, 2019), and Sim-Pair BERT follows the same preference, as it improves the search accuracy of the description runs more than that of the title runs. The overall performance reported for our model using descriptions clearly surpasses that obtained using titles, by \(+4.1\%\) on average, despite the lower recall in the initial stage.

Impact of the initial-stage retriever. Considering that the first-stage ranker BM25 has higher recall on title queries, and that the marking-based models prefer description queries, we propose a hybrid reranking pipeline in which the documents retrieved by BM25 using title queries are reranked by the BERT-based models using the description queries. This hybrid pipeline allows us to obtain a higher recall in the first stage, since BM25 performs better on short keyword queries, and thus better candidate documents for reranking. Description queries are longer statements of information needs, more suitable for pretrained reranking models to fulfill their potential. This pipeline remains realistic, as natural language queries may be generated from standard keyword queries (Padaki et al., 2020). This hybrid approach is also adopted in the recent state-of-the-art ranking model based on T5 (Nogueira et al., 2020).

Table 9 Reranking effectiveness in the zero-shot transfer setting of the different models on Robust04 and GOV2 collections using the hybrid pipeline

Table 9 shows the results obtained using the hybrid reranking pipeline on both test collections. Unsurprisingly, using better candidate documents for reranking with descriptions yields even better accuracy. The vanilla BERT model achieves an improvement rate of \(+14\%\) over BM25 on Robust04 and \(+3.4\%\) on GOV2 (we recall that BM25 results are obtained using titles). Adding exact match marking in the hybrid reranking pipeline outperforms the vanilla baseline on both collections, significantly so on Robust04 with a gain of over \(+8\%\).

5.1.3 In-domain versus out-of-domain effectiveness.

Results on both in-domain and out-of-domain benchmarks clearly indicate that exact match marking, apart from the Sim-Doc marking strategy which significantly underperforms the vanilla baseline on Robust04, is more beneficial than using the vanilla baseline. The Sim-Pair (especially for out-of-domain experiments) and Pre-Pair (especially for in-domain experiments) marking strategies appear to work best.

In the next two sections, we focus on out-of-domain effectiveness and study common techniques used in the literature to enhance the effectiveness of BERT-based models, and how our models behave in combination with these techniques. The MS MARCO document ranking benchmark is therefore not suitable, and we only report results on the Robust04 and GOV2 collections.

5.2 Contribution of the first-stage retriever scores to the end-to-end effectiveness

Our experimental design is based on a two-stage ranking architecture, also known as a retrieve-then-rerank architecture, where our BERT-based models rerank the documents retrieved by the BM25 model. In this section, we evaluate the contribution of the best match scores from the initial bag-of-words retriever to the end-to-end effectiveness by simply combining BM25’s document-level scores with the passage-level evidence from the reranker using linear interpolation. We follow the linear combination defined in the Birch model (Akkalyoncu Yilmaz et al., 2019).

Birch uses a monoBERT sentence-level relevance classifier at its core. To determine the document relevance \(s_f\), inference is applied over each individual sentence \(s_i\) in a candidate document d, and the top n sentence scores are then combined with the original document score \(s_{doc}\) given by the first-stage retriever as follows:

$$\begin{aligned} s_f = \alpha \cdot s_{doc} + (1-\alpha ) \cdot \sum _{i=1}^{n} w_i \cdot s_i \end{aligned}$$
(6)

where \(s_i\) is the score of the i-th top-scoring sentence according to monoBERT. The parameters \(\alpha\) and \(w_i\) are tuned via cross-validation. In other words, the relevance score of a document comes from the combination of its document-level term-matching score and the evidence contributions from the top sentences in the document as determined by monoBERT.
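A minimal sketch of this interpolation for the single best-passage case (\(n=1\)) used in our experiments is shown below; how (or whether) the two score distributions are normalized per query before interpolation is not specified here and is left as an implementation assumption.

```python
def interpolate(s_doc: float, s_bert: float, alpha: float) -> float:
    """s_f = alpha * s_doc + (1 - alpha) * s_BERT, Eq. (6) with n = 1.

    alpha is tuned by cross-validation; alpha = 0 keeps only the reranker
    score and alpha = 1 keeps only the BM25 document score.
    """
    return alpha * s_doc + (1 - alpha) * s_bert
```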

For our experiments, the linear interpolation is applied to the results obtained in the zero-shot transfer setting with the best-scoring passage (\(n=1\)). In other words, we apply the score combination defined in Eq. (6) to the document scores obtained by the BM25 retriever at cutoff 1,000 and the corresponding scores estimated with the best-scoring passage method by the reranking models. Table 10 first shows the results of the traditional BM25 model alone; the second and third sections are each dedicated to a reranker: the vanilla and Sim-Pair BERT models. For both rerankers, we recall the results of the model alone obtained in the zero-shot transfer setting and then present the end-to-end effectiveness after interpolating BM25 scores (\(+\) BM25), with the indication of the change rate (%) over the reranker-only effectiveness. These results allow us to answer our research question RQ2: Do exact match scores from the first-stage retriever contribute to the end-to-end effectiveness of the pretrained transformers, and how does exact match marking affect this contribution?

5.2.1 Impact of interpolating BM25 scores

Interpolating BM25 scores (Best Match), which are based solely on surface-level features such as TF and IDF, leads to a significant gain in performance, indicating that BM25 document-level scores provide an additional relevance signal that the BERT-based models alone could not effectively capture. We notice that the improvement rate resulting from interpolating BM25 scores is much more substantial on the GOV2 collection (\(+15\%\) on average) than on Robust04 (\(+5.7\%\) on average). The fact that the BERT models outperform BM25 by a large margin on Robust04, while this margin is much smaller on GOV2, can explain why BM25 scores have more influence on the end-to-end effectiveness on GOV2 than on Robust04.

5.2.2 Impact of exact match marking

From Table 10, we can clearly see that on Robust04, where the exact match marking is effective, the improvement rate over the reranker-only effectiveness is lower when using exact match marking, about \(+12\%\) on average, than for the vanilla model, which gains \(+22\%\) on average. In other words, the impact of the BM25 scores is more important for the vanilla model than for the Sim-Pair model. On GOV2, the improvement rate after BM25 score interpolation compared to the reranker-only performance is either comparable or slightly higher when using exact match marking than with the vanilla baseline. However, the performance of the Sim-Pair BERT model with BM25 score interpolation is, in all cases, higher than the vanilla BERT + BM25 performance, regardless of the improvement rate brought by the score combination. Since we use the results obtained in the zero-shot domain transfer setting where, we recall, the exact match marking is more effective, the gains of the Sim-Pair BERT+BM25 configuration over the vanilla BERT+BM25 are more substantial on Robust04 than on GOV2.

Table 10 Reranking effectiveness of the different models before and after interpolating BM25 scores on Robust04 and GOV2 collections

5.2.3 Contribution of BM25 scores

The contribution of BM25 scores is controlled by the parameter \(\alpha\) in Eq. (6), which we tuned via 5-fold in-collection cross-validation. In all scenarios, the weight \(\alpha\) is non-negligible; in other words, the contribution of BM25 signals remains important, an observation also reported for the Birch model (Lin et al., 2020). However, we notice that \(\alpha\) is always smaller when combining with the Sim-Pair BERT model that uses exact match marking. For Robust04 descriptions, the vanilla BERT+BM25 baseline puts a weight of \(\alpha \in \{0.3,0.4\}\) on BM25 scores, whereas Sim-Pair BERT+BM25 only considers a contribution of \(\alpha =0.2\) from BM25, while achieving substantially better performance. This indicates that the vanilla model relies more on BM25 to complete its relevance estimation, unlike the marking-based model, which is able to effectively capture more relevance signals and thus needs less contribution from BM25 scores.

Figure 2 visualizes the end-to-end ranking accuracy measured by nDCG@20 for \(\alpha \in [0,1]\) on both the Robust04 and GOV2 collections. On Robust04, we can clearly see that Sim-Pair BERT+BM25 reaches the most effective combination with a smaller contribution from BM25 scores (smaller \(\alpha\)), while the vanilla baseline requires more intervention from BM25 and still cannot reach the performance of Sim-Pair BERT+BM25, especially on descriptions. It is only logical that the best-performing model, which outperforms BM25 by a large margin, requires less contribution from the latter. Nevertheless, if we take the example of the GOV2 descriptions, despite the similar starting performance of the vanilla and Sim-Pair BERT models at \(\alpha =0.0\), the gap between their performances starts widening at only \(\alpha =0.1\) and reaches its peak at \(\alpha =0.2\).

Combining the original document score obtained by the first-stage retriever with passage-level evidence from BERT-based reranking models to determine the final relevance score of a document yields substantial gains in performance. Relevance scores based on traditional IR axioms complement the relevance signals captured by contextual pretrained LMs such as BERT. Moreover, using our simple marking strategy to highlight the exact matching signals in the query-document pairs enhances BERT’s own ability to estimate relevance, which thus requires less contribution from BM25 to achieve the best performance.

Fig. 2 The end-to-end ranking accuracy of the vanilla BERT and Sim-Pair BERT models with BM25 score interpolation on the Robust04 and GOV2 collections. \(\alpha =0.0\) indicates the reranking model’s effectiveness alone, without BM25 scores, and \(\alpha =1.0\) means that only BM25 scores are used

5.3 Multi-phase fine-tuning

In the previous experiments, we leveraged out-of-domain relevance assessments to fine-tune our BERT models. This fine-tuning aims at providing the model with general notions of relevance matching. However, transferring these relevance patterns to the target corpus may, in some cases, be ineffective. To overcome this domain-transfer limitation, we use additional fine-tuning on labeled data drawn from the same distribution as the final task, in other words, in-domain fine-tuning. This approach is known as “stage-wise” or “multi-phase” fine-tuning (Lin et al., 2020).

Once the models are fine-tuned on the MS MARCO passage dataset following the training setting described in Sect. 4.3, we further fine-tune them on the target task using 5-fold cross-validation for both the Robust04 and GOV2 collections. We use the folds from Yang et al. (2019) for Robust04 and the 5-fold configuration adopted by Li et al. (2020).

Following prior work by Dai and Callan (2019), we consider a maximum of 30 passages per document as a trade-off between latency and effectiveness. During training, passages drawn from the top 1,000 documents retrieved by BM25 for the queries in the training folds are sub-sampled to avoid catastrophic forgetting. Aside from the first passage, which is always kept, passages in a document are randomly preserved with a probability of 0.1. Passages from a document that is relevant according to the ground truth (TREC relevance judgements) are taken as positive examples, and passages from the remaining documents as negative examples. We use a pointwise cross-entropy loss and fine-tune the models for a single epoch with a batch size of 32 training instances, each comprising a query and a passage. We use the Adam optimizer with a learning rate of \(1e^{-5}\) and warm-up over the first \(10\%\) of the total training steps.
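A hedged sketch of this passage sub-sampling is given below: the first passage of a document is always kept and every other passage survives with probability 0.1; the function name and interface are illustrative assumptions.

```python
import random

def subsample_passages(passages, keep_prob=0.1):
    """Keep the first passage; keep each remaining passage with probability keep_prob."""
    return [p for i, p in enumerate(passages) if i == 0 or random.random() < keep_prob]
```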

For the queries in the left-out test fold, we set the rerank threshold to 100 as a trade-off between latency and effectiveness. We report the average performance across all test folds, measured in terms of P@20 and nDCG@20 using pytrec_eval. In this setting, our vanilla baseline corresponds to the pointwise-trained BERT-MaxP model (Dai and Callan, 2019) initialized with monoBERT fine-tuned on MS MARCO rather than with Google’s pretrained BERT checkpoint without any prior fine-tuning on the text ranking task.

Table 11 Reranking effectiveness in the multi-phase vs. zero-shot transfer setting for the Sim-Pair and vanilla models on Robust04 and GOV2 collections

Table 11 reports the reranking effectiveness obtained using the multi-phase fine-tuning setting compared to the single-phase MS MARCO fine-tuning (zero-shot transfer setting) for both the Robust04 and GOV2 collections. We report results obtained when reranking the top 100 documents retrieved by BM25 in both settings. Thanks to the additional in-domain fine-tuning on the target collection, the performance on both collections improves regardless of the topic field used. We notice that, in this setting, Sim-Pair BERT is able to achieve significant gains over the vanilla baseline on the GOV2 collection, confirming our hypothesis that the zero-shot domain transfer from MS MARCO was not sufficient for this collection.

In the multi-phase fine-tuning setting, the BERT-based models achieve better performance on descriptions than on titles on Robust04, by \(+7.5\%\) and \(+8.3\%\) for the vanilla and Sim-Pair models respectively, despite the lower retrieval effectiveness of BM25 on descriptions compared to titles (\(-4.3\%\)). On the other hand, the difference in BM25 retrieval effectiveness between descriptions and titles is larger on GOV2, about \(-11\%\). The BERT-based rerankers reduce this gap to \(-5.5\%\) and \(-5.9\%\) for the vanilla and Sim-Pair models respectively, but not enough to reverse the tendency. The end-to-end effectiveness on this collection is thus higher on titles than on descriptions, as observed for previous state-of-the-art models such as BERT-MaxP (Dai and Callan, 2019) or Parade (Li et al., 2020) (see results in Sect. 5.5). Still, the hybrid pipeline outperforms both the title and description runs on both collections. The reranking accuracy achieved by the hybrid runs is the highest reported result using a BERT-based model on both collections at the time this article was written.

Table 12 Reranking effectiveness with exact matching ablation at different phases of the multi-phase fine-tuning configuration of Sim-Pair BERT on Robust04 and GOV2 collections

5.3.1 Phase-wise marking

The results of the Sim-Pair BERT model presented in Table 11 for the multi-phase setting are obtained using exact match marking throughout the two fine-tuning phases. While the first fine-tuning phase focuses on learning general notions of relevance from a large passage collection, the goal of the additional in-domain fine-tuning is to learn directly from labeled data with the same distribution as the target task. It is therefore important to determine in which of the two phases the marking strategy is more beneficial, and in which phase it can be omitted. To this aim, we conduct an ablation study on the Sim-Pair BERT model. Table 12 shows the results of the marking-strategy ablation on the Robust04 and GOV2 collections using the different topic fields. With these results, we can now address our research question RQ3: At which phase is exact match marking most beneficial in a multi-phase fine-tuning configuration?

MS marking (labelled run A in Table 12) uses exact match marking in the MS MARCO (MS) fine-tuning phase only, then uses the original, unmarked data for the in-domain (ID) fine-tuning phase. We can see in Table 12 that using the marking strategy in the general fine-tuning phase is sufficient to outperform the vanilla baseline, or at least perform comparably on Robust04 titles. In other words, initializing BERT with weights learnt from marked inputs is better than initializing it with weights learnt from unmarked inputs. Ablating marking in the in-domain fine-tuning phase can even surpass the performance of the Sim-Pair BERT that uses marking across both fine-tuning phases, as observed for descriptions on both collections and for the hybrid run on GOV2.

ID marking (labelled run B in Table 12) uses the marking strategy to augment the inputs during fine-tuning on the in-domain data, while the BERT model is initialized with the weights learnt from unmarked MS MARCO inputs. This first-phase marking ablation either has no substantial impact on the model's performance or leads to a degradation. This behavior is expected, since there is not enough in-domain data for BERT to learn useful representations of the marker tokens and their contribution to the relevance prediction.

Using the marking strategy during the first, general-purpose fine-tuning phase (MS marking) is thus already enough to outperform the vanilla baseline, without requiring additional marking during the in-domain fine-tuning phase. In other words, a model fine-tuned with the Sim-Pair marking strategy on MS MARCO is able to reuse the relevance matching patterns learned from marked out-of-domain data in later phases, even without the guidance of explicit markers. Nevertheless, the additional marking in the in-domain fine-tuning phase used in the classical Sim-Pair BERT approach is beneficial for title queries, where it brings an additional gain of \(+1.6\%\) and \(+1.4\%\) over MS marking only (run A) on Robust04 and GOV2, respectively.

5.4 Impact of exact match marking on ELECTRA variant

While BERT is the most widely adopted pretrained language model, variants such as RoBERTa (Liu et al., 2019) and ELECTRA (Clark et al., 2020) were proposed to improve the model along different dimensions. Recent state-of-the-art results on the Robust04 and GOV2 collections were achieved using ELECTRA, which appears to outperform BERT. ELECTRA (Clark et al., 2020) replaces Masked Language Modeling (MLM) with a novel, more sample-efficient pretraining task called replaced token detection, in which the model learns to distinguish real input tokens from plausible but synthetic replacements produced by a small "generator" model. This approach involves two components, both of which require training: the generator, a small BERT-like model that predicts masked tokens, and the ELECTRA discriminator. The new objective, however, allows the model to learn from all input positions rather than from only the \(15\%\) of positions masked in the MLM task.
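
As a rough illustration of the replaced token detection objective (a simplified sketch, not ELECTRA's official implementation; tensor names are ours):

```python
import torch.nn.functional as F

def replaced_token_detection_loss(disc_logits, corrupted_ids, original_ids, attention_mask):
    """Simplified sketch of ELECTRA's discriminator objective.

    disc_logits: (batch, seq_len) per-token logits from the discriminator.
    corrupted_ids / original_ids: token ids after / before the generator
    replaced the masked positions.
    A token is labeled 1 if the generator replaced it, 0 otherwise; the
    loss is computed over every real token, not only the ~15% masked
    positions used by MLM.
    """
    labels = (corrupted_ids != original_ids).float()
    per_token = F.binary_cross_entropy_with_logits(disc_logits, labels, reduction="none")
    mask = attention_mask.float()
    return (per_token * mask).sum() / mask.sum()
```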

To gain confidence in our approach, we investigate whether exact match marking is beneficial for a BERT variant pretrained on this more robust task, and study RQ4: Is exact match marking beneficial in alternative transformer-based models such as ELECTRA?

For our experiments, we use the base version of the ELECTRA model as the core of the model architecture illustrated in Fig. 1, in place of BERT. We use the same single-layer neural network to estimate a score R(d, q) quantifying how relevant the candidate document d is to the query q, and the same fine-tuning hyperparameters used with BERT.
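
A minimal sketch of this setup, assuming a HuggingFace-style model class with a linear scoring head on the [CLS] representation (the exact head details of our implementation may differ):

```python
import torch
from transformers import AutoModel

class CrossEncoderRanker(torch.nn.Module):
    """Cross-encoder reranker: a PLM core with a single-layer scoring head.

    Swapping BERT for ELECTRA only changes `core_name`; the scoring head
    and the fine-tuning hyperparameters stay identical.
    """

    def __init__(self, core_name="google/electra-base-discriminator"):
        super().__init__()
        self.core = AutoModel.from_pretrained(core_name)
        self.head = torch.nn.Linear(self.core.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask, token_type_ids=None):
        out = self.core(input_ids=input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        cls = out.last_hidden_state[:, 0]   # [CLS] representation
        return self.head(cls).squeeze(-1)   # relevance score R(d, q)
```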

5.4.1 In-domain effectiveness

Using the same setting as for the BERT-based models, we report the results obtained on the TREC DL 2019 and 2020 test collections in Table 13. For clarity, we only show results with the Sim-Pair marking strategy; full results with all strategies can be found in Appendix 3.

Table 13 Reranking effectiveness on the TREC DL 2019 and DL 2020 Document ranking tasks for Sim-Pair and vanilla models with both BERT and ELECTRA cores

Interestingly, using the ELECTRA core in place of BERT in the vanilla baseline does not lead to increased performance; we even observe a slight drop on TREC DL 2020. Adding exact match marking leads to similar gains over the vanilla baselines with both cores. While the gain in average precision is more pronounced with ELECTRA on both DL 2019 and 2020, effectiveness in terms of nDCG@10 is higher with the BERT core on the DL 2020 test collection.

5.4.2 Zero-shot transfer setting

We take the models fine-tuned exclusively on out-of-domain data, i.e., the MS MARCO passage dataset, and apply inference on the window passages obtained by splitting each document with the same passage length of 150 words and 75-word stride used in the BERT experiments. Table 14 shows the results obtained at cutoff 1,000 on both the Robust04 and GOV2 collections. We recall the results of the vanilla and Sim-Pair models with the BERT core for comparison.
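
The window-passage construction can be sketched as follows (an illustrative helper; names are ours):

```python
def split_into_windows(doc_words, window=150, stride=75, max_passages=30):
    """Split a document into overlapping word windows.

    150-word passages with a 75-word stride, capped at 30 passages per
    document, as in the BERT experiments.
    """
    passages = []
    for start in range(0, max(len(doc_words), 1), stride):
        passages.append(" ".join(doc_words[start:start + window]))
        if start + window >= len(doc_words) or len(passages) == max_passages:
            break
    return passages
```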

Table 14 Reranking effectiveness in the zero-shot transfer setting for the Sim-Pair and vanilla models on Robust04 and GOV2 collections using both BERT and ELECTRA cores

Exact Match Marking on ELECTRA The results clearly indicate that adding exact match marking is still beneficial for the ELECTRA variant. As for the BERT version, Sim-Pair ELECTRA is more effective on Robust04, with an average improvement rate of \(+5\%\), compared to only half of that, \(+2.5\%\), on GOV2. However, exact match marking has a more notable impact on titles than on descriptions, whereas the vanilla ELECTRA baseline clearly prefers description queries.

ELECTRA versus BERT core The Sim-Pair ELECTRA variant achieves better performance than its BERT counterpart regardless of the topic field on the GOV2 collection. In contrast, the BERT core is more effective on Robust04 for titles, descriptions and the hybrid pipeline. The same tendency can be observed for the vanilla baseline, with smaller margins.

Table 15 Reranking effectiveness in the multi-phase fine-tuning setting for the Sim-Pair and vanilla models on Robust04 and GOV2 collections using both BERT and ELECTRA cores

5.4.3 Multi-phase fine-tuning

Table 15 shows the results obtained using multi-phase fine-tuning on both the MS MARCO passage dataset and in-domain labeled data, as described for BERT in Sect. 5.3. The ELECTRA-based models outperform the BERT-based models on both collections regardless of the topic field used, indicating that ELECTRA is a more effective core PLM than BERT in a multi-phase fine-tuning setting. However, adding exact match marking has no significant impact in this setting. Sim-Pair ELECTRA performs slightly better than the vanilla ELECTRA baseline on the Robust04 collection across the title, description and hybrid runs. On GOV2, exact match marking leads to better ranking accuracy on titles, but to a slight degradation when the description field is used for reranking (description and hybrid runs).

In summary, exact match marking is indeed beneficial for the ELECTRA model, especially in the zero-shot transfer setting where no labeled data is available in the target domain. Sim-Pair ELECTRA achieves significant gains on titles, where Sim-Pair BERT is less effective. However, for the description and hybrid runs that use descriptions for reranking, exact match marking appears to have a more substantial impact with a BERT core. On the TREC DL 2019 and 2020 benchmarks, both the vanilla and Sim-Pair models perform similarly with the BERT and ELECTRA cores; the only advantage of the ELECTRA core is an increased average precision with Sim-Pair. Overall, in most cases, the ELECTRA-based versions of our models are more effective than their BERT counterparts.

5.5 Comparison with state-of-the-art baselines

In this section we situate our approach with regard to existing work on document ranking. First, we conduct comparative evaluations against models with a similar experimental setup to ensure a fair comparison. Then, we compare our best runs to a wide variety of SOTA approaches with different configurations.

5.5.1 Comparison in the same experimental design

In order to fairly compare a novel approach with previously proposed ones, it is important to conduct the evaluation under the same experimental conditions. Here, we reproduce as closely as possible the original settings used to produce the results of the Birch and BERT-MaxP baselines, respectively.

Birch (MS) This baseline is fine-tuned exclusively on MS MARCO passages; we therefore use our Sim-Pair BERT + BM25 model, likewise fine-tuned on MS MARCO passages and augmented with BM25 score interpolation following the same Equation 6 used in Birch (Akkalyoncu Yilmaz et al., 2019). All Robust04 and GOV2 topics and relevance judgements are used as a held-out test set.
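
A sketch of the Birch-style interpolation, assuming the published form of the aggregation in which the document score combines BM25 with the weighted top passage scores (the coefficient values below are placeholders, not those used in our experiments):

```python
def interpolate_scores(bm25_score, passage_scores, alpha=0.5, weights=(1.0,)):
    """Birch-style interpolation of document- and passage-level evidence.

    final = alpha * BM25(d, q) + (1 - alpha) * sum_i w_i * s_i,
    where s_i is the i-th highest BERT passage score of the document.
    alpha and the weights w_i are tuned on held-out data; the defaults
    here are placeholders.
    """
    top = sorted(passage_scores, reverse=True)[:len(weights)]
    evidence = sum(w * s for w, s in zip(weights, top))
    return alpha * bm25_score + (1 - alpha) * evidence
```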

Table 16 Reranking effectiveness of the Sim-Pair BERT with interpolating BM25 scores vs. Birch (MS) baseline on both Robust04 and GOV2 collections

Table 16 shows the results of our Sim-Pair BERT + BM25 model compared to the Birch (MS) baseline. The results clearly indicate that our model outperforms Birch (MS). Since our model already outperforms the baseline with a BERT Base core, it is unnecessary to repeat the experiment with BERT Large, whose computational cost unfortunately exceeds our hardware limitations.

BERT-MaxP (MS) The configuration of this baseline is the same as the one we used in the multi-phase fine-tuning setting. Table 17 compares BERT-MaxP (MS) with Sim-Pair BERT fine-tuned first on MS MARCO and then further fine-tuned on the target task with 5-fold cross-validation. We report results when exact match marking is used during fine-tuning on MS MARCO passages only [MS], and with full marking on both MS MARCO and in-domain data, i.e., Sim-Pair BERT. Our approach clearly outperforms the BERT-MaxP baseline on titles and performs slightly better on descriptions. It is important to note that the BERT-MaxP results reported by Li et al. (2020) are better than those of our vanilla BERT baseline in the multi-phase fine-tuning setting, especially on GOV2. This slight difference can be explained by our use of the traditional pointwise loss function (monoBERT (Nogueira and Cho, 2019)) while they use a pairwise loss function.

Table 17 Reranking effectiveness of the Sim-Pair BERT with multi-phase fine-tuning vs. BERT-MaxP (MS) baseline on both Robust04 and GOV2 collections

5.5.2 Comparison with different experimental designs

Each approach has its own optimal experimental conditions leading to the best possible ranking accuracy, and these conditions are rarely the same across the models we want to compare. Independently of the experimental framework employed to obtain the results, or of the nature of the approach, Table 18 compares our best runs with BERT and ELECTRA cores, obtained in the multi-phase fine-tuning setting, with the best baseline runs, while Table 19 compares our best in-domain runs to both the best TREC runs from the TREC DL 2019 and 2020 tracks and the SOTA baselines.

Table 18 Reranking effectiveness on Robust04 and GOV2 of our best runs versus the best baseline runs

Robust04 and GOV2 collections Unsurprisingly, the reranking models achieve the best results and largely outperform all other baselines. For a fair comparison with the sparse and dense retrieval methods (runs [03-07]), which do not use target-domain fine-tuning, we add our runs in the zero-shot setting on descriptions (runs [08-09]). Nevertheless, our rerankers still outperform the retrievers.

Results obtained with the best Sim-Pair BERT, run [17] in Table 18, outperform all the state-of-the-art BERT-based models and achieve better performance than both the base and large versions of T5 on Robust04. The Sim-Pair ELECTRA variant (run [18]) achieves comparable performance to the T5-3B model while using only \(3.6\%\) of its parameters, and outperforms the Parade ELECTRA model on both the Robust04 and GOV2 collections by margins ranging from \(+3\%\) to more than \(+4\%\). T5 is by far the strongest baseline; it is important to note that it uses a zero-shot transfer setting without in-domain fine-tuning, as opposed to BERT-MaxP, Parade and our best runs [17-18]. However, its large size makes it impractical compared to a BERT Base or ELECTRA Base model.

Table 19 Reranking effectiveness on TREC DL 2019 and 2020 Document ranking tasks of our Sim-Pair models with both BERT and ELECTRA cores versus the best TREC runs and baselines

TREC DL Document Ranking task Similarly to the Robust04 and GOV2 results, the best TREC runs, which are cross-encoding rerankers, outperform all other baselines. For TREC DL 2019, we include the best idst_bert_r1 run (Yan et al., 2019), which uses StructBERT (Wang et al., 2020), a BERT model that better captures sentence relationships thanks to an improved Next Sentence Prediction task, and ucas_runid1 (Chen et al., 2019), which uses BERT-MaxP (Dai and Callan, 2019). We also include the Parade results (Li et al., 2020). Our runs outperform Parade and ucas_runid1, but cannot outperform idst_bert_r1, with its StructBERT core, in terms of nDCG@10. In TREC DL 2020, the best run, d_d2q_duo (Pradeep et al., 2020), is a large multi-stage ranking model comprising a BM25 retriever, DocT5Query document expansion and two cascaded T5-3B rerankers, making it hard to outperform. ICIP_run1 (Chen et al., 2020) uses a BERT-Large model at its core with a refined fine-tuning process including passage filtering and better negative sampling, which explains its higher performance. Nevertheless, our runs are still competitive and outperform Parade, which has the same model size as our models. Interestingly, performance on TREC DL 2020 is lower in terms of nDCG@10 than on TREC DL 2019 for the same model, as observed for both our runs and the Parade run.

6 Discussion and future work

Our research is related to effectively harnessing the exact matching signals from the query-document pairs to enhance document ranking with pretrained language models (PLMs) exemplified by BERT. We have shown through the empirical experiments reported in this paper that PLMs such as BERT can benefit from explicit exact match cues conveyed via marker tokens to be more effective for ad hoc ranking.

BERT, as the most prominent PLM, has been successfully applied to text ranking as well as a wide range of other tasks without requiring any specialized neural architectural components to capture different relevance signals, as opposed to pre-BERT neural ranking models. Previous work by Qiao et al. (2019) studied the behaviour of BERT for ranking and found that it is able to capture semantic matching signals between paraphrase tokens. However, research from the pre-BERT era has shown that, in addition to semantic matching, exact matching remains an important cue for neural ranking models (Guo et al., 2016; Mitra et al., 2017). Guo et al. (2016) argue that "exact matching of terms in documents with those in queries is still the most important signal in ad hoc retrieval due to the indexing and search paradigm in modern search engines". This is why Boualili et al. (2020) suggest emphasizing the exact match signals for BERT using a marking technique that does not involve redesigning the model's architecture, which would forfeit the immense benefits of self-supervised pretraining.

In this paper, we extend Boualili et al. (2020) and study four research questions that investigate the effectiveness of our newly proposed marking strategies for ad hoc document ranking.

First, we investigated the benefits of exact match marking for a BERT-based model in both in-domain and zero-shot transfer settings. The experiments showed that combining a simple soft marker with a pair-level marking strategy (Sim-Pair) is the simplest yet most effective marking strategy. Moreover, experiments on Robust04 and GOV2 showed that this exact match marking approach is more effective on the description field of the topic than on the title field. This preference for well-written natural language questions is in line with BERT's preference for descriptions revealed by Dai and Callan (2019). On the other hand, we follow a retrieve-then-rerank architecture where the retriever is a bag-of-words model that prefers short keyword queries, while the reranker is a BERT-based model that prefers long natural language questions (Dai and Callan, 2019; Nogueira et al., 2020). To get the best of both stages, we propose a hybrid pipeline where titles are used for retrieval and then replaced by descriptions for reranking, which leads to substantial gains in performance.
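
A schematic view of the hybrid pipeline (the searcher and reranker interfaces are illustrative, not the paper's code):

```python
def hybrid_rerank(topic, bm25_searcher, reranker, k=100):
    """Hybrid pipeline: retrieve with the short title, rerank with the
    verbose description of the same topic.
    """
    candidates = bm25_searcher.search(topic["title"], k=k)            # keyword query
    scored = [(doc.docid, reranker.score(topic["description"], doc.text))
              for doc in candidates]                                   # natural language query
    return sorted(scored, key=lambda s: s[1], reverse=True)
```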

Second, we investigated how to improve effectiveness on the out-of-domain collections using two methods: (1) linear interpolation of BM25 document-level scores with BERT-based passage-level scores, and (2) adding in-domain fine-tuning on the target collection. With the first method, we find that exact term-matching scores from traditional bag-of-words models like BM25 are still beneficial for BERT-based document reranking on out-of-domain collections. Indeed, combining document-level BM25 scores with passage-level evidence from our BERT-based models through a simple linear interpolation leads to substantial gains in performance. The document-level scores from the initial BM25 retrieval, based on traditional IR cues (TF, IDF), provide additional relevance signals that complement the passage-level scores of the BERT-based models. Furthermore, exact match marking appears to take better advantage of the combination with BM25 scores, achieving better performance than the vanilla model.

With the second method, adding in-domain fine-tuning on top of the first, general-purpose fine-tuning phase on out-of-domain data, we demonstrated through an ablation study that using exact match marking in the general-purpose fine-tuning phase on large out-of-domain data is enough to achieve substantial gains in performance, especially on descriptions. We publish our checkpoints fine-tuned on MS MARCO so that they are accessible to the community as a more effective alternative to a vanilla checkpoint.

Third, we studied the contribution of our exact match marking strategy on a BERT variant, ELECTRA, which has recently been used in state-of-the-art models such as Parade (Li et al., 2020). Experiments showed that exact match marking is indeed beneficial for ELECTRA, especially in the zero-shot transfer setting where no in-domain annotated data is used for training. In addition, the ELECTRA-based models outperform their BERT counterparts in most cases.

Finally, we compared our best runs using both BERT and ELECTRA to a wide range of transformer-based ranking models representing the state of the art at the time this article was written. On the one hand, the comparative evaluation showed that our exact match marking approach combined with the hybrid pipeline, which uses titles for BM25 retrieval and descriptions for BERT reranking, achieves near state-of-the-art results on Robust04 compared to the strong and much larger T5-3B baseline, and outperforms previously proposed models on GOV2. On the other hand, the comparative evaluation on the TREC DL 2019 and 2020 Document ranking tasks showed that our marking-based approach is competitive with the best TREC runs. Even though this evaluation is an in-domain setting, the benefits of exact match marking seem less prominent than those observed on Robust04 or GOV2. Unlike the title and description queries used with Robust04 and GOV2, the TREC DL queries are questions; the documents and other aspects of the evaluation also differ. Further analysis is needed to determine the factors behind this discrepancy.

In the end, what does this mean when choosing between deploying a vanilla BERT or a Sim-Pair BERT? We would argue that Sim-Pair BERT induces focus on exact match signals, leading to better performance than vanilla BERT (in 24 comparisons, 9 of which are significant), or at least to comparable performance (in only 4 comparisons, with no significant loss). Importantly, our extensive experiments did not show a single case where Sim-Pair BERT performs significantly worse; we would therefore recommend it. On the efficiency side, our approach inherits the efficiency issues of the monoBERT cross-encoder. However, we do not add any complexity to the model, making our approach a drop-in substitute for a vanilla BERT with the exact same number of parameters (110M).

Our approach was empirically proven to be effective on standard ad hoc benchmarks. In terms of explainability, however, much analysis remains to be done to understand how exactly the marking conveys exact match signals to BERT and how they are integrated in the relevance prediction process. To this day, little is understood about the inner workings of BERT and PLMs in general, despite all the efforts put into studying their behavior. Previous research attempted to reveal insights about how BERT "works" in the limited context of passage retrieval, but such studies are lacking when it comes to ranking long documents. Aside from this explainability limitation, our approach is rather simple and considers all query terms to be of equal importance when, in reality, they rarely have the same importance in the query, especially in long descriptions.

For future work, we plan to develop diagnostic tests in an attempt to shed light on the contribution of the exact match marking to the inner workings of BERT. Once the intervention of the markers is determined, their representations could be leveraged for relevance classification in addition to, or instead of, the current standard [CLS] representation. Identifying the subset of queries most likely to be improved by explicit exact match cues could also be used to decide whether to apply marking. Furthermore, our approach could be further improved by integrating query term importance. Finally, other methods may be investigated to better integrate exact match signals into BERT.

7 Conclusion

Pretrained language models perform well on an impressively wide range of tasks. They have proven to excel at semantic matching; nevertheless, exact matching remains essential for relevance ranking. In light of this, we proposed to use marker tokens to convey exact match cues from the textual input, yielding strong performance while maintaining the same architecture, i.e., the same number of parameters. We showed through empirical experiments that using a simple marker combined with pair-level marking is the simplest strategy yielding the best effectiveness. We also showed that applying this marking strategy in a hybrid retrieve-then-rerank pipeline, which uses short keyword queries for the first bag-of-words retriever and long natural language queries for reranking with PLMs like BERT and ELECTRA, produces competitive effectiveness compared to state-of-the-art models. We published our checkpoints fine-tuned on marked data on the HuggingFace model hub so they can be easily used by the community via the popular "transformers" library, without changes to existing setups, while benefiting from the improvements brought by exact match marking and building upon them.