The BNB Distribution for Text Modeling
- Cite this paper as:
- Clinchant S., Gaussier E. (2008) The BNB Distribution for Text Modeling. In: Macdonald C., Ounis I., Plachouras V., Ruthven I., White R.W. (eds) Advances in Information Retrieval. ECIR 2008. Lecture Notes in Computer Science, vol 4956. Springer, Berlin, Heidelberg
In this paper, we first review the phenomena of burstiness and aftereffect of future sampling, and propose a formal, operational criterion for characterizing distributions with respect to them. We then introduce the Beta negative binomial (BNB) distribution for text modeling and show its relations to several models, in particular the Laplace law of succession and the tf-itf model used in the Divergence from Randomness framework. Finally, we illustrate the behavior of this distribution in text categorization and information retrieval experiments.
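To make the distribution named above concrete, the following is a minimal Python sketch of a Beta negative binomial probability mass function. It assumes one common parameterization, which may differ from the one used in the paper: a negative binomial count with shape `r` whose success probability is drawn from a Beta(`alpha`, `beta`) prior. The function names and the burstiness check (whether P(X >= n+1 | X >= n) grows with n) are illustrative assumptions, not the paper's own criterion or code.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma for stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bnb_pmf(k, r, alpha, beta):
    """P(X = k) for a Beta negative binomial variable.

    Assumed parameterization (not taken from the paper):
    X | p ~ NegBin(r, p) counts failures before r successes,
    and p ~ Beta(alpha, beta). Integrating p out gives
    Gamma(r + k) / (k! Gamma(r)) * B(alpha + r, beta + k) / B(alpha, beta).
    """
    log_coeff = lgamma(r + k) - lgamma(k + 1) - lgamma(r)
    return exp(log_coeff + log_beta(alpha + r, beta + k) - log_beta(alpha, beta))


def survival(n, r, alpha, beta):
    """P(X >= n), obtained as one minus the mass on 0..n-1."""
    return 1.0 - sum(bnb_pmf(k, r, alpha, beta) for k in range(n))


def continuation_ratios(max_n, r, alpha, beta):
    """P(X >= n+1 | X >= n) for n = 0..max_n-1.

    An increasing sequence is one informal reading of burstiness:
    the more occurrences already seen, the likelier one more becomes.
    """
    return [
        survival(n + 1, r, alpha, beta) / survival(n, r, alpha, beta)
        for n in range(max_n)
    ]


if __name__ == "__main__":
    for n, ratio in enumerate(continuation_ratios(5, r=1.0, alpha=2.0, beta=1.0)):
        print(f"P(X >= {n + 1} | X >= {n}) = {ratio:.4f}")
```

With the illustrative values `r=1`, `alpha=2`, `beta=1`, the mass function reduces to 4 / ((k+1)(k+2)(k+3)) and the conditional probabilities to (n+1) / (n+3), which increase with n. The survival function is computed by subtraction, so the sketch is only suitable for small n.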