Most of today's state-of-the-art retrieval models, including BM25 and language modeling, are grounded in probabilistic principles. A working understanding of these principles can help researchers better understand existing retrieval models and show industrial practitioners how such models can be applied to real-world problems.
This half-day tutorial will cover the fundamentals of two dominant probabilistic frameworks for Information Retrieval: the classical probabilistic model and the language modeling approach. The elements of the classical framework will include the probability ranking principle, the binary independence model, the 2-Poisson model, and the widely used BM25 model. Within the language modeling framework, we will discuss various distributional assumptions and smoothing techniques. Special attention will be devoted to the event spaces and independence assumptions underlying each approach. The tutorial will outline several techniques for modeling term dependence and addressing vocabulary mismatch. We will also survey applications of probabilistic models in the domains of cross-language and multimedia retrieval. The tutorial will conclude by suggesting a set of open problems in probabilistic models of IR.
Attendees should have a basic familiarity with probability and statistics. A brief refresher on basic concepts, including random variables, event spaces, conditional probabilities, and independence, will be given at the beginning of the tutorial. In addition to slides, hands-on exercises and examples will be used throughout the tutorial.
|Title of host publication||SIGIR 2010: PROCEEDINGS OF THE 33RD ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL|
|Editors||HH Chen, EN Efthimiadis, J Savoy, F Crestani, S Marchand-Maillet|
|Place of Publication||NEW YORK|
|Publisher||ASSOC COMPUTING MACHINERY|
|Number of pages||1|
|Publication status||Published - 2010|
|Event||33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - Geneva|
Duration: 19 Jul 2010 → 23 Jul 2010
|Conference||33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval|
|Period||19/07/10 → 23/07/10|