21st Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
Melbourne, Australia
August 24 - 28, 1998
TUTORIAL T2
Models in Information Retrieval
Presenter
Fredric C. Gey,
UC Berkeley
Time
Monday 24 August, 9:00am--12:30pm.
Location
Melbourne Town Hall, Swanston Street, Melbourne.
Description
The three major theoretical models in information retrieval
are Boolean/logic, vector space, and probabilistic. This tutorial
will explain the unique characteristics and problems of each
model and how each model has evolved along different lines.
Modern variants of the basic models will also be explained.
The attendees of this tutorial will obtain a basic understanding
of the major theoretical models upon which modern text retrieval
software is based. The tutorial should provide each participant
with a starting point for further self-education.
Schedule
15 min
Background and historical development
Luhn and statistical text characteristics
Statistical weights and the IDF concept
45 min
Boolean set and logic models
Fuzzy logic (RUBRIC/TOPIC)
Weighted Boolean and P-Norm (INQUERY)
Recent logic models
45 min
Vector space and geometric models
Basic vector similarity measures
Generalized vector space model
Latent Semantic Indexing
Pivoted normalization similarity
45 min
Probabilistic models
Probabilistic indexing and querying
2-Poisson and Okapi
Relevance weights and relevance feedback
Inference nets and neural network approaches
Regression models
15 min
Performance measurement and analysis
Recall, precision, fallout measures
Limitations to performance assessment: interjudge consistency, completeness
Statistical significance tests
Materials: 110 course overheads will be provided, as well as a
4-page bibliography of the references covered.
Audience
This course is designed to provide a
fast-paced yet rigorous introduction to the basic models of
Information Retrieval for academic and industrial research and
development computer scientists whose background lies outside
the Information Retrieval area.
Biography of presenter
Fredric Gey's research specializes
in probabilistic document retrieval using logistic regression
techniques. He directs the UC Berkeley entries to the TREC
conferences, and will be the General Chairman for
SIGIR99 to be held at the University of California, Berkeley
during the summer of 1999. He holds a PhD in Information
Science from UC Berkeley.
Cost
The charge for registration is $A150 per tutorial.
Registrants will receive a copy of the
notes for the tutorial, and morning/afternoon tea.
All tutorials are offered on an only-if-demand-warrants basis,
and full refunds will be given for tutorials
that are cancelled because of low enrolments.
Tutorial notes will also be available for sale on an individual
basis at the conference registration desk.
sigir98@cs.mu.oz.au,
27 April 1998.