The WebCluster Project. Using Clustering for Mediating Access to the
World Wide Web.
Mourad Mechkour
The Robert Gordon University,
School of Computer and Mathematical Sciences,
Aberdeen, AB25 1HG, UK.
David J. Harper
The Robert Gordon University,
School of Computer and Mathematical Sciences,
Aberdeen, AB25 1HG, UK.
Gheorghe Muresan
The Robert Gordon University,
School of Computer and Mathematical Sciences,
Aberdeen, AB25 1HG, UK.
Abstract
This poster presents the WebCluster project, in which we propose and
implement an innovative approach to improving the effectiveness of
information retrieval on the World Wide Web. Our approach combines
mediated access to the WWW with document clustering. We use a source
document collection, which is well structured and specific to the
user's domain of interest, as a filter on the WWW. This filter limits
the user's view to those documents considered relevant with respect to
the source collection. The basic techniques used
in this project are document clustering, which allows us to extract the
inherent structure of the source collection, and different search
strategies (cluster-based search, best-match retrieval, and browsing).
The software developed is a two-part tool: a clustering framework (CF),
which allows users to choose the best clustering method for their
document collection, and an end-user interface for accessing the
services of the CF (clustering and the different retrieval strategies).
WebCluster's final goal is to allow a user or group of users to have
their own organization of the WWW and to combine different retrieval
strategies (for example, cluster-based search or browsing) within the
same interface.
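
The filtering idea described in the abstract can be illustrated with a
minimal sketch. It is not the WebCluster implementation: the abstract
does not name a clustering method or similarity measure, so single-pass
clustering, cosine similarity over raw term counts, the 0.3 threshold,
and all function names (tokenize, single_pass_cluster, mediate) are
assumptions made purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(docs, threshold=0.3):
    """Assumed clustering step: put each source document into the most
    similar existing cluster, or start a new cluster if none is close."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [index]}
    for i, doc in enumerate(docs):
        vec = Counter(tokenize(doc))
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Summed counts point in the same direction as the mean vector.
            best["centroid"].update(vec)
            best["members"].append(i)
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return clusters


def mediate(web_docs, clusters, threshold=0.3):
    """Assumed filter step: keep a web document only if it is similar
    enough to some cluster of the source collection."""
    kept = []
    for doc in web_docs:
        vec = Counter(tokenize(doc))
        sims = [cosine(vec, c["centroid"]) for c in clusters]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] >= threshold:
            kept.append((doc, best))  # document plus matching cluster index
    return kept


# Toy source collection covering two domains of interest.
source = [
    "information retrieval ranking documents query",
    "document clustering retrieval query ranking",
    "football match goal league",
    "league football season goal",
]
clusters = single_pass_cluster(source)

# Toy stand-ins for documents fetched from the web.
web = [
    "query ranking for retrieval systems",
    "cooking pasta recipes tonight",
    "football league results",
]
visible = mediate(web, clusters)
```

In this toy run the source collection yields two clusters, and the
unrelated cooking page is filtered out of the user's view while the
other two pages are attached to their nearest cluster. A cluster-based
search or browsing interface could then present the retained documents
grouped by cluster.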
SIGIR'98
24-28 August 1998
Melbourne, Australia.
sigir98@cs.mu.oz.au.