The WebCluster Project. Using Clustering for Mediating Access to the World Wide Web.


Mourad Mechkour
The Robert Gordon University, School of Computer and Mathematical Sciences, Aberdeen, AB25 1HG, UK.

David J. Harper
The Robert Gordon University, School of Computer and Mathematical Sciences, Aberdeen, AB25 1HG, UK.

Gheorghe Muresan
The Robert Gordon University, School of Computer and Mathematical Sciences, Aberdeen, AB25 1HG, UK.


Abstract

This poster presents the WebCluster project, in which we propose and implement an innovative approach to improving the effectiveness of information retrieval on the World Wide Web. The approach combines mediated access to the WWW with document clustering. A source document collection, which is well structured and specific to the user's domain of interest, acts as a filter on the WWW: it limits the user's view to those documents considered relevant with respect to the collection. The basic techniques used in the project are document clustering, which allows us to extract the inherent structure of the source collection, and several search strategies (cluster-based search, best-match retrieval, and browsing). The software developed is a two-part tool: a clustering framework (CF), which allows users to choose the best clustering method for their document collection, and an end-user interface for accessing the services of the CF (clustering and the different retrieval strategies). WebCluster's final goal is to allow a user or group of users to have their own organization of the WWW and to combine different retrieval strategies (for example, cluster-based search and browsing) through the same interface.
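
The filtering idea can be illustrated with a minimal sketch. It is not the WebCluster implementation: the abstract does not specify the clustering method, the similarity measure, or any thresholds, so the single-pass clustering, the term-frequency cosine similarity, the 0.3 threshold, the function names, and the sample texts below are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-frequency vector (assumed representation)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_collection(docs, threshold=0.3):
    """Single-pass clustering (an assumed method): return one summed term vector per cluster."""
    centroids = []
    for doc in docs:
        vec = vectorize(doc)
        best = max(centroids, key=lambda c: cosine(vec, c), default=None)
        if best is not None and cosine(vec, best) >= threshold:
            best.update(vec)  # Counter.update adds the counts to the cluster vector
        else:
            centroids.append(Counter(vec))  # start a new cluster
    return centroids


def filter_web_documents(web_docs, centroids, threshold=0.3):
    """Keep only web documents similar enough to at least one cluster of the source collection."""
    return [
        doc for doc in web_docs
        if any(cosine(vectorize(doc), c) >= threshold for c in centroids)
    ]


# Hypothetical source collection and web documents, for illustration only.
source_collection = [
    "document clustering groups similar documents",
    "clustering methods for document retrieval",
    "probabilistic models for information retrieval",
]
web_docs = [
    "hierarchical clustering of documents for retrieval",
    "recipe for chocolate cake with butter",
]

centroids = cluster_collection(source_collection)
kept = filter_web_documents(web_docs, centroids)
print(kept)
```

In this sketch the clusters of the source collection play the role of the filter: a web document is shown to the user only if it resembles some part of the collection's structure. A cluster-based search could reuse the same cluster vectors to rank clusters against a query before ranking documents.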


SIGIR'98
24-28 August 1998
Melbourne, Australia.
sigir98@cs.mu.oz.au.