A Framework for Information Discovery Systems


Daryl J. D'Souza
Department of Computer Science, RMIT, GPO Box 2476V, Melbourne 3001, Australia.
djds@cs.rmit.edu.au

James A. Thom
Department of Computer Science, RMIT, GPO Box 2476V, Melbourne 3001, Australia.
jat@cs.rmit.edu.au


Abstract

Information discovery systems range from controlled, static, topic-specific collections of homogeneous documents to uncontrolled, highly dynamic collections of heterogeneous documents. Techniques are required to dispatch user queries to the collections most likely to contain documents that satisfy the user's information need. Dispatching queries in this way to find relevant documents is the information discovery problem. This paper surveys scalable solutions that have been developed for this problem. To varying degrees, these developments have contributed to query dispatch and document retrieval in uncontrolled, heterogeneous information discovery systems. Research in this area is motivated by the need for associative access to documents on the Internet, which is experiencing rapid growth in user base, data volume and information diversity. We propose an abstract framework that captures the main components of information discovery systems and into which existing systems may be mapped. Within the context of this framework, we then suggest some related research problems.