SIGIR'98 papers: Experiments in Japanese Text Retrieval and Routing using the NEAT System


Gareth J. F. Jones
Research and Development Center, Toshiba Corporation, Kawasaki 210-8582, Japan

Tetsuya Sakai
Research and Development Center, Toshiba Corporation, Kawasaki 210-8582, Japan

Masahiro Kajiura
Research and Development Center, Toshiba Corporation, Kawasaki 210-8582, Japan

Kazuo Sumita
Research and Development Center, Toshiba Corporation, Kawasaki 210-8582, Japan


Abstract

This paper describes a structured investigation into the retrieval of Japanese text. The study compares different indexing strategies for documents and queries, investigates term weighting strategies developed principally for English text, and applies relevance feedback for query expansion. Results on the standard BMIR-J1 and BMIR-J2 Japanese retrieval collections indicate that term weighting transfers well to Japanese text. Indexing by dictionary-based morphological analysis and indexing by character strings are each shown to be effective individually, and marginally more effective in combination. We also demonstrate that relevance feedback can be used effectively for query expansion in Japanese routing applications.


SIGIR'98
24-28 August 1998
Melbourne, Australia.
sigir98@cs.mu.oz.au.