SIGIR'98 posters: Automatic Acquisition of Phrasal Knowledge for English-Chinese Bilingual Information Retrieval

Automatic Acquisition of Phrasal Knowledge for English-Chinese Bilingual Information Retrieval


Ming-Jer Lee
Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC

Lee-Feng Chien
Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC


Abstract

Extracting phrasal knowledge, such as proper names, domain-specific keyphrases, and lexical templates, from a domain-specific text collection is important for developing effective information retrieval systems for the Internet. In this paper, we introduce our ongoing research on automatic phrasal knowledge acquisition for English-Chinese bilingual texts. The underlying techniques consist of adaptive keyphrase extraction, lexical template extraction, phrase translation extraction, and high-order Markov language model construction. In addition to improving retrieval effectiveness, IR systems based on these techniques are expected to perform much better in many respects, such as automatic term suggestion, information filtering, text classification, and cross-language information retrieval.
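
As an illustration of the last technique named above, a high-order Markov language model can be sketched as a table of counts over fixed-length contexts. The sketch below is a generic maximum-likelihood n-gram model and is not taken from the system described in this abstract; the function names `train_markov` and `prob`, the whitespace tokenization, and the toy sentence are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_markov(tokens, order=2):
    """Count how often each token follows each context of `order` tokens."""
    counts = defaultdict(Counter)
    for i in range(order, len(tokens)):
        context = tuple(tokens[i - order:i])
        counts[context][tokens[i]] += 1
    return counts


def prob(counts, context, token):
    """Maximum-likelihood estimate of P(token | context); 0.0 if the context is unseen."""
    followers = counts.get(tuple(context))
    if not followers:
        return 0.0
    return followers[token] / sum(followers.values())


# Toy data (hypothetical): the context "text retrieval" is followed once by
# "system" and once by "model".
tokens = "text retrieval system text retrieval model".split()
model = train_markov(tokens, order=2)
p_system = prob(model, ["text", "retrieval"], "system")
print(p_system)
```

A practical model would need smoothing for unseen contexts, and for Chinese text the tokens would more likely be characters or segmented words than whitespace-separated strings.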


SIGIR'98
24-28 August 1998
Melbourne, Australia.
sigir98@cs.mu.oz.au.