SIGIR'98 posters: Automatic acquisition of terminological relations from a corpus for query expansion

Sta Jean-David
EDF Electricite de France
Direction des Etudes et Recherches
1, av. du General De Gaulle
Clamart 92141 France


One common means of query expansion is to add related terms from a thesaurus to the terms of the query. How effective this technique is depends on whether the thesaurus contains the relevant terminological relations. The experiment described here assesses how well various statistical measures of association between terms highlight terminological relations. Using a corpus of over 6,000 documents, the pairwise associations among more than 5,000 terms were compared with the 70,000 terminological relations of a thesaurus. The measure based on context similarity proved the most effective.
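As a rough illustration of the idea, the sketch below scores term pairs by the similarity of their contexts and checks the top-ranked pairs against a thesaurus. The abstract does not specify how the measures were computed, so everything here is an assumption: a term's context is taken to be the set of terms it co-occurs with in a document, similarity is taken to be the cosine of those co-occurrence vectors, and the documents, terms, and thesaurus entry are invented toy data.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence_counts(docs):
    """Count, for each unordered term pair, the documents containing both terms."""
    counts = Counter()
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


def context_vectors(docs):
    """Map each term to a Counter of the terms it co-occurs with (its context)."""
    vectors = {}
    for (a, b), n in cooccurrence_counts(docs).items():
        vectors.setdefault(a, Counter())[b] += n
        vectors.setdefault(b, Counter())[a] += n
    return vectors


def context_similarity(u, v):
    """Cosine similarity between two context vectors (assumed measure)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values()) * sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def rank_pairs(docs):
    """Rank all term pairs by decreasing context similarity."""
    vectors = context_vectors(docs)
    scored = [
        (context_similarity(vectors[a], vectors[b]), a, b)
        for a, b in combinations(sorted(vectors), 2)
    ]
    return sorted(scored, reverse=True)


def precision_at_k(ranked, thesaurus, k):
    """Fraction of the k best-scored pairs that are thesaurus relations."""
    hits = sum(1 for _, a, b in ranked[:k] if frozenset((a, b)) in thesaurus)
    return hits / k if k else 0.0


# Hypothetical toy corpus: each document is a list of indexed terms.
docs = [
    ["transformer", "voltage", "network"],
    ["alternator", "voltage", "network"],
    ["transformer", "cooling"],
    ["alternator", "cooling"],
]
# Hypothetical thesaurus: a set of unordered related-term pairs.
thesaurus = {frozenset(("transformer", "alternator"))}

ranked = rank_pairs(docs)
for score, a, b in ranked[:3]:
    print(f"{a} / {b}: {score:.3f}")
print("precision@1:", precision_at_k(ranked, thesaurus, 1))
```

In this toy corpus the two related terms never appear in the same document, so a direct co-occurrence count gives them no association at all, while their contexts are identical. This is the kind of case in which a context-based measure can surface a relation that plain co-occurrence misses.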