SIGIR'98 papers: A Study on Retrospective and On-Line Event Detection
A Study on Retrospective and On-Line Event Detection
Yiming Yang
School of Computer Science,
Carnegie Mellon University,
Pittsburgh, PA 15213-3702, USA
Jaime Carbonell
School of Computer Science,
Carnegie Mellon University,
Pittsburgh, PA 15213-3702, USA
Thomas Pierce
School of Computer Science,
Carnegie Mellon University,
Pittsburgh, PA 15213-3702, USA
Abstract
This paper investigates the use and extension of text retrieval and
clustering techniques for event detection. The task is to
automatically detect novel events from a temporally-ordered stream of
news stories, either retrospectively or as the stories arrive. We
applied hierarchical and non-hierarchical document clustering
algorithms to a corpus of 15,836 stories, focusing on the exploitation
of both content and temporal information. We found the resulting
cluster hierarchies highly informative for retrospective detection of
previously unidentified events, effectively supporting both query-free
and query-driven retrieval. We also found that the temporal distribution
patterns of document clusters provide useful information for improving
both retrospective and on-line detection of novel events. In an
evaluation using manually labelled events to judge the system-detected
events, we obtained an F_1 score of 82% for retrospective detection and
42% for on-line detection.
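
For readers unfamiliar with the F_1 measure reported above, the following is a minimal sketch of its standard definition, the harmonic mean of precision and recall. The function name `f1_score` and its count-based arguments are illustrative assumptions, not taken from the paper; the abstract does not say how per-event scores were averaged, so the sketch covers a single event only.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Return F_1 = 2PR / (P + R) for one event, given raw counts.

    Hypothetical helper for illustration; the counts compare the stories
    a system assigned to an event against the manually labelled stories.
    """
    predicted = true_positives + false_positives  # stories the system assigned
    actual = true_positives + false_negatives     # stories labelled by hand
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because F_1 is a harmonic mean, it stays low unless precision and recall are both high, so a system cannot score well by favouring one at the expense of the other.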
SIGIR'98
24-28 August 1998
Melbourne, Australia.
sigir98@cs.mu.oz.au.