15th String Processing and Information Retrieval Symposium
Melbourne, Australia, November 10-12

Invited Speakers

Gad Landau

Photo of Gad Landau, Invited Speaker at SPIRE 2008

Gad M. Landau received his Ph.D. in Computer Science from the University of Tel Aviv, Israel, in 1987. He is a Professor in the Department of Computer Science at Haifa University, Israel, and a Research Professor in the Department of Computer and Information Science at Polytechnic University, NY, USA. He has served as head of both departments. Landau has co-authored over 100 refereed papers in the areas of String Algorithms, Computational Biology and Data Structures. He was Co-Chair of the program committees of CPM 2001 and 2008, and is an editor of the Journal of Discrete Algorithms.

David Hawking

Photo of David Hawking, Invited Speaker at SPIRE 2008

Dr. David Hawking is chief scientist at the internet and enterprise search company Funnelback Pty Ltd, a CSIRO spinoff based in Canberra, Australia. Funnelback search technology has won a number of awards and now supports around 100 customers in Australia, Britain and Canada, mostly in government, education, finance and careers.

David is also an adjunct professor at the Australian National University and supervises PhD students at ANU and the University of Sydney. He is an author of around 50 refereed publications and was a coordinator of the Web track at the international Text Retrieval Conference (TREC) from 1997 to 2004. In this role he was responsible for the creation and distribution of text retrieval benchmark collections now in use at over 120 research organisations worldwide. He was a program chair of the ACM SIGIR conference in 2003 and 2006.

In 2003 he was awarded an honorary doctorate by the University of Neuchâtel in Switzerland for his contributions to the objective evaluation of search quality. He won the Chris Wallace award for contribution to computer science research in Australasia, for the years 2001-2003.

In his spare time he decodes obscure file formats, programs in PostScript, procrastinates over home maintenance and plays netball and Ultimate.

Homepage and publications list

Funnelback


Abstracts

Gad Landau, Approximate Runs - Revisited

The problem of finding repeats within a string is an important computational problem with applications in data compression and in the field of molecular biology. Both exact and inexact repeats occur frequently in the genome, and certain repeats are known to be related to human diseases.

A multiple tandem repeat in a sequence S is a (periodic) substring r of S of the form r = u^a u', where u (the period) is a prefix of r, u' is a prefix of u and a >= 2. A run is a maximal (non-extendable) multiple tandem repeat. An approximate run is a run with errors (i.e. the repeated subsequences are similar but not identical).
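To make the definition of an exact run concrete, here is a minimal brute-force sketch in Python. It is not the method presented in the talk; the function names `smallest_period` and `find_runs` are hypothetical, and the code assumes the usual convention that a run is reported with its smallest period. It runs in polynomial time but is far slower than the algorithms known for this problem.

```python
def smallest_period(s):
    """Return the smallest p such that s[i] == s[i + p] for all valid i."""
    n = len(s)
    for p in range(1, n + 1):
        if all(s[i] == s[i + p] for i in range(n - p)):
            return p
    return n


def find_runs(s):
    """Return all exact runs of s as (start, end, period), end exclusive.

    Illustrative brute force: for each candidate period p, collect maximal
    stretches of positions i with s[i] == s[i + p]. A stretch covering
    s[start:end] is a run if it spans at least two full periods (a >= 2)
    and p is the smallest period of that substring.
    """
    n = len(s)
    runs = []
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if s[i] != s[i + p]:
                i += 1
                continue
            start = i
            # Extend to the right while the periodicity holds; the loop over
            # i guarantees the stretch cannot be extended to the left either.
            while i + p < n and s[i] == s[i + p]:
                i += 1
            end = i + p  # exclusive end of the periodic substring
            if end - start >= 2 * p and smallest_period(s[start:end]) == p:
                runs.append((start, end, p))
    return sorted(runs)


if __name__ == "__main__":
    for start, end, p in find_runs("mississippi"):
        print(start, end, p, "mississippi"[start:end])
```

For "mississippi" this reports the run "ississi" with period u = "iss" (a = 2, u' = "i"), together with the three short runs "ss", "ss" and "pp" of period 1.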

Many measures have been proposed that capture the similarity among all periods. We may measure the number of errors between consecutive periods, between all periods, or between each period and a consensus string. Another possible measure is the number of positions in the periods that may differ.
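The four kinds of measure listed above can be illustrated with a small Python sketch. It is only one possible reading: it assumes the periods have already been cut out as equal-length strings, it counts errors as Hamming mismatches (the talk may well use other error models, such as edit distance), and it takes the consensus to be the column-wise majority string. The names `hamming` and `period_measures` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def period_measures(periods):
    """Compute four illustrative error measures for equal-length periods."""
    columns = list(zip(*periods))
    # Assumed consensus: the most common character in each column.
    consensus = "".join(Counter(col).most_common(1)[0][0] for col in columns)
    return {
        # Errors between each period and the next one.
        "consecutive": sum(hamming(a, b) for a, b in zip(periods, periods[1:])),
        # Errors summed over every pair of periods.
        "all_pairs": sum(hamming(a, b) for a, b in combinations(periods, 2)),
        # Errors between each period and the consensus string.
        "to_consensus": sum(hamming(p, consensus) for p in periods),
        # Number of positions at which the periods are not all identical.
        "differing_positions": sum(len(set(col)) > 1 for col in columns),
    }


if __name__ == "__main__":
    print(period_measures(["acgt", "acgt", "aggt", "acga"]))
```

For the periods "acgt", "acgt", "aggt", "acga" the sketch gives 3 errors between consecutive periods, 6 over all pairs, 2 against the consensus "acgt", and 2 differing positions, which shows how the measures can rank the same approximate run differently.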

In this talk I will survey a range of our results in this area. Various parts of this work are joint work with Maxime Crochemore, Gene Myers, Jeanette Schmidt and Dina Sokol.


David Hawking, "Search is a Solved Problem" and other annoying fallacies

Since Google became a celebrity in the early noughties, many people with the power to control and direct research resources have taken the view that there is no more research to be done on the problem of information retrieval. In reality, there are so many variants of "the search problem" that not all have been catalogued, and few have been solved to the point where we can rely absolutely on the quality of results. Apparently no-one told the Web search companies that the problem was solved as, since that time, they have researched and developed a range of new search facilities and invested heavily in improving their basic products. Google, Yahoo! and Microsoft all maintain search R&D teams much larger than the biggest university computer science departments!

Through my involvement with the Funnelback internet and enterprise search company I have worked on many twists on the information retrieval problem which are not modelled in well-known test collections, and not encountered in basic Web search. In my talk I will try to outline some of the issues in trying to apply information retrieval and string processing theory in commercial practice.

Sponsors