Frontiers in Linguistically Annotated Corpora 2006
A Merged Workshop with
7th International Workshop on Linguistically Interpreted Corpora (LINC-2006)
and
Frontiers in Corpus Annotation III
Coling/ACL 2006
Sydney Convention and Exhibition Centre
Sydney, Australia
July 22, 2006
Background
Large linguistically interpreted corpora play an increasingly important role in machine learning, evaluation, psycholinguistics and theoretical linguistics. Many research groups are engaged in the creation of corpus resources annotated with morphological, syntactic, semantic, discourse and other linguistic information for a variety of languages. In the tradition of previous LINC (http://www.delph-in.net/events/05/linc/) and Frontiers (http://nlp.cs.nyu.edu/meyers/Frontiers_Workshop.html) workshops, we aim to bring together these activities in order to identify and disseminate best practice in the development and utilization of linguistically interpreted corpora.
Goals
The goals of the workshop are two-fold: (1) to exchange and propagate research results with respect to the annotation, conversion and exploitation of corpora, taking into account different applications and theoretical investigations in the field of language technology and research; and (2) to work towards a consensus on issues crucial to the advancement of the field of corpus annotation. In particular, we would like to focus on questions such as:
- How can a system developer take advantage of the multitude of annotation efforts with completely different underlying assumptions, annotation schemata, etc.?
- How might one merge different annotations of the same data into a single unified representation?
- How can closely related schemes be applied across languages?
Working Groups
There will be two invited "working group" presentations. Each working group will consist of a group of researchers with the expressed purpose of laying out the dimensions of some crucial problem facing the field of corpus annotation, particularly problems involving merging annotation and extending annotation to new languages, genres and modalities. There are currently two working groups:
- Annotation Compatibility: A roadmap of the compatibility of current annotation schemes with each other.
- Low-density Languages: A discussion of low-density languages and the problems associated with them.
We will attempt to lay out clearly and precisely the assumptions on such topics held by members of the annotation community and in doing so, we hope to both: (1) lay the foundations for the meaningful integration of annotation resources; and (2) assess the limitations of integrated approaches. See here for progress of each of the working groups.
Student Award
Václav Novák was chosen as the recipient of the Innovative Student Annotation Award. Congratulations, Václav!
Target Audience
Those interested in creating and using existing and future annotated corpora. This includes annotators, lexicographers, system developers and those designing NLP system evaluation tasks for the NLP community.
Programme
09.00 - 09.10 | Opening remarks |
09.10 - 09.30 | Challenges for annotating images for sense disambiguation | Cecilia Ovesdotter Alm, Nicolas Loeff and David A. Forsyth [SLIDES]
09.30 - 10.00 | A Semi-Automatic Method for Annotating a Biomedical Proposition Bank | Wen-Chi Chou, Richard Tzong-Han Tsai, Ying-Shan Su, Wei Ku, Ting-Yi Sung and Wen-Lian Hsu
10.00 - 10.30 | How and Where do People Fail with Time: Temporal Reference Mapping Annotation by Chinese and English Bilinguals | Yang Ye and Steven Abney
10.30 - 11.00 | Coffee Break |
11.00 - 11.30 | Probing the space of grammatical variation: induction of cross-lingual grammatical constraints from treebanks | Felice Dell'Orletta, Alessandro Lenci, Simonetta Montemagni and Vito Pirrelli
11.30 - 12.00 | Low-density Languages Working Group presentation [SLIDES] |
12.00 - 12.30 | Annotation Compatibility Working Group presentation [SLIDES] |
12.30 - 14.00 | Lunch |
14.00 - 14.30 | Manual Annotation of Opinion Categories in Meetings | Swapna Somasundaran, Janyce Wiebe, Paul Hoffmann and Diane Litman [SLIDES]
14.30 - 15.00 | The Hinoki Sensebank — A Large-Scale Word Sense Tagged Corpus of Japanese — | Takaaki Tanaka, Francis Bond and Sanae Fujita [SLIDES]
15.00 - 15.30 | Issues in Synchronizing the English Treebank and PropBank | Olga Babko-Malaya, Ann Bies, Ann Taylor, Szuting Yi, Martha Palmer, Mitch Marcus, Seth Kulick and Libin Shen
15.30 - 16.00 | Coffee Break |
16.00 - 16.30 | On Distance between Deep Syntax and Semantic Representation | Václav Novák
16.30 - 17.30 | Discussion |
17.30 - 17.40 | Closing Remarks |
Alternate Papers
Corpus annotation by generation | Elke Teich, John A. Bateman and Richard Eckart
Constructing an English Valency Lexicon | Jiri Semecky and Silvie Cinkova
Language
All papers will be presented in English.
Workshop Chairs
- Adam Meyers (New York University, USA)
- Shigeko Nariyama (University of Melbourne, Australia)
- Timothy Baldwin (University of Melbourne, Australia)
- Francis Bond (NTT Communication Science Laboratories, Japan)
Programme Committee
- Lars Ahrenberg (Linköpings Universitet)
- Kathy Baker (U.S. Dept. of Defense)
- Steven Bird (University of Melbourne)
- Alex Chengyu Fang (City University Hong Kong)
- David Farwell (Computing Research Laboratory, New Mexico State University)
- Chuck Fillmore (International Computer Science Institute, Berkeley)
- Anette Frank (DFKI)
- John Fry (SRI International)
- Eva Hajicova (Center for Computational Linguistics, Charles University, Prague)
- Erhard W. Hinrichs (University of Tübingen)
- Ed Hovy (Information Sciences Institute)
- Baden Hughes (University of Melbourne)
- Emi Izumi (NICT)
- Tsai Jia-Lin (Tung Nan Institute of Technology)
- Aravind Joshi (University of Pennsylvania, Philadelphia)
- Sergei Nirenburg (University of Maryland, Baltimore County)
- Stephan Oepen (University of Oslo)
- Boyan A. Onyshkevych (U.S. Dept. of Defense)
- Kyonghee Paik (KLI)
- Martha Palmer (University of Colorado)
- Gerald Penn (University of Toronto)
- Manfred Pinkal (DFKI)
- Massimo Poesio (University of Essex)
- James Pustejovsky (Brandeis University)
- Owen Rambow (Columbia University)
- Peter Rossen Skadhauge (Copenhagen Business School)
- Beth Sundheim (SPAWAR Systems Center)
- Janyce Wiebe (University of Pittsburgh)
- Nianwen Xue (University of Pennsylvania)