14th Workshop on Building and Using Comparable Corpora
DETAILS FOR ATTENDANCE
The workshop will take place online through Zoom. Details for participation are provided on the main conference Web site. This is the Zoom link to connect to the workshop. In the unlikely case of unforeseen problems with the Zoom session, a new link will be provided here.
The workshop proceedings (full PDF; full list of BibTeX entries) are available on the ACL Anthology. Direct links to each paper's ACL Anthology page, PDF, and BibTeX entry are also given below.
Programme
All times are in UTC+0. For a time zone converter and time difference calculator, see e.g. https://www.timeanddate.com/worldclock/converter.html.
Here are a few examples of time conversions for the workshop's starting and closing times:
 | UTC-7: San Francisco | UTC-4: Baltimore | UTC+0: Reykjavik | UTC+1: Dartmouth, Dublin, Leeds | UTC+2: Antwerp, Barcelona, Göttingen, Mainz, Munich, Paris, Prague | UTC+3: Varna | UTC+5:30: Kochi, Mumbai | UTC+9: Fukuoka |
---|---|---|---|---|---|---|---|---|
Starting | 1:00 (am) | 4:00 (am) | 8:00 | 9:00 | 10:00 | 11:00 | 13:30 | 17:00 |
Closing | 9:00 (am) | 12:00 (noon) | 16:00 | 17:00 | 18:00 | 19:00 | 21:30 | 1:00 (am) |
8:00-8:05 | Welcome |
8:05-9:00 | Invited presentation: Machine Translation in Low Resource Setting [PDF] [BIB] Pushpak Bhattacharyya |
9:00-9:25 | EM Corpus: a comparable corpus for a less-resourced language pair Manipuri-English [PDF] [BIB] Rudali Huidrom, Yves Lepage and Khogendra Khomdram |
9:25-9:40 | Coffee break |
9:40-10:05 | Mining Bilingual Word Pairs from Comparable Corpus using Apache Spark Framework [PDF] [BIB] Sanjanasri JP, Vijay Krishna Menon, Soman KP and Krzysztof Wolk |
10:05-10:30 | Employing Wikipedia as a resource for Named Entity Recognition in Morphologically complex under-resourced languages [PDF] [BIB] Aravind Krishnan, Stefan Ziehe, Franziska Pannach and Caroline Sporleder |
10:30-10:55 | Semi-Automated Labeling of Requirement Datasets for Relation Extraction [PDF] [BIB] Jeremias Bohn, Jannik Fischbach, Martin Schmitt, Hinrich Schütze and Andreas Vogelsang |
10:55-11:20 | A Dutch Dataset for Cross-lingual Multilabel Toxicity Detection [PDF] [BIB] Ben Burtenshaw and Mike Kestemont |
11:20-12:10 | Lunch break |
12:10-13:05 | Invited presentation: Language modeling and AI Tomas Mikolov |
13:05-13:30 | Syntax-aware Transformers for Neural Machine Translation: The Case of Text to Sign Gloss Translation [PDF] [BIB] Santiago Egea Gómez, Euan McGill and Horacio Saggion |
13:30-13:55 | Effective Bitext Extraction from Comparable Corpora Using a Combination of Three Different Approaches [PDF] [BIB] Steinþór Steingrímsson, Pintu Lohar, Hrafn Loftsson and Andy Way |
13:55-14:10 | Coffee break |
14:10-14:35 | Majority Voting with Bidirectional Pre-translation For Bitext Retrieval [PDF] [BIB] Alexander G. Jones and Derry Tanti Wijaya |
14:35-15:00 | On Pronunciations in Wiktionary: Extraction and Experiments on Multilingual Syllabification and Stress Prediction [PDF] [BIB] Winston Wu and David Yarowsky |
15:00-15:55 | Invited presentation: Large-scale Deep Learning for Low-Resource AI Sujith Ravi |
15:55-16:00 | Closing |