13th Workshop on Building and Using Comparable Corpora

Details for attendance

The workshop takes place as an online event on May 11, 2020, from 9:15 to 16:35 UTC+2 (Central European Summer Time).
Link to attend: https://bbb.limsi.fr/b/pie-636-a4v
The meeting uses BigBlueButton (BBB), free software (http://docs.bigbluebutton.org/) that works through WebRTC, so you only need your Web browser and do not have to install any specific software. It works with Chrome, Firefox, Safari, and the most recent version of Edge (Jan. 2020). On a smartphone or tablet, use your Web browser as well: Chrome, Firefox, or Safari. It is not possible to connect by ordinary telephone on this installation of BBB.
Attendance is free, but please send an e-mail to reinhardrapp (at) gmx (dot) de to obtain further information.

All attendees

For planning purposes, we encourage (but do not require) attendees to announce their intention to participate by e-mail to pz (at) limsi (dot) fr.
Just connect to the above-specified BBB link, enter your given name and family name, and join the meeting.


You can either
Please test your connection and presentation before the workshop. For this purpose, the presentation room is already open to authors ahead of time. Connect to the above-specified BBB link, enter your name, and join or start the meeting. Then click on the "+" button at the bottom left of the window and, if needed, click on "become a presenter".




9:15-9:30 Opening
9:30-10:20 Session 1: Invited Presentation Holger Schwenk, Facebook AI Research

Session 2: Shared task: Bilingual Lexicon Induction from Comparable Corpora
10:20-10:40 Overview of the Fourth BUCC Shared Task: Bilingual Dictionary Induction from Comparable Corpora
Reinhard Rapp, Pierre Zweigenbaum and Serge Sharoff
10:40-11:00 TALN/LS2N Participation at the BUCC Shared Task: Bilingual Dictionary Induction from Comparable Corpora
Martin Laville, Amir Hazem and Emmanuel Morin
11:00-11:20 Coffee break
11:20-11:40 LMU Bilingual Dictionary Induction System with Word Surface Similarity Scores for BUCC 2020
Silvia Severini, Viktor Hangya, Alexander Fraser and Hinrich Schütze
11:40-12:00 BUCC2020: Bilingual Dictionary Induction using Cross-lingual Embedding
Sanjanasri JP, Vijay Krishna Menon and Soman KP
12:00-13:00 Lunch break
13:00-13:50 Session 3: Invited Presentation Jörg Tiedemann, University of Helsinki

Session 4: Corpus Construction
13:50-14:10 Constructing a Bilingual Corpus of Parallel Tweets
Hamdy Mubarak, Sabit Hassan and Ahmed Abdelali
14:10-14:30 cEnTam: Creation and Validation of a New English-Tamil Bilingual Corpus
Sanjanasri JP, Premjith B, Vijay Krishna Menon and Soman KP
14:30-14:50 Coffee break

Session 5: Semantics
14:50-15:10 Automatic Creation of Correspondence Table of Meaning Tags from Two Dictionaries in One Language Using Bilingual Word Embedding
Teruo Hirabayashi, Kanako Komiya, Masayuki Asahara and Hiroyuki Shinnou
15:10-15:30 Mining Semantic Relations from Comparable Corpora through Intersections of Word Embeddings
Špela Vintar, Larisa Grčić Simeunović, Matej Martinc, Senja Pollak and Uroš Stepišnik

Session 6: Machine Translation
15:30-15:50 Benchmarking Multidomain English-Indonesian Machine Translation
Tri Wahyu Guntara, Alham Fikri Aji and Radityo Eko Prasojo
15:50-16:10 Reducing the Search Space for Parallel Sentences in Comparable Corpora
Rémi Cardon and Natalia Grabar
16:10-16:30 Line-a-line: A Tool for Annotating Word-Alignments
Maria Skeppstedt, Magnus Ahltorp, Gunnar Eriksson and Rickard Domeij
16:30-16:35 Closing