BUCC, 13th Workshop on Building and Using Comparable Corpora
BUCC 2020 is held online, live
The workshop takes place as an online event on May 11 from 9:15 to 16:35 UTC+2 (Central European Summer Time).
See the Program page for details. Attendance is free, but please send an e-mail to reinhardrapp (at) gmx (dot) de to obtain further information.
TOPICS
We solicit contributions on all topics related to comparable corpora, including but not limited to the following:
- Automatic and semi-automatic methods
- Methods to mine parallel and non-parallel corpora from the Web
- Tools and criteria to evaluate the comparability of corpora
- Parallel vs non-parallel corpora, monolingual corpora
- Rare and minority languages, across language families
- Multi-media/multi-modal comparable corpora
- Human translations
- Language learning
- Cross-language information retrieval & document categorization
- Bilingual projections
- Machine translation
- Writing assistance
- Machine learning techniques using comparable corpora
- Induction of morphological, grammatical, and translation rules from comparable corpora
- Extraction of parallel segments or paraphrases from comparable corpora
- Extraction of bilingual and multilingual translations of single words and multi-word expressions, proper names, and named entities from comparable corpora
- Induction of multilingual word classes from comparable corpora
- Cross-language distributional semantics and word embeddings
IMPORTANT DATES
SUBMISSION INFORMATION
Please follow the style sheet and templates provided for the main conference at https://lrec2020.lrec-conf.org/en/submission2020/authors-kit/. Papers should be submitted as a PDF file at https://www.softconf.com/lrec2020/BUCC2020/. Submissions must describe original and unpublished work and range from four (4) to eight (8) pages plus unlimited references.
Reviewing will be double blind, so the papers should not reveal the authors' identity. Accepted papers will be published in the workshop proceedings.
Double submission policy: Parallel submission to other meetings or publications is possible, but the workshop organizers must be notified immediately.
Information from the LREC organizers
Please make sure that your papers take into account the following information about the LRE Map, the “Share your LRs!” initiative and the ISLRN number.
Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.
As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2020 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.