7th Workshop on Building and Using Comparable Corpora; Co-located with LREC 2014

BUCC, 7th Workshop on Building and Using Comparable Corpora
Building Resources for Machine Translation Research

Co-located with LREC 2014
Reykjavik (Iceland)
27 May 2014

http://comparable.limsi.fr/bucc2014/
Submission site: https://www.softconf.com/lrec2014/BUCC2014/
Journal special issue: JNLE
Invited speaker: Chris Callison-Burch, University of Pennsylvania

TOPICS

We solicit contributions including but not limited to the following topics.

Topics related to the special theme:

Methods and tools for collecting and processing MT data, including crowdsourcing
Methods and tools for quality control
Tools for efficient annotation
Bilingual term and named entity collections
Multilingual treebanks, wordnets, propbanks, etc.
Comparable corpora with parallel units annotated
Comparable corpora for under-resourced languages and specific domains
Multilingual corpora with rich annotations: POS tags, NEs, dependencies, semantic roles, etc.
Data for special applications: patent translation, movie subtitles, MOOCs, meetings, chat-rooms, social media, etc.
Legal issues with collecting and redistributing data and generating derivatives

Building Comparable Corpora:

Human translations
Automatic and semi-automatic methods
Methods to mine parallel and non-parallel corpora from the Web
Tools and criteria to evaluate the comparability of corpora
Parallel vs non-parallel corpora, monolingual corpora
Rare and minority languages, across language families
Multi-media/multi-modal comparable corpora

Applications of comparable corpora:

Human translations
Language learning
Cross-language information retrieval & document categorization
Bilingual projections
Machine translation
Writing assistance

Mining from Comparable Corpora:

Extraction of parallel segments or paraphrases from comparable corpora
Extraction of bilingual and multilingual translations of single words and multi-word expressions; proper names, named entities, etc.

Note that an edited book “Building and Using Comparable Corpora” has just been published by Springer.

Chapter 1, “Overviewing Important Aspects of the Last 20 Years of Research in Comparable Corpora”, an introduction and state of the art on the topic, is now freely available on Springer’s Web site.

IMPORTANT DATES

23 February 2014	Deadline for submission of full papers
10 March 2014	Notification of acceptance
27 March 2014	Camera-ready papers due
27 May 2014	Workshop date

SUBMISSION INFORMATION

Papers should follow the LREC main conference formatting guidelines at http://lrec2014.lrec-conf.org/en/submission/authors-kit/ and should be submitted as a PDF file via the START workshop manager at https://www.softconf.com/lrec2014/BUCC2014/.

Contributions can be short or long papers. Short paper submissions must describe original and unpublished work and must not exceed six (6) pages. Characteristics of short papers include: a small, focused contribution; work in progress; a negative result; an opinion piece; an interesting application nugget. Long paper submissions must describe substantial, original, completed and unpublished work and must not exceed ten (10) pages.

Reviewing will be double-blind, so papers should not reveal the authors’ identities. Accepted papers will be published in the workshop proceedings.

Double submission policy: Parallel submission to other meetings or publications is possible, but the workshop organizers must be notified immediately.

When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC authors to share the described language resources (data, tools, services, etc.) to enable their reuse and the replicability of experiments, including evaluation experiments.

For further information, please contact Pierre Zweigenbaum: pz(erase_at)limsi(erase_dot)fr

Plain-text CFP: bucc2014-cfp.txt
PDF CFP: bucc2014-cfp.pdf
Last modified: 12 Jul 2014

JOURNAL SPECIAL ISSUE

Authors of selected papers will be encouraged to submit substantially extended versions of their manuscripts to an upcoming special issue on “Machine Translation Using Comparable Corpora” of the Journal of Natural Language Engineering.
