The 17th Workshop on Building and Using Comparable Corpora (BUCC)
TOPICS
We solicit contributions on all topics related to comparable (and parallel) corpora, including but not limited to the following:
- Building Comparable Corpora:
  - Automatic and semi-automatic methods
  - Methods to mine parallel and non-parallel corpora from the web
  - Tools and criteria to evaluate the comparability of corpora
  - Parallel vs non-parallel corpora, monolingual corpora
  - Rare and minority languages, across language families
  - Multi-media/multi-modal comparable corpora
- Applications of comparable corpora:
  - Human translation
  - Language learning
  - Cross-language information retrieval & document categorization
  - Bilingual and multilingual projections
  - (Unsupervised) Machine translation
  - Writing assistance
  - Machine learning techniques using comparable corpora
- Mining from Comparable Corpora:
  - Cross-language distributional semantics, word embeddings and pre-trained multilingual transformer models
  - Extraction of parallel segments or paraphrases from comparable corpora
  - Methods to derive parallel from non-parallel corpora (e.g. to provide for low-resource languages in neural machine translation)
  - Extraction of bilingual and multilingual translations of single words, multi-word expressions, proper names, named entities, sentences, and paraphrases from comparable corpora, etc.
  - Induction of morphological, grammatical, and translation rules from comparable corpora
  - Induction of multilingual word classes from comparable corpora
- Comparable Corpora in the Humanities:
  - Comparing linguistic phenomena across languages in contrastive linguistics
  - Analyzing properties of translated language in translation studies
  - Studying language change over time in diachronic linguistics
  - Assigning texts to authors via authors' corpora in forensic linguistics
  - Comparing rhetorical features in discourse analysis
  - Studying cultural differences in sociolinguistics
  - Analyzing language universals in typological research
IMPORTANT DATES
Deadlines are “anywhere on Earth.”
- 6 Mar 2024: Extended paper submission deadline
- 25 Mar 2024: Notification of acceptance
- 7 Apr 2024: Camera-ready final papers
- 20 May 2024: Workshop date
For updates, please follow the present Web page.
PRACTICAL INFORMATION
Workshop registration is via the main conference registration site.
The workshop proceedings will be published in the ACL Anthology.
SUBMISSION GUIDELINES
Please follow the style sheet and templates (for LaTeX, Overleaf, Open Office, and MS-Word) provided for the main conference.
Papers should be submitted as a PDF file using the START conference manager.
Submissions must describe original, unpublished work and be 4 to 8 pages long, plus unlimited references. Camera-ready final versions may use one more page than the initial submission to take the reviewers' comments into account.
Reviewing will be double blind, so the papers should not reveal the authors' identity. Accepted papers will be published in the workshop proceedings, which will be included in the ACL Anthology.
Double submission policy: Parallel submission to other meetings or publications is possible, but the workshop organizers must be notified by e-mail immediately (i.e. as soon as it is known to the authors).
Presentation slides
Due to the size of the conference rooms, we recommend using 36 pt fonts for presentation slides.
Posters
The poster holders are 90 cm x 150 cm, and the format is vertical (portrait). The poster boards cannot accommodate landscape posters. You can print your poster in portrait A0 (84.1 x 118.9 cm).
For further information and updates see the present Web page.