18th Workshop on Building and Using Comparable Corpora

Program: Monday, 20 Jan 2025

 8:50–9:00 Opening and introduction
 9:00–10:30 Multilingual corpus development
 Bilingual resources for Moroccan Sign Language Generation and Standard Arabic Skills Improvement of Deaf Children
Abdelhadi Soudi, Corinne Vinopol and Kristof Van Laerhoven
 Towards the Creation of a Large Scale Moroccan Sign Language Corpus
Abdelhadi Soudi, Corinne Vinopol and Kristof Van Laerhoven
 Harmonizing Annotation of Turkic Postverbial Constructions: A Comparative Study of UD Treebanks
Arofat Akhundjanova
 10:30–11:00 Coffee break, morning
 11:00–13:00 Multilinguality of Large Language Models
 Towards Truly Open, Language-Specific, Safe, Factual, and Specialized Large Language Models
Preslav Nakov
 Make Satire Boring Again: Reducing Stylistic Bias of Satirical Corpus by Utilizing Generative LLMs
Asli Umay Ozturk, Recep Firat Cekinel and Pinar Karagoz
 BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language
Ehsan Lotfi, Nikolay Banar and Walter Daelemans
 13:00–14:00 Lunch
 14:00–16:00 Diversity of language resources
 Comparable Corpora: Opportunities for New Research Directions
Kenneth Ward Church
 Can a Neural Model Guide Fieldwork? A Case Study on Morphological Data Collection
Aso Mahmudi, Borja Herce, Demian Inostroza Améstica, Andreas Scherbakov, Eduard H. Hovy and Ekaterina Vylomova
 SELEXINI – a large and diverse automatically parsed corpus of French
Manon Scholivet, Agata Savary, Louis Estève, Marie Candito and Carlos Ramisch
 16:00–16:30 Coffee break, afternoon
 16:30–17:30 Machine Translation and Cross-lingual Processing
 Refining Dimensions for Improving Clustering-based Cross-lingual Topic Models
Chia-Hsuan Chang, Tien Yuan Huang, Yi-Hang Tsai, Chia-Ming Chang and San-Yih Hwang
 The Role of Handling Attributive Nouns in Improving Chinese-To-English Machine Translation
Adam Meyers, Rodolfo Joel Zevallos, John E. Ortega and Lisa Wang
 17:30–17:45 Closing remarks
Last modified: 16 Dec 2024