Dear Colleagues,
We are pleased to inform you that we will be hosting the "Shared Task: Low-Resource Indic Language Translation" again this year as part of WMT 2024. Following the outstanding success and enthusiastic participation witnessed in the previous year's edition, we are excited to continue this important initiative. Despite recent advancements in machine translation (MT), such as multilingual translation and transfer learning techniques, the scarcity of parallel data remains a significant challenge, particularly for low-resource languages.
The WMT 2024 Indic Machine Translation Shared Task aims to address this challenge by focusing on low-resource Indic languages from diverse language families. Specifically, we are targeting languages such as Assamese, Mizo, Khasi, Manipuri, Nyishi, Bodo, Mising, and Kokborok.
The task for this year features two categories based on the availability of training data:
*Category 1: Moderate Training Data Available*
- en-as: English ⇔ Assamese
- en-lus: English ⇔ Mizo
- en-kha: English ⇔ Khasi
- en-mni: English ⇔ Manipuri
- en-nshi: English ⇔ Nyishi
*Category 2: Very Limited Training Data*
- en-bodo: English ⇔ Bodo
- en-mrp: English ⇔ Mising
- en-trp: English ⇔ Kokborok
Participants are encouraged to develop MT systems that can produce high-quality translations despite limited data availability. Key areas for exploration include leveraging monolingual data, investigating multilingual approaches, and exploring transfer learning techniques.
*Important Dates:*
- Release of training/dev data: 25 May, 2024
- Test data release: 13 July, 2024
- Run submission deadline: 28 July, 2024
- System description/workshop paper submission deadline: TBA, 2024 (follow EMNLP/WMT page)
- Notification of acceptance: TBA, 2024 (follow EMNLP/WMT page)
- Camera-ready: TBA, 2024 (follow EMNLP/WMT page)
- Workshop dates: follow EMNLP/WMT main page
The organizing committee comprises experts from various institutions dedicated to advancing MT research in low-resource language settings.
*Organizers:*
- Santanu Pal, Wipro AI Lab, London, UK
- Partha Pakray, National Institute of Technology, Silchar, India
- Sandeep Kumar Dash, National Institute of Technology, Mizoram, India
- Lenin Laitonjam, National Institute of Technology, Mizoram, India
- Pankaj Kundan Dadure, University of Petroleum and Energy Studies, Dehradun, India
- Arnab Maji, North-Eastern Hill University, India
- Lyngdoh Sarah, North-Eastern Hill University, India
- Anupam Jamatia, National Institute of Technology Agartala, India
- Koj Sambyo, National Institute of Technology Arunachal Pradesh, India
For inquiries and further information, please contact us at lrilt.wmt24@gmail.com. Additionally, you can find more details and updates on the task through the following link: https://www2.statmt.org/wmt24/indic-mt-task.html
To register for the event, please fill out the registration form available here: https://docs.google.com/forms/d/e/1FAIpQLSd8LwriqdLLhVNAvUWEcGRJmKuBFQZ9BR_TKpb6VYZEnyGU0g/viewform?pli=1
We look forward to your participation and contributions to advancing low-resource Indic language translation.
Best regards,
Team Low-Resource Indic Language Translation
WMT 2024