We are excited to announce the 2nd edition of the Open Language Data Initiative shared task at WMT25, co-located with EMNLP 2025.
**TASK DESCRIPTION**
The primary goal of this shared task is to expand OLDI’s open datasets to more languages. We are soliciting contributions to the following:
- The MT evaluation dataset FLORES+.
- The MT Seed dataset.
- Other high-quality, massively-parallel and open-source datasets.
Contributions may consist of either the addition of entirely new languages, varieties or dialects to the above datasets, or substantial improvements to existing datasets. To describe and publicise their contributions, task participants will be asked to submit a 4-6 page paper to be presented at the WMT 2025 conference.
**IMPORTANT DATES**
All dates follow the WMT/EMNLP schedule.
- Paper and data submission deadline: 14 August
- Notification of acceptance: 13 September
**MORE INFORMATION**
- Shared task website: https://www2.statmt.org/wmt25/open-data.html
- OLDI website: https://oldi.org/