[Apologies for cross-posting]
Terminology Translation Task at WMT2025 - Call for Participation
We are excited to announce the third Shared Task on Terminology Translation (https://www2.statmt.org/wmt25/terminology.html), which will be run within the 10th Conference on Machine Translation (WMT2025) in Suzhou, China.
TL;DR:
- We test sentence-level and document-level translation of texts in the finance and IT domains, given explicit terminology.
- The language pairs are: English -> {Spanish, German, Russian, Chinese}, Chinese -> English.
- We evaluate the overall translation quality, the terminology success rate, and terminology consistency. Additionally, we compare system performance under three conditions: no terms provided, proper terminology, and random terms.
- The task starts on 20th June 2025 AoE; the submission deadline is 20th July 2025 AoE.
- Please pre-register via Google Forms here: https://forms.gle/ZSn2pNJkQJAzHFnA6 .
OVERVIEW
Advances in neural MT and LLM-assisted translation over the last decade have brought general-domain translation close to human quality, at least for high-resource languages. However, in specialized domains such as science, finance, or legal texts, where the correct and consistent use of special terms is crucial, the task is far from solved. The Terminology Shared Task aims to assess the extent to which machine translation models can make use of additional information about how terms should be translated. Compared to the two previous editions, in 2021 and 2023, the new test data contain more varied test cases, are more consistent in domain within each translation direction, and cover more languages.
TASK DESCRIPTION
Track №1: Sentence/Paragraph-Level Translation
You will be provided with a sequence of input sentences or paragraphs, each accompanied by a small terminology dictionary that covers only the terms present in that input.
Language Pairs:
* en-de (English → German)
* en-ru (English → Russian)
* en-es (English → Spanish)
Domain: information technology
Track №2: Document-Level Translation
The setup is similar to Track №1, with two exceptions: each input text is now a whole document, and the dictionaries cover the whole set of input texts (i.e. they are corpus-level). This brings the task closer to a real-life setup, where dictionaries exist independently of the texts, but it may complicate the implementation, since solutions that need to store the whole dictionary will use more memory. Additionally, in the whole-document setup, consistent usage of terms becomes more important.
Language Pairs:
* en-zh-Hant (English → Traditional Chinese)
* zh-Hant-en (Traditional Chinese → English)
Domain: finance
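One way to cope with corpus-level dictionaries in Track №2 is to narrow the dictionary down to the entries that actually occur in a given document before passing it to the system. The following is only a minimal sketch under stated assumptions: the function name `filter_terms_for_document` is hypothetical, and the case-insensitive substring matching is a simplification that ignores inflected forms and tokenization. It is not part of the official task tooling.

```python
def filter_terms_for_document(document, corpus_terms):
    """Return the dictionary entries whose source term occurs in the document.

    document: the source-language document as a single string.
    corpus_terms: a corpus-level dictionary {source_term: target_term}.
    Matching is a naive case-insensitive substring test (an assumption);
    it will miss inflected or otherwise modified forms of a term.
    """
    lowered = document.lower()
    return {
        source_term: target_term
        for source_term, target_term in corpus_terms.items()
        if source_term.lower() in lowered
    }


# Illustrative data only, not taken from the task data sets.
example_terms = {
    "interest rate": "利率",
    "hedge fund": "對沖基金",
    "central bank": "中央銀行",
}
example_document = "The central bank raised the interest rate."
document_terms = filter_terms_for_document(example_document, example_terms)
```

Here `document_terms` keeps the entries for "interest rate" and "central bank" and drops "hedge fund", which does not occur in the example document.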
EVALUATION
Terminology Modes: You are expected to compare your system’s performance under three modes:
1. No terminology: the system is only provided with input sentences/documents.
2. Proper terminology: the system is provided with input texts (same as 1.) and dictionaries of the format {source_term: target_term}.
3. Random terminology: the system is provided with input texts and translation dictionaries of the same format as in 2. The difference is that the dictionary items are not special terms but words randomly drawn from input texts. This mode is of special interest since we want to measure to what extent the proper term translations help to improve the system performance (2.), as opposed to an arbitrary broader input that does not contain the domain-specific terminology.
Metrics:
1. Overall Translation Quality: we will evaluate the general aspects of machine translation outputs such as fluency, adequacy and grammaticality. We will do that with the general MT automatic metrics such as BLEU or COMET. In addition to that, we will pay special attention to the grammaticality of the translated terms.
2. Terminology Success Rate: This metric assesses the ability of the system to accurately translate technical terms given the specialized vocabulary. This will be carried out by comparing the occurrences of the correct term translations (i.e. the ones present in the dictionary) to the output terms. The goal is to have a higher success rate that will show adherence to dictionary translations.
3. Terminology Consistency: for domains such as science or legal texts, the consistent use of an introduced term throughout the text is crucial. In other words, we want a system to not only pick up a correct term in a target language but to use it consistently once it is chosen. This will be evaluated by comparing all translations of a given source term in a text and measuring the percentage of deviations from the most consistent translation. This metric is more important for the Document-Level track, but it will be used for both tracks.
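To make the two terminology metrics more concrete, here is a rough sketch of how a participant might approximate them locally. It is not the official scoring script. The function names are hypothetical, matching is a naive case-insensitive substring test, and "most consistent translation" is interpreted here as the most frequent translation of a source term within a text. All of these are assumptions.

```python
from collections import Counter


def terminology_success_rate(outputs, term_dicts):
    """Fraction of dictionary entries whose target term appears in the output.

    outputs: list of translated strings.
    term_dicts: list of {source_term: target_term} dicts, one per output.
    Uses a naive case-insensitive substring match (an assumption).
    """
    hits = 0
    total = 0
    for output, terms in zip(outputs, term_dicts):
        lowered = output.lower()
        for target_term in terms.values():
            total += 1
            if target_term.lower() in lowered:
                hits += 1
    return hits / total if total else 0.0


def consistency_deviation(observed_translations):
    """Share of occurrences that differ from the most frequent translation.

    observed_translations: the target-side renderings observed for a single
    source term within one text, in any order.
    """
    if not observed_translations:
        return 0.0
    counts = Counter(observed_translations)
    most_frequent_count = counts.most_common(1)[0][1]
    return 1 - most_frequent_count / len(observed_translations)


# Illustrative data only, not taken from the task data sets.
example_outputs = ["Der Server startet neu.", "Die Firewall blockiert den Port."]
example_dicts = [
    {"server": "Server"},
    {"firewall": "Firewall", "port": "Anschluss"},
]
success = terminology_success_rate(example_outputs, example_dicts)
deviation = consistency_deviation(["Anschluss", "Anschluss", "Port", "Anschluss"])
```

In this toy example two of the three dictionary targets appear in the outputs, and one of the four renderings of the source term deviates from the most frequent one.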
IMPORTANT DATES
All dates are end of day, Anywhere on Earth (AoE).
* Data snippets released: 7th May 2025
* Dev data released: 22nd May 2025
* Test data release, task starts: 20th June 2025 (postponed)
* Submission deadline: 20th July 2025 (postponed)
* Paper submission to WMT25: in-line with WMT25
* Camera-ready submission to WMT25: in-line with WMT25
* Conference in Suzhou, China: 05-09 November 2025
SUBMISSION GUIDELINES
0. Please notify us about your participation prior to submission. This is optional, but it will help us estimate our workload after submission. Please do it through this Google Form: https://forms.gle/ZSn2pNJkQJAzHFnA6
1. Check your submission files with the validation script. It will be published when the test data are released.
2. Write a description of your system (optional).
3. Submit your system via Google Forms. The Google form with all necessary submission details will be published when the test set is released.
All details on submission, as well as the FAQ, can be found on the webpage of the shared task.
ORGANIZERS
* Kirill Semenov (University of Zurich), main contact: FirstNаmе [dоt] LаstNаmе {аt} uzh /dоt/ ch
* Nathaniel Berger (Heidelberg University)
* Pinzhen Chen (University of Edinburgh & Aveni.ai)
* Xu Huang (Nanjing University)
* Arturo Oncevay (JP Morgan)
* Dawei Zhu (Amazon)
* Vilém Zouhar (ETH Zurich)
WEBSITE: https://www2.statmt.org/wmt25/terminology.html
If you have any questions, please send an email to Kirill Semenov (see email above).