Dear all,
Today, the data freeze of the MultiLexNorm 2 shared task is in effect.
As defined in the previous iteration of the task, lexical normalization is the task of transforming an utterance into its standard form, word by word, including both one-to-many (1-n) and many-to-one (n-1) replacements.
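As a rough illustration of this definition, the sketch below applies a toy lookup table to a tokenized utterance. The lexicon entries, the `normalize` function, and the English example sentence are all hypothetical and chosen for readability only; they are not taken from the shared task data and do not reflect its file format or any baseline system.

```python
# Hypothetical toy lexicon for illustration; not from the MultiLexNorm 2 data.
ONE_TO_ONE = {"u": "you", "tmrw": "tomorrow"}
ONE_TO_MANY = {"gonna": ["going", "to"]}      # 1-n: one word becomes several
MANY_TO_ONE = {("with", "out"): "without"}    # n-1: several words become one


def normalize(tokens):
    """Normalize a list of tokens word by word using the toy lexicon."""
    out = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in MANY_TO_ONE:
            # n-1 replacement: consume two input words, emit one
            out.append(MANY_TO_ONE[pair])
            i += 2
        elif tokens[i] in ONE_TO_MANY:
            # 1-n replacement: consume one input word, emit several
            out.extend(ONE_TO_MANY[tokens[i]])
            i += 1
        else:
            # 1-1 replacement, or keep the word unchanged
            out.append(ONE_TO_ONE.get(tokens[i], tokens[i]))
            i += 1
    return out


print(normalize("u gonna leave with out me tmrw".split()))
# ['you', 'going', 'to', 'leave', 'without', 'me', 'tomorrow']
```

A real system would of course need context to decide when a word should be replaced; a static lookup like this only shows what the three replacement types look like.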
This time, the focus is on non-Indo-European languages. We have managed to obtain (new) datasets for Thai, Vietnamese, Indonesian, Japanese, and Korean.
More information can be found at: https://noisy-text.github.io/2025/multi-lexnorm.html#
Deadlines:
Data available: Nov 15, 2024
Data freeze: Jan 14, 2025
Test data: Jan 25, 2025
Final Evaluation: Feb 07, 2025
Paper deadline: Feb 25, 2025
Paper reviewed: Mar 01, 2025
Camera ready: Mar 10, 2025
Workshop: May 03, 2025 (TBD)
Best,
The organizers