Call for participation: MultiLexNorm 2: Multilingual Lexical Normalization - Corpora

14 Jan 2025


Dear all,
Today, the data freeze of the MultiLexNorm 2 shared task is in effect.
As defined in the previous iteration of the task, lexical normalization is:
The task of transforming an utterance into its standard form, word by word,
including both one-to-many (1-n) and many-to-one (n-1) replacements.
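To make the 1-n and n-1 cases concrete, here is a minimal sketch in Python. It is an illustration only: the English tokens, the LEXICON and MERGES tables, and the normalize function are hypothetical and are not taken from the shared task data or its baseline systems.

```python
def normalize(tokens, lexicon, merges):
    """Normalize a list of tokens left to right.

    lexicon: maps one noisy token to its standard form, which may
             contain several words (the 1-n case).
    merges:  maps a tuple of consecutive noisy tokens to one standard
             word (the n-1 case).
    Both tables are hypothetical toy resources for this sketch.
    """
    out = []
    i = 0
    max_len = max((len(key) for key in merges), default=1)
    while i < len(tokens):
        # Try the longest multi-token merge first (n-1 replacement).
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            span = tuple(tokens[i:i + n])
            if span in merges:
                out.append(merges[span])
                i += n
                break
        else:
            # No merge applied: look up the single token. The result may
            # expand to several words (1-n) or stay unchanged.
            out.extend(lexicon.get(tokens[i], tokens[i]).split())
            i += 1
    return out


# Toy English resources, assumed for illustration only.
LEXICON = {"gonna": "going to", "u": "you", "2morrow": "tomorrow"}
MERGES = {("some", "one"): "someone"}

print(normalize("some one is gonna call u 2morrow".split(), LEXICON, MERGES))
```

Here "gonna" expands to two words (1-n) and "some one" collapses to a single word (n-1). A dictionary lookup like this ignores context, so it only shows the input/output shape of the task, not a competitive approach.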
This time, the focus is on non-Indo-European languages. We have managed
to obtain (new) datasets for: Thai, Vietnamese, Indonesian, Japanese, and
Korean.
More information can be found at: https://noisy-text.github.io/2025/multi-lexnorm.html#
Deadlines:
Data available: Nov 15, 2024
Data freeze: Jan 14, 2025
Test data: Jan 25, 2025
Final Evaluation: Feb 07, 2025
Paper deadline: Feb 25, 2025
Paper reviewed: Mar 01, 2025
Camera ready: Mar 10, 2025
Workshop: May 03, 2025 (TBD)
Best, 
The organizers