TextDetox CLEF-2024
We are glad to invite you to participate in the first-of-its-kind multilingual Text Detoxification shared task!
https://pan.webis.de/clef24/pan24-web/text-detoxification.html
TL;DR
Task formulation: transfer the style of a text from toxic to neutral (e.g., "what a f**k is this about?" -> "what is this about?")
9 Languages: English, Spanish, Chinese, Hindi, Arabic, German, Russian, Ukrainian, and Amharic
More details:
Identification of toxicity in user texts is an active area of research. Today, social networks such as Facebook and Instagram are trying to address the problem of toxicity. However, they usually just block such texts. We propose a proactive response to user toxicity instead: presenting a neutral version of the user's message that preserves its meaningful content. We call this task text detoxification.
In this competition, we invite you to build detoxification systems for 9 languages from several language families. However, the availability of training corpora will differ between the languages. For English and Russian, parallel corpora of several thousand toxic-detoxified pairs (like the example above) are available, so you can fine-tune text generation models on them. For the other languages, no such corpora will be provided during the dev phase. The main challenge of this competition will be to perform unsupervised and cross-lingual detoxification.
You are very welcome to test any modern LLMs on text detoxification and safety with our data, as well as to experiment with different unsupervised approaches based on masked language models (MLMs) or other paraphrasing methods!
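As a rough illustration of what a very simple unsupervised approach could look like, here is a minimal sketch of a lexicon-based deletion baseline. It is not the organizers' method or an official baseline; the function name and the one-word lexicon are hypothetical, and a real system would need a per-language list of toxic words.

```python
# Hypothetical lexicon for illustration only; a real system would load
# a much larger per-language list of toxic words.
TOXIC_LEXICON = {"f**k"}


def delete_toxic_words(text, lexicon=TOXIC_LEXICON):
    """Drop whitespace-separated tokens whose punctuation-stripped,
    lowercased form appears in the lexicon."""
    kept = []
    for token in text.split():
        # Strip surrounding punctuation so "f**k!" still matches "f**k".
        core = token.strip(".,!?;:\"'()").lower()
        if core not in lexicon:
            kept.append(token)
    return " ".join(kept)


print(delete_toxic_words("what a f**k is this about?"))
# prints: what a is this about?
```

Note that the output keeps the stray "a", so it differs from the reference "what is this about?". Plain deletion removes the toxic word but does not repair fluency, which is one reason paraphrasing or fine-tuned generation models may do better.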
The final leaderboard will be based on a manual evaluation of a subset of the test set, performed via crowdsourcing on the Toloka.ai platform.
Finally, you will have the opportunity to write and present a paper at CLEF 2024 (https://clef2024.imag.fr/), which will take place in Grenoble, France!
Important Dates
February 1, 2024: First data available and run submission opens.
April 22, 2024: Registration closes.
May 6, 2024: Run submission deadline and results out.
May 31, 2024: Participant paper submission.
July 8, 2024: Camera-ready participant paper submission.
September 9-12, 2024: CLEF Conference in Grenoble and Touché Workshop.
Best regards, The CLEF-2024 TextDetox Shared Task Organizers