Event Notification Type: Call for Participation
Website: https://sites.google.com/view/autextification https://sites.google.com/view/iberautextification
We kindly invite you to participate in the IberLEF 2024 shared task -
Iber AuTexTification
Automated Text Identification on Languages of the Iberian Peninsula
This shared task will take place as part of IberLEF 2024 https://sites.google.com/view/iberlef-2024/tasks, the 6th Workshop on Iberian Languages Evaluation Forum at the SEPLN 2024 Conference, which will be held in Valladolid, Spain, on the 26th of September, 2024.
This is the second edition of the AuTexTification shared task, first held at IberLEF 2023 (Sarvazyan et al., 2023). We extend the previous task along three dimensions: more models, more domains, and more languages from the Iberian Peninsula (in a multilingual fashion), aiming to build more generalizable detectors and attributors. In this task, participants must develop models that exploit clues about linguistic form and meaning to identify automatically generated texts from a wide variety of models, domains, and languages. We plan to include LLMs such as GPT-3.5, GPT-4, LLaMA, Coral, Command, Falcon, and MPT, among others; to add new domains such as essays and dialogues; and to cover the most prominent languages of the Iberian Peninsula: Spanish, Catalan, Basque, Galician, Portuguese, and English (in Gibraltar). We propose two subtasks:
- Subtask 1 (Human or Generated): Participants will be provided a text, and they will have to determine whether the text has been automatically generated or not. We encourage participants to develop models that generalize to new LLMs, writing styles, and domains.
- Subtask 2 (Model Attribution): Participants will be provided an automatically generated text, and they will have to determine what LLM generated it.
The novelty of this edition is to detect, in a multilingual (languages from the Iberian Peninsula: Spanish, English, Catalan, Galician, Basque, and Portuguese), multi-domain (news, reviews, essays, dialogues, Wikipedia, wikiHow, tweets, emails, etc.), and multi-model (GPT, LLaMA, Mistral, Cohere, Anthropic, MPT, Falcon, etc.) setup, whether a text has been automatically generated and, if so, to identify the model that generated it. The datasets of this edition are built using TextMachina https://github.com/Genaios/TextMachina, a Python framework that aids the creation of high-quality, unbiased datasets to build robust models for machine-generated text (MGT) tasks such as detection, attribution, boundary detection, and mix-case detection.
To foster engagement and reward dedication, the best participant in each subtask will receive a 500€ prize sponsored by Genaios https://genaios.ai/.
Important Links
- Task Website https://sites.google.com/view/iberautextification
- GitHub Repository https://github.com/Genaios/IberAuTexTification
- Slack workspace https://join.slack.com/t/iberautextification/shared_invite/zt-2c28ezgwy-lHHM6ASHnqLY2YQ8mlPgdQ
- Google Groups https://groups.google.com/g/iberautextification
- Registration https://sites.google.com/view/iberautextification/data
Important Dates
- March 22, 2024: Release of training data
- April 21, 2024: Release of test data
- May 10, 2024: Participant system results submission
- May 17, 2024: Results notification
- June 3, 2024: Paper submission
- June 16, 2024: Paper peer-reviewed
- July 4, 2024: Camera-ready paper version
Task organizers
- José Ángel González (Genaios https://genaios.ai/) Contact Email: jose.gonzalez@genaios.ai
- Areg Sarvazyan (Genaios https://genaios.ai/) Contact Email: areg.sarvazyan@genaios.ai
- Marc Franco-Salvador (Genaios https://genaios.ai/)
- Francisco Rangel (Genaios https://genaios.ai/)
- Paolo Rosso (Universitat Politècnica de València https://www.upv.es/)
Please reach out to the organizers at organizers.autextification@gmail.com or join the Slack workspace https://join.slack.com/t/iberautextification/shared_invite/zt-2c28ezgwy-lHHM6ASHnqLY2YQ8mlPgdQ to connect with the other participants and organizers.