[Iber AuTexTification@IberLEF2024] Call for Participation - Corpora

3 Mar 2024


Event Notification Type: Call for Participation
Website: https://sites.google.com/view/autextification
https://sites.google.com/view/iberautextification
We kindly invite you to participate in the IberLEF 2024 shared task -
Iber AuTexTification
Automated Text Identification on Languages of the Iberian Peninsula
This shared task will take place as part of IberLEF 2024
https://sites.google.com/view/iberlef-2024/tasks, the 6th Workshop on
Iberian Languages Evaluation Forum at the SEPLN 2024 Conference, which will
be held in Valladolid, Spain, on the 26th of September, 2024.
This is the second edition of the AuTexTification shared task, first held
at IberLEF 2023 (Sarvazyan et al., 2023). We extend the previous task in
three dimensions: more models, more domains, and more languages from the
Iberian Peninsula (in a multilingual fashion), aiming to build more
generalizable detectors and attributors. In this task, participants must
develop models that exploit clues about linguistic form and meaning to
identify automatically generated texts from a wide variety of models,
domains, and languages. We plan to include LLMs such as GPT-3.5, GPT-4,
LLaMA, Coral, Command, Falcon, and MPT, among others; to add new domains
such as essays and dialogues; and to cover the most prominent languages
from the Iberian Peninsula: Spanish, Catalan, Basque, Galician, Portuguese,
and English (in Gibraltar). We propose two different subtasks:
- Subtask 1 (Human or Generated): Participants will be provided a text,
   and they will have to determine whether the text has been automatically
   generated or not. We encourage participants to develop models that
   generalize to new LLMs, writing styles, and domains.
- Subtask 2 (Model Attribution): Participants will be provided an
   automatically generated text, and they will have to determine what LLM
   generated it.
The novelty of this edition is its multilingual (languages from the Iberian
Peninsula such as Spanish, English, Catalan, Galician, Basque, and
Portuguese), multi-domain (news, reviews, essays, dialogues, Wikipedia,
wikiHow, tweets, emails, etc.), and multi-model (GPT, LLaMA, Mistral,
Cohere, Anthropic, MPT, Falcon, etc.) setup, in which participants must
detect whether a text has been automatically generated and, if so, identify
the model that generated it. The datasets of this edition are built using TextMachina
https://github.com/Genaios/TextMachina, a Python framework that aids the
creation of high-quality, unbiased datasets to build robust models for
machine-generated text (MGT) tasks such as detection, attribution, boundary
detection, and mix-case detection.
To foster engagement and reward dedication, we will award the best
participant in each subtask with 500€ sponsored by Genaios
https://genaios.ai/.
Important Links
- Task Website https://sites.google.com/view/iberautextification
- GitHub Repository https://github.com/Genaios/IberAuTexTification
- Slack workspace
   https://join.slack.com/t/iberautextification/shared_invite/zt-2c28ezgwy-lHHM6ASHnqLY2YQ8mlPgdQ
- Google Groups https://groups.google.com/g/iberautextification
- Registration https://sites.google.com/view/iberautextification/data
Important Dates
- March 22, 2024: Release of training data
- April 21, 2024: Release of test data
- May 10, 2024: Participant system results submission
- May 17, 2024: Results notification
- June 3, 2024: Paper submission
- June 16, 2024: Paper peer review
- July 4, 2024: Camera-ready paper version
Task organizers
- José Ángel González (Genaios https://genaios.ai/) Contact Email:
   jose.gonzalez@genaios.ai
- Areg Sarvazyan (Genaios https://genaios.ai/) Contact Email:
   areg.sarvazyan@genaios.ai
- Marc Franco-Salvador (Genaios https://genaios.ai/)
- Francisco Rangel (Genaios https://genaios.ai/)
- Paolo Rosso (Universitat Politècnica de València https://www.upv.es/)
Please reach out to the organizers at organizers.autextification@gmail.com
or join the Slack workspace
https://join.slack.com/t/iberautextification/shared_invite/zt-2c28ezgwy-lHHM6ASHnqLY2YQ8mlPgdQ
to connect with the other participants and organizers.