[Corpora-List] CFP AuTexTification 🤖 vs 👩🏻: Automatic Text Identification shared task at IberLEF 2023

21 Mar 2023


Hi Luis,
How are you?
I just saw on Corpora-List that you are deep into chatbot topics.
Maybe the info has already reached you: we are organizing a task that
may be of interest to you.
Hope you'll take part ;-)
Regards,
Paolo
-----
*Apologies for cross-posting*
Do you believe machine-generated text is becoming an issue? Are you
interested in boosting research to automatically detect
machine-generated text? 🤖👩🏻
We cordially invite researchers and practitioners from all fields
to participate in the AuTexTification task. If you are interested,
register for the shared task through this link: https://lnkd.in/dzBZsYiD
Once you have registered and the training phase has started, the
datasets will be sent to your email along with a password. For more
information on the task description, schedule, and submissions, see the
AuTexTification web page: https://sites.google.com/view/autextification
More information on the shared task
The new era of automatic content generation has been driven by
powerful causal language models such as GPT, PaLM, or BLOOM, which can
be used to spread untruthful news, human-looking reviews, or opinions.
Thus, it is imperative to develop technology to automatically detect
generated text for content moderation and to attribute generated text
to specific models, both to protect intellectual property and to
assign responsibility. In this context, we propose the “Automatic Text
Identification” (AuTexTification) shared task to boost research and
development of systems that detect automatically generated text,
produced by state-of-the-art language models, in English and
Spanish.
We propose two subtasks: (i) Human or Generated, where, given a
text, participants will have to determine whether it has been
automatically generated or not; and (ii) Model Attribution, where
participants will have to determine which model generated a text. The
models used to generate the texts vary in size, ranging from 2 to
175 billion parameters, meaning that participants' systems should be
versatile enough to detect a diverse set of text generation models
and writing styles.
In the training phase, participants will be provided with two  
partitions for subtask 1, i.e., English and Spanish partitions, with  
binary labels 👩🏻 and 🤖. Similarly, a partition per language will be  
released for subtask 2. It will include six labels (A, B, C, D, E, and  
F), each label representing a text generation model. Later, the  
unlabeled test data will be released.
Important Dates
March 22, 2023: Release of training data
April 21, 2023: Release of test data
May 10, 2023: Participant system results submission
May 17, 2023: Results notification
June 3, 2023: Paper submission
June 16, 2023: Paper peer reviews completed
July 4, 2023: Camera-ready paper version
September 26, 2023: Conference
Task organizers
José Ángel González (Symanto) Contact Email: jose.gonzalez@symanto.com
Areg Sarvazyan (Symanto) Contact Email: areg.sarvazyan@symanto.com
Marc Franco-Salvador (Symanto)
Francisco Rangel (Symanto)
Berta Chulvi (Universitat Politècnica de València)
Paolo Rosso (Universitat Politècnica de València)
Please reach out to the organizers or join the Slack workspace to  
connect with the other participants and organizers:  
https://lnkd.in/di_zaMHf
