Call for Participation
BEA 2023 Shared Task: Generating AI Teacher Responses in Educational Dialogues
https://sig-edu.org/sharedtask/2023
*SHARED TASK DESCRIPTION*

Conversational agents offer promising opportunities for education. They can fulfill various roles (e.g., intelligent tutors and service-oriented assistants) and pursue different objectives (e.g., improving student skills and increasing instructional efficiency) (Wollny et al. 2021). Among these different vocations of an educational chatbot, the most prevalent one is the *AI teacher*, which helps a student improve their skills and provides more opportunities to practice. Recent meta-analyses have even reported a significant effect of chatbots on skill improvement, for example in language learning (Bibauw et al. 2022). What is more, current advances in AI and natural language processing have led to the development of conversational agents founded on more powerful generative language models.
Despite these promising opportunities, the use of powerful generative models as a foundation for downstream tasks also presents several crucial challenges. In the educational domain in particular, it is important to ascertain whether that foundation is solid or flimsy. Bommasani et al. (2021: pp. 67-72) stressed that, if we want to put these models into practice as AI teachers, it is imperative to determine whether they can (a) speak to students like a teacher, (b) understand students, and (c) help students improve their understanding. Therefore, Tack and Piech (2022) formulated the *AI teacher test challenge*: How can we test whether state-of-the-art generative models are good AI teachers, capable of replying to a student in an educational dialogue?
Following the AI teacher test challenge, we organize a *first shared task on the generation of teacher language in educational dialogues*. The goal of the task is to use NLP and AI methods to generate teacher responses in real-world samples of teacher-student interactions. These samples are taken from the Teacher Student Chatroom Corpus (Caines et al. 2020; Caines et al. 2022). Each training sample is composed of a dialogue context (i.e., several teacher-student utterances) as well as the teacher’s response. For each test sample, participants are asked to submit their best generated teacher response.
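To make the setup concrete, here is a minimal sketch of what such a sample might look like; the field names and structure below are illustrative assumptions, not the official release format of the TSCC data.

```python
# Hypothetical illustration of a dialogue sample; the actual TSCC release
# format is defined by the organizers and may differ.
sample = {
    "context": [
        {"speaker": "teacher", "text": "What did you do over the weekend?"},
        {"speaker": "student", "text": "I go to the cinema with my friend."},
    ],
    # Present in training samples; to be generated for test samples.
    "response": "Nice! Just a small correction: 'I went to the cinema.'",
}

def format_context(sample):
    """Flatten the dialogue context into a single prompt string."""
    return "\n".join(f"{turn['speaker']}: {turn['text']}" for turn in sample["context"])

print(format_context(sample))
```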
The purpose of the task is to *benchmark the ability of generative models to act as AI teachers, replying to a student in a teacher-student dialogue*. Submissions will be ranked according to several automated dialogue evaluation metrics, with the top submissions selected for further human evaluation. During this manual evaluation, human raters will compare a pair of teacher responses in terms of three abilities: can speak like a teacher, can understand a student, can help a student (Tack & Piech 2022). As such, we adopt an evaluation method that is akin to ACUTE-Eval for evaluating dialogue systems (Li et al. 2019).
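As an illustration only (the shared task defines its own metric suite), the sketch below scores a candidate response against the gold teacher reply with sentence-level BLEU via the sacrebleu library; the actual competition ranking may rely on different automated dialogue metrics.

```python
# Illustrative only: BLEU is shown here as one familiar reference-based
# metric, not as the shared task's official evaluation method.
import sacrebleu  # pip install sacrebleu

reference = "Nice! Just a small correction: 'I went to the cinema.'"
candidate = "Good try! Remember the past tense: 'I went to the cinema.'"

# sentence_bleu takes a hypothesis string and a list of reference strings.
score = sacrebleu.sentence_bleu(candidate, [reference])
print(f"BLEU: {score.score:.1f}")  # higher means more n-gram overlap with the gold reply
```

Reference-based overlap metrics are only a coarse proxy for teacher quality, which is why the top-ranked submissions are additionally judged by human raters.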
*PARTICIPATION*

The shared task is hosted on *CodaLab* (Pavao et al. 2022). Anyone participating in the shared task will be asked to:
1. Register on the CodaLab platform: https://codalab.lisn.upsaclay.fr/
2. Fill in the registration form (https://forms.gle/iAdKCq3dRS9srzjc6) with your CodaLab ID. Participants must comply with the terms and conditions of the task and the TSCC data outlined in the form.
3. Register for the CodaLab competition (https://codalab.lisn.upsaclay.fr/competitions/11705) using your CodaLab ID. We will only accept participants who have submitted the registration form. Note that you may participate as a member of one team only.
*IMPORTANT DATES*
*Fri Mar 24, 2023* Training data release
*Mon May 1, 2023* Test data release
*Fri May 5, 2023* Final submissions due
*Mon May 8, 2023* Results announced
*Fri May 12, 2023* Human evaluation results announced
*Mon May 22, 2023* System papers due
*Fri May 26, 2023* Paper reviews returned
*Tue May 30, 2023* Camera-ready papers due
*Mon June 12, 2023* Pre-recorded video due
*Thu July 13, 2023* BEA Workshop at ACL
*ORGANIZERS*
Anaïs Tack, KU Leuven; Ekaterina Kochmar, MBZUAI; Zheng Yuan, King’s College London; Serge Bibauw, Universidad Central del Ecuador; Chris Piech, Stanford University
*Webpage*: https://sig-edu.org/sharedtask/2023