We are pleased to announce the ClinSkill QA shared task, co-located with the BioNLP workshop at ACL 2026.
TASK HOMEPAGE: https://whunextgen.github.io/ClinicalskillQA/
INTRODUCTION
Multimodal large language models (MLLMs) have the potential to support clinical training and assessment by assisting medical experts in interpreting procedural videos and verifying adherence to standardized workflows. Reliable deployment in these settings requires evidence that models can continuously interpret students' actions during clinical skill assessments, a capability that underpins genuine understanding of clinical skills. Systematically evaluating and improving MLLMs' clinical skill understanding and continuous perception in assessment scenarios is therefore essential for building reliable, high-impact AI systems for medical education. To address this need, this shared task targets medical question answering in clinical skill assessment scenarios.
IMPORTANT DATES
Release of task data: Jan 30, 2026
Paper submission deadline: Apr 17, 2026
Notification of acceptance: May 4, 2026
Camera-ready paper due: May 12, 2026
BioNLP Workshop Date: July 3 or 4, 2026
Note that all deadlines are 23:59:59 AoE (UTC-12).
TASK DEFINITION
ClinSkill QA formulates clinical skill understanding and continuous perception for clinical skill assessment as an ordering task: the MLLM is required to arrange shuffled key frames into a coherent sequence of clinical actions and to provide explanations for the resulting order. The dataset is constructed from video clips of medical student clinical procedures, collected from Zhongnan Hospital of Wuhan University and cofun (http://www.curefun.com/). This study was approved by the Institutional Review Board (IRB), and all data collection and processing followed relevant ethical guidelines.
DATASET
ClinSkill QA is built on 200 sets of shuffled key frames extracted from three types of clinical skill videos. Each set of key frames represents a sequence of continuous actions and is accompanied by expert-annotated ground-truth ordering and order rationales.
EVALUATION
For evaluation, we use Task Accuracy (exact match of the full ordering) and Pairwise Accuracy (the fraction of adjacent pairs correctly ordered) for the ordering results, and BERTScore as well as an LLM-as-judge (G-Eval) for assessing the quality of the ordering explanations.
For each sample (a set of shuffled key frames):

Ordering evaluation:
- Task Accuracy
- Pairwise Accuracy

Rationale evaluation:
- BERTScore
- LLM-as-Judge (G-Eval)
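For concreteness, the two ordering metrics can be sketched as follows. This is an illustrative implementation, not the official scoring script: it assumes predictions and ground truths are given as lists of frame indices, and it reads "adjacent pairs correctly ordered" as the fraction of gold-adjacent frame pairs that appear in the correct relative order in the prediction.

```python
def task_accuracy(preds, golds):
    # Fraction of samples whose predicted ordering matches the gold
    # ordering exactly (all frames in the correct positions).
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def pairwise_accuracy(preds, golds):
    # For each sample, take every adjacent pair (a, b) in the gold
    # ordering and check that a precedes b in the prediction; the
    # sample score is the fraction of such pairs that are correct.
    scores = []
    for pred, gold in zip(preds, golds):
        pos = {frame: i for i, frame in enumerate(pred)}
        pairs = list(zip(gold, gold[1:]))
        correct = sum(pos[a] < pos[b] for a, b in pairs)
        scores.append(correct / len(pairs))
    return sum(scores) / len(scores)
```

For example, a prediction that swaps only the first two of four frames still preserves two of the three gold-adjacent pairs, so it scores 0 on Task Accuracy but 2/3 on Pairwise Accuracy.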
REGISTRATION AND SUBMISSION
Registration and submission will be done via CodaBench (a link will be available soon on the task homepage). Each team is allowed up to ten successful submissions on CodaBench.

All shared task participants are invited to submit a paper describing their systems to the Proceedings of BioNLP 2026 (https://aclweb.org/aclwiki/BioNLP_Workshop) at ACL 2026 (https://2026.aclweb.org/). Papers must follow the submission instructions of the BioNLP 2026 workshop (https://aclweb.org/aclwiki/BioNLP_Workshop).
ORGANIZERS
Xiyang Huang, School of Artificial Intelligence, Wuhan University
Yihuai Xu, School of Artificial Intelligence, Wuhan University
Zhiyuan Chen, School of Artificial Intelligence, Wuhan University
Keying Wu, School of Artificial Intelligence, Wuhan University
Jiayi Xiang, School of Artificial Intelligence, Wuhan University
Buzhou Tang, School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen
Renxiong Wei, Zhongnan Hospital of Wuhan University
Yanqing Ye, Zhongnan Hospital of Wuhan University
Jinyu Chen, Zhongnan Hospital of Wuhan University
Cheng Zeng, School of Artificial Intelligence, Wuhan University
Min Peng, School of Artificial Intelligence, Wuhan University
Qianqian Xie, School of Artificial Intelligence, Wuhan University
Sophia Ananiadou, Department of Computer Science, The University of Manchester