BioCreative IX Challenge and Workshop CFP
Large Language Models for Clinical and Biomedical NLP at IJCAI

Where, When: The BioCreative IX workshop (https://www.ncbi.nlm.nih.gov/research/bionlp/biocreative9) will run with IJCAI 2025 (https://2025.ijcai.org/), August 16-22, 2025, in Montreal, Canada.

BioCreative IX: The 9th BioCreative workshop seeks to attract researchers interested in developing and evaluating automatic methods for extracting medically relevant information from clinical data, and aims to bring together the medical NLP community and healthcare researchers and practitioners. The challenge tracks explore MedHopQA, a dataset for benchmarking LLM-based reasoning systems with disease-centered question-answer pairs; ToxHabits, a task on extracting information related to substance use and abuse from Spanish clinical content; and sentence segmentation of real clinical notes from MIMIC-III. We will also feature paper submissions on relevant topics and poster/tool demonstrations.

Important Dates
* March - April: Team registration
* May 12, 2025: Testing predictions, evaluation results
* May 19, 2025: Deadline for submission of participant papers
* Jun 06, 2025: Notification of accepted papers
* Aug 16 - Aug 22, 2025: IJCAI 2025
Workshop Proceedings and Special Issue: The BioCreative IX Proceedings will host all submissions from participating teams and will be freely available by the time of the workshop. In addition, selected papers will be invited to a BioCreative IX journal special issue, subject to the journal's peer-review process. More details and submission instructions will be posted in June.
Participation: Teams can participate in one or more of these tracks. Team registration will continue until April 30th, when final commitment is requested. To register a team, go to the Registration Form (https://forms.gle/xbQp158cn5pgJ1oj9). If you have restrictions on accessing Google Forms, please send an e-mail to BiocreativeChallenge@gmail.com.
Call for Papers
We welcome submissions describing research on topics similar to the three challenges, as well as:
* Development of benchmarking datasets for clinical NLP
* Creating and evaluating synthetic data using LLMs and its impact on downstream tasks
* Creative use of data augmentation for increasing tool accuracy and trustworthiness
* Use of LLMs to streamline annotation tasks
* NLP systems capable of identifying entities in multilingual corpora
* NLP systems capable of semantic interoperability across different terminologies/ontologies for efficient data curation
* Integrating ontologies and knowledge bases for factual LLM production
* Annotated corpora and other resources for health care and biomedical data modelling

All submissions will be considered for poster presentations and tool demonstrations at the workshop.
BioCreative IX Tracks:

Track 1: MedHopQA
Large language models (LLMs) are commonly evaluated on their ability to answer questions in various domains, and it has become clear that robust QA datasets are critical for properly evaluating LLMs before they are deployed in real-world biomedical or healthcare-related applications. This track aims to advance the development of LLM-based systems capable of answering questions that involve multi-step reasoning. We have created a resource of 1,000 question-answer pairs, focusing on diseases, genes, and chemicals and mostly pertaining to rare diseases, based on public information in Wikipedia. Participants are encouraged to use any training data they wish to design and develop NLP systems or agents that understand asserted information on genes, diseases, chemicals, etc., and can answer multi-step reasoning questions involving such information. This track builds on previous successes in biomedical QA benchmarking (e.g., PubMedQA, BioASQ, and MedQA) but differs from them in that MedHopQA requires a multi-step reasoning process to find the correct answer.

Track 3: ToxHabits
There is a pressing need to extract information related to substance use and abuse from clinical content more systematically, covering not only smoking and alcohol abuse but also other harmful drugs and substances. These toxic habits have a considerable health impact on a variety of medical conditions and also affect the action of prescribed medications. To make such information actionable, it is critical not only to detect instances of consumption but also to characterize certain aspects of it, such as duration or mode of administration. Some initial efforts have been made to automatically detect social determinants of health, including smoking status, in English content, but very limited efforts have been made for content in other languages.
Therefore, we propose the ToxHabits track to address the automatic extraction of substance use and abuse information from clinical cases in Spanish. This task will consist of three subtasks: (a) toxic habit mention recognition, (b) detection of relevant clinical modifiers related to substance abuse, and (c) a toxic habit condition QA challenge.

Track 2: Sentence segmentation of real-life clinical notes
Sentence segmentation is a fundamental linguistic task and is widely used as a pre-processing step in many NLP tasks. Although the development of LLMs and the sparse attention mechanism in transformer networks have reduced the need for sentence-level inputs in some NLP tasks, many models are designed and tested only for shorter sequences. The need for sentence segmentation is particularly pronounced in clinical notes, as most clinical NLP tasks depend on this information for annotation and model training. In this shared task, we challenge participants to detect sentence boundaries (spans) in MIMIC-III clinical notes, where fragmented and incomplete sentences, complex graphemic devices (e.g., abbreviations and acronyms), and markup are common. To encourage generalizability to multi-domain texts, participants will receive annotated texts from newswire articles and biomedical literature, in addition to clinical notes, for model development and evaluation.

Organizing Committee
* Dr. Rezarta Islamaj, National Library of Medicine * Dr. Graciela Gonzalez-Hernandez, Cedars-Sinai Medical Center * Dr. Martin Krallinger, Barcelona Supercomputing Center * Dr. Zhiyong Lu, National Library of Medicine
----------------------------------------------------------
Rezarta Islamaj
National Library of Medicine
Rezarta.Islamaj@nih.gov