BIONLP 2023 and Shared Tasks @ ACL 2023 https://aclweb.org/aclwiki/BioNLP_Workshop#SHARED_TASKS_2023
WORKSHOP OVERVIEW AND SCOPE
The BioNLP workshop, associated with the ACL SIGBIOMED special interest group, has established itself as the primary venue for presenting foundational research in language processing for the biological and medical domains. The workshop has run every year since 2002 and continues to grow stronger. BioNLP welcomes and encourages work on languages other than English, as well as work on inclusion and diversity. BioNLP truly encompasses the breadth of the domain and brings together researchers in bio- and clinical NLP from all over the world. The workshop will continue presenting work on a broad and interesting range of topics in NLP. Interest in biomedical language has broadened significantly due to the COVID-19 pandemic and continues to grow: as access to information becomes easier and more people generate and access health-related text, it becomes clearer that only language technologies can enable and support adequate use of biomedical text.
BioNLP 2023 will be particularly interested in language processing that supports DEIA (Diversity, Equity, Inclusion and Accessibility). Work on the detection and mitigation of bias and misinformation continues to be of interest. Research on languages other than English, particularly under-represented languages, and on health disparities is always of interest to BioNLP.
Other active areas of research include, but are not limited to:
Tangible results of biomedical language processing applications
Entity identification and normalization (linking) for a broad range of semantic categories
Extraction of complex relations and events
Discourse analysis
Anaphora/coreference resolution
Text mining / Literature based discovery
Summarization
Text simplification
Question Answering
Resources and strategies for system testing and evaluation
Infrastructures and pre-trained language models for biomedical NLP (Processing and annotation platforms)
Development of synthetic data & data augmentation
Translating NLP research into practice
Getting reproducible results
SHARED TASKS 2023
Shared Tasks on Summarization of Clinical Notes and Scientific Articles
The first task focuses on Clinical Text.
Task 1A. Problem List Summarization
Automatically summarizing patients’ main problems from the daily care notes in the electronic health record can help mitigate information and cognitive overload for clinicians and provide augmented intelligence via computerized diagnostic decision support at the bedside. The task of Problem List Summarization aims to generate a list of diagnoses and problems in a patient’s daily care plan using input from the provider’s progress notes during hospitalization. This task aims to promote NLP model development for downstream applications in diagnostic decision support systems that could improve efficiency and reduce diagnostic errors in hospitals. This task will contain 768 hospital daily progress notes and 2783 diagnoses in the training set, and a new set of 300 daily progress notes will be annotated by physicians as the test set. The annotation methods and annotation quality have previously been reported here. The goal of this shared task is to attract future research efforts in building NLP models for real-world decision support applications, where a system generating relevant and accurate diagnoses will assist the healthcare providers’ decision-making process and improve the quality of care for patients.
Shared Task 1A Registration: https://forms.gle/yp6TKD66G8KGpweN9
Please join our Google discussion group for important updates: https://groups.google.com/g/bionlp2023problemsumm
Important Dates:
Registration started: January 13th, 2023
Release of training and validation data: January 13th, 2023
Release of test data: April 13th, 2023
System submission deadline: April 20th, 2023
System papers due date: May 4th, 2023
Notification of acceptance: June 1st, 2023
Camera-ready system papers due: June 13th, 2023
BioNLP Workshop Date: July 13th or 14th, 2023
Task 1A Organizers:
Majid Afshar, Department of Medicine, University of Wisconsin - Madison.
Yanjun Gao, University of Wisconsin - Madison.
Dmitriy Dligach, Department of Computer Science at Loyola University Chicago.
Timothy Miller, Boston Children’s Hospital and Harvard Medical School.
Task 1B. Radiology Report Summarization
Radiology report summarization is a growing area of research. Given the Findings and/or Background sections of a radiology report, the goal is to generate a summary (called an Impression section) that highlights the key observations and conclusions of the radiology study.
The research area of radiology report summarization currently faces an important limitation: most research is carried out on chest X-rays. To mitigate this limitation, we propose two datasets:
A shared summarization task that includes six different modalities and anatomies, totalling 79,779 samples, based on the MIMIC-III database.
A shared summarization task on chest X-ray radiology reports with images and a brand new out-of-domain test set from Stanford.
SEE MORE at: https://vilmedic.app/misc/bionlp23/sharedtask
Task 1B Organizers:
Jean-Benoit Delbrouck, Stanford University.
Maya Varma, Stanford University.
Task 2. Lay Summarization of Biomedical Research Articles
Biomedical publications contain the latest research on prominent health-related topics, ranging from common illnesses to global pandemics. This can often result in their content being of interest to a wide variety of audiences including researchers, medical professionals, journalists, and even members of the public. However, the highly technical and specialist language used within such articles typically makes it difficult for non-expert audiences to understand their contents.
Abstractive summarization models can be used to generate a concise summary of an article, capturing its salient points using words and sentences that aren’t used in the original text. As such, these models have the potential to help broaden access to highly technical documents when trained to generate summaries that are more readable, containing more background information and less technical terminology (i.e., a “lay summary”).
This shared task centers on the abstractive summarization of biomedical research articles, with an emphasis on controllability and catering to non-expert audiences. Through this task, we aim to help foster increased research interest in controllable summarization that helps broaden access to technical texts and progress toward more usable abstractive summarization models in the biomedical domain.
For more information, see:
Main site: https://biolaysumm.org/
CodaLab page - subtask 1: https://codalab.lisn.upsaclay.fr/competitions/9541
CodaLab page - subtask 2: https://codalab.lisn.upsaclay.fr/competitions/9544
Detailed descriptions of the motivation, the tasks, and the data are also published in:
Goldsack, T., Zhang, Z., Lin, C., Scarton, C. Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature. EMNLP 2022.
Luo, Z., Xie, Q., Ananiadou, S. Readability Controllable Biomedical Document Summarization. EMNLP 2022 Findings.
Task 2 Organizers:
Chenghua Lin, Deputy Director of Research and Innovation in the Computer Science Department, University of Sheffield.
Sophia Ananiadou, Turing Fellow, Director of the National Centre for Text Mining and Deputy Director of the Institute of Data Science and AI at the University of Manchester.
Carolina Scarton, Computer Science Department at the University of Sheffield.
Qianqian Xie, National Centre for Text Mining (NaCTeM).
Tomas Goldsack, University of Sheffield.
Zheheng Luo, the University of Manchester.
Zhihao Zhang, Beihang University.
Organizers
Dina Demner-Fushman, US National Library of Medicine
Kevin Bretonnel Cohen, University of Colorado School of Medicine
Sophia Ananiadou, National Centre for Text Mining and University of Manchester, UK
Jun-ichi Tsujii, National Institute of Advanced Industrial Science and Technology, Japan