We are pleased to announce the inaugural offering of the Plain Language Adaptation of Biomedical Abstracts (PLABA) track, as part of the 2023 Text Analysis Conference (TAC) hosted by the U.S. National Institute of Standards and Technology (NIST). This track is an opportunity to showcase your cutting-edge research on an important topic, and to take advantage of large amounts of expert-annotated data and manual evaluation.
Background: Deficits in health literacy are linked to worse health outcomes and drive health disparities. Though unprecedented amounts of biomedical knowledge are available online, patients and caregivers face a type of “language barrier” when confronted with jargon and academic writing. Advances in language modeling have improved plain language generation, but the task of automatically and accurately adapting biomedical text for a general audience has thus far lacked high-quality, standardized benchmarks.
Task: Systems will adapt biomedical abstracts to plain language. This includes substituting medical jargon, providing explanations for necessary terms, simplifying sentences, and other modifications. The training set is the publicly available PLABA dataset (https://doi.org/10.1038/s41597-022-01920-3), which contains 750 abstracts with manual, sentence-aligned adaptations for each, totaling more than 7k sentence pairs with document context.
Evaluation: Participating systems will be evaluated on 400 held-out abstracts, each manually adapted four times by different annotators to support robust automatic metrics. Additionally, a subset of system output will be manually evaluated along several axes to assess whether the adaptations are accurate and faithful to the original, which is crucial for the biomedical domain.
URL: https://bionlp.nlm.nih.gov/plaba2023/
Mailing list: https://groups.google.com/g/plaba2023
Key dates:
Jul 19 – Evaluation data released
Aug 16 – Submissions due
Oct 18 – Results posted
We look forward to your submissions.