(Apologies for cross-posting!) 1st CFP MEDDOPLACE Shared Task @ IberLEF/SEPLN2023 [Medical Documents PLAce-related Content Extraction]
Info:
- Web: https://temu.bsc.es/meddoplace/
- Registration: https://temu.bsc.es/meddoplace/registration
- Data: https://zenodo.org/record/7707567
- Guidelines: https://zenodo.org/record/7775235
A. INTRODUCTION:
Location information represents one of the most relevant types of entities for high impact practical NLP solutions, resulting in a variety of applications adapted to different languages, content types and text genres.
Despite these previous efforts, the use, application, and exploitation of location-related entity detection (including sociodemographic information as well as more domain-specific entities such as clinical departments) in medical content has not been sufficiently addressed. The performance of general-purpose location NER systems applied to clinical texts is still poor: they usually cover only general geolocation mentions, lack sufficient granularity, and do not normalize or link the extracted locations to widely used geocoding resources, terminologies or vocabularies (like PlusCodes, GeoNames, or SNOMED CT concepts), which hinders the practical exploitation of the generated results.
To address these issues we organize the MEDDOPLACE shared task (part of the IberLEF/SEPLN2023 initiative) devoted to the recognition, normalization and classification of location and location-related concept mentions for high impact healthcare data mining scenarios.
For this task we will release the MEDDOPLACE corpus, a large collection of clinical case reports written in Spanish that were exhaustively annotated manually by linguists and medical experts to label location-relevant entity mentions, following detailed annotation guidelines and entity linking procedures.
The practical implications of this task include:
- Patient management: The detection of locations, the origin of patients and their language is relevant for healthcare safety, management, patient communication and appropriate treatment options.
- Diagnosis & prognosis: Location information is important for the diagnosis or prognosis of some diseases that are more endemic to certain regions or particular geographical environments.
- Health risk factors: Geolocation information can be a risk factor in case of exposure to radiation, work-related or environmental contaminants affecting patients' health.
- Mobility: Due to the increasing mobility of populations, detection of patients' travels and movements can improve early detection and tracing of infectious disease outbreaks, and thus enable preventive measures to be taken.
- Traceability: The detection of medical departments, facilities and services is critical to support the traceability of the patient’s route through the health services.
The expected results as well as provided resources for this task show a clear multilingual adaptation potential and could have an impact beyond healthcare documents, being relevant for processing tourism-related content (traveling) or even legal texts.
B. TASK DESCRIPTION:
The MEDDOPLACE task is structured into three subtracks:
- MEDDOPLACE-NER: Given a collection of plain text documents, systems have to return the exact character offsets of all location and location-related mentions.
- MEDDOPLACE-NORM: Given a collection of entities and their origin in text, systems have to normalize them to their corresponding GeoNames (Toponym Resolution), PlusCodes (POIs Toponym Resolution) and SNOMED CT (Entity Linking) concepts, depending on entity type.
- MEDDOPLACE-CLASS: Classification of detected location entities into four subcategories of clinical relevance (patient’s place of origin; place of residence; place where the patient has traveled to/from; place where the patient has received medical attention).
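
To illustrate what "exact character offsets" means for the NER subtrack, here is a minimal Python sketch. It is not the official baseline or submission format (those are defined by the task guidelines). The names `Mention` and `find_mentions`, the tiny gazetteer and the Spanish sample sentence are all hypothetical and made up for illustration.

```python
import re
from typing import List, NamedTuple


class Mention(NamedTuple):
    start: int  # offset of the first character (inclusive)
    end: int    # offset one past the last character (exclusive)
    text: str   # surface form as it appears in the document


def find_mentions(document: str, gazetteer: List[str]) -> List[Mention]:
    """Return every whole-word occurrence of a gazetteer term with its offsets.

    Hypothetical helper: a dictionary lookup, not a trained NER system.
    """
    mentions = []
    for term in gazetteer:
        # re.escape guards against regex metacharacters inside place names.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, document):
            mentions.append(Mention(match.start(), match.end(), match.group()))
    # Sort by start offset so the output follows reading order.
    return sorted(mentions)


# Invented example sentence (not taken from the MEDDOPLACE corpus).
doc = "Paciente natural de Bolivia, residente en Barcelona desde 2015."
mentions = find_mentions(doc, ["Barcelona", "Bolivia"])

for m in mentions:
    # Slicing the document with the offsets must reproduce the mention text.
    print(m.start, m.end, doc[m.start:m.end])
```

The key property is that slicing the original document with the returned offsets gives back the mention string exactly, so any preprocessing that shifts characters (for example, whitespace normalization) would have to be undone before reporting offsets.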
Publications and IBERLEF/SEPLN2023 workshop
Teams participating in MEDDOPLACE will be invited to contribute a system description paper to the IberLEF (SEPLN 2023) Working Notes proceedings and to give a short presentation of their approach at the IberLEF 2023 workshop.
Tentative Schedule:
- Train set: March 27th, 2023
- Test set release (start of evaluation period): April 3rd, 2023
- End of evaluation period (system submissions): May 10th, 2023
- Working papers submission: June 5th, 2023
- Notification of acceptance (peer-reviews): June 23rd, 2023
- Camera-ready system descriptions: July 6th, 2023
- IberLEF @ SEPLN 2023: September 27th-29th, 2023
Organizers:
MEDDOPLACE is organized by the Barcelona Supercomputing Center’s NLP for Biomedical Information Analysis group, together with external collaborators:
- Martin Krallinger, Barcelona Supercomputing Center, Spain
- Salvador Lima, Barcelona Supercomputing Center, Spain
- Eulàlia Farré, Barcelona Supercomputing Center, Spain
- Luis Gascó, Barcelona Supercomputing Center, Spain
- Vicent Briva-Iglesias, D-REAL, Dublin City University, Ireland