(Apologies for cross-posting)
CFP: SYMPTEMIST Shared Task (BioCreative VIII run with AMIA 2023)
Named entity recognition and linking of symptoms, signs & findings (incl. multilingual dataset)
https://temu.bsc.es/SYMPTEMIST/ https://temu.bsc.es/distemist/
The SYMPTEMIST track focuses on the automatic detection of mentions of clinical symptoms (NER) in clinical case reports in Spanish and on mapping those mentions to concept identifiers (entity linking). A multilingual version of the dataset will also be released, including versions in English, French, Italian, Dutch, Portuguese, Romanian and Swedish.
Key information:
- Web: https://temu.bsc.es/symptemist
- Data: https://doi.org/10.5281/zenodo.6408476 https://zenodo.org/record/8223654
- Annotation guidelines: https://zenodo.org/record/8246440
- BioCreative web: https://biocreative.bioinformatics.udel.edu
- Registration form (Track 2 - SYMPTEMIST): https://temu.bsc.es/distemist/registration/ https://docs.google.com/forms/d/e/1FAIpQLScoSNulOoxRju3c8v9Q-CSv-w5jJcXu93G7...
Motivation
Systems able to detect and normalize clinical symptom mentions in medical texts are crucial for almost any healthcare data mining, AI, medical analytics or predictive application. As opposed to other clinical information types, such as diagnoses (diseases/procedures), lab test results or even medications, clinical symptoms can only be recovered directly from written clinical narratives. Due to the high complexity, variability and difficulty of generating annotated corpora for clinical symptoms, only a few large manually annotated data collections have been constructed so far, and these have certain underlying limitations: a) limited entity linking / normalization of the symptom mentions to controlled vocabularies, b) a lack of attempts to promote the development of multilingual solutions and c) a lack of detailed annotation criteria and guidelines. To address these issues, we have proposed the SYMPTEMIST track at the upcoming BioCreative VIII initiative, which will be run in the context of the prestigious AMIA 2023 conference, which received over 1400 submissions this year.
Automatic detection of symptom mentions is key for a range of clinical use cases and real-world applications, such as:
- Predictive modeling of diseases
- Differential diagnosis of complex diseases
- Rare disease characterization & analysis
- Selection of appropriate treatment & therapy
- Study of disease-symptom associations
- Early detection of disease outbreaks & epidemiological surveillance
- Extraction of phenotypes
- Drug repurposing & off-label indications
The SYMPTEMIST organizers will also release multilingual resources to foster the development of multilingual tools and generate systems not only for Spanish but also for content in English and Romance languages (French, Portuguese, Italian, Romanian and Catalan) as well as versions in Dutch, Swedish and Czech.
Inspired by previous initiatives (e.g. n2c2, CLEF or TREC) and shared tasks (CANTEMIST, PharmaCoNER, or CodiEsp), we are launching the SYMPTEMIST shared task as part of the BioCreative 2023 evaluation initiative, with the following three sub-tracks:
- SYMPTEMIST-entities: automatic detection of mentions of symptoms.
- SYMPTEMIST-linking: finding mentions of symptoms and normalizing them to their SNOMED CT concept identifiers.
- SYMPTEMIST-multilingual: automatic detection of mentions of symptoms in versions of the corpus generated in English, French, Italian, Portuguese, Romanian, Catalan, Dutch, Swedish and Czech.
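To make the difference between the entities and linking sub-tracks concrete, here is a minimal Python sketch of the two steps on one Spanish sentence. Everything in it is an assumption made for illustration: the toy lexicon, the placeholder identifiers (which are NOT real SNOMED CT codes), the example sentence and the function names. It does not reflect the official data format, the evaluation setup or any baseline system, and a plain dictionary lookup is far simpler than what a competitive system would use.

```python
import re
from typing import List, NamedTuple, Optional

# Hypothetical toy lexicon: surface form -> placeholder concept identifier.
# These identifiers are invented for illustration; they are NOT SNOMED CT codes.
TOY_LEXICON = {
    "fiebre": "TOY-0001",
    "dolor abdominal": "TOY-0002",
    "tos": "TOY-0003",
}


class Mention(NamedTuple):
    start: int                  # character offset where the mention begins
    end: int                    # character offset just past the mention
    text: str                   # surface string as it appears in the document
    concept_id: Optional[str]   # filled in by the linking step


def detect_mentions(text: str) -> List[Mention]:
    """NER step: find character spans of lexicon terms (whole words only)."""
    mentions = []
    for term in TOY_LEXICON:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            mentions.append(Mention(match.start(), match.end(), match.group(0), None))
    return sorted(mentions)


def link_mentions(mentions: List[Mention]) -> List[Mention]:
    """Linking step: attach a concept identifier to each detected mention."""
    return [m._replace(concept_id=TOY_LEXICON.get(m.text.lower())) for m in mentions]


example = "Paciente con fiebre y dolor abdominal de dos días de evolución."
linked = link_mentions(detect_mentions(example))
for mention in linked:
    print(mention.start, mention.end, mention.text, mention.concept_id)
```

In this sketch the entities step only produces spans (start, end, text), while the linking step additionally assigns an identifier to each span. The multilingual sub-track would apply the detection step to the translated versions of the corpus.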
Tentative schedule
- Annotation Guidelines: August 8th 2023
- Train Set Subtask 1 (NER): August 8th 2023
- Train Set Subtask 2 (Linking): September 10th 2023
- Train Set Subtask 3 (Multilingual): September 10th 2023
- SympTEMIST Test Set: September 30th 2023
- Participants Test Predictions Deadline: October 5th 2023
- Participants Evaluation Results Release: October 10th 2023
- Submission of Participant Papers Deadline: October 22nd 2023
- Notification of Acceptance Participant Papers: October 30th 2023
- Submission of Camera-ready Participant Papers Deadline: November 1st 2023
- BioCreative VIII workshop @ AMIA 2023: November 11-15, 2023, in New Orleans, LA.
BioCreative proceedings and AMIA workshop
Teams participating in SYMPTEMIST will be invited to contribute a system description paper to the BioCreative 2023 Working Notes proceedings and to give a flash presentation of their approach at the BioCreative 2023 session. The BioCreative VIII workshop will run with AMIA 2023, November 11-15, 2023, in New Orleans, LA. See: https://amia.org/education-events/amia-2023-annual-symposium
Workshop Proceedings and Special Issue:
The BioCreative VIII Proceedings will host all the submissions from participating teams and will be freely available by the time of the workshop. In addition, we are happy to announce that the journal Database will host the BioCreative VIII special issue for work that has passed its peer-review process. Invitations to submit will be sent after the workshop.
All BioCreative VIII tracks
Track 1: BioRED (Biomedical Relation Extraction Dataset)
*Track 2: SYMPTEMIST (Symptom TExt Mining Shared Task)
Track 3: Genetic Phenotype Extraction and Normalization from Dysmorphology Physical Examination Entries
Track 4: Clinical Annotation Tool Track
Main Organizers
- Martin Krallinger, Barcelona Supercomputing Center, Spain
- Eulàlia Farré-Maduell, Barcelona Supercomputing Center, Spain
- Luis Gascó, Barcelona Supercomputing Center, Spain
- Salvador Lima, Barcelona Supercomputing Center, Spain
- Jan Rodriguez, Barcelona Supercomputing Center, Spain
=======================================
Martin Krallinger, Dr.
Head of NLP for Biomedical Information Analysis Unit
Barcelona Supercomputing Center (BSC-CNS)
https://www.linkedin.com/in/martin-krallinger-85495920/
=======================================