Dear all
As part of the new EU ATRIUM project we have opportunities for short-term fully funded research visits to work on development and application of NLP methods and tools to archaeological data. We have models, tools and data available for you to use. Visitors must come from an organisation outside the UK.
For more information and applications, please visit https://atrium-research.eu/travel-grants/
Diana Maynard
Senior Research Fellow
Computer Science Dept.
University of Sheffield, UK
Transnational Access Grants: Calls for Applications
The ATRIUM project invites researchers to apply to participate in Transnational Access training visits to support their research. ATRIUM’s Transnational Access (TNA) scheme offers researchers the possibility to apply for a fully funded placement at several different partner organisations to access expert knowledge and advice from leading Data Management organisations across Europe.
The TNA scheme aims to recruit and support approximately 200 Arts and Humanities researchers with mentorship and access to knowledge, data and tools from 14 different institutions across Europe. Researchers who are successful in their applications will be supported to visit the infrastructure providers in our consortium in person, benefiting from direct contact, knowledge sharing and network building. In total, 388 weeks of Transnational Access will be provided during the ATRIUM project.
There are two types of TNA applications:
Individual Access – These are individual applications based on a specific research topic proposed by the applicant that match the specialisms of the host organisation.
Summer School Access – These are fixed events during the year that provide access for a group of researchers based on a set of predetermined specialised topics.
The first collection date is 31 May, 2024, and applicants will be notified by 28 June, 2024. Calls for applications will be issued several times per year throughout the duration of the project (March 2024 to December 2028).
Individual Access applications will be offered on a rolling basis with a deadline every three months. Summer Schools will be offered 1 to 2 times a year with a fixed deadline 3 to 4 months ahead of the scheduled event.
Visit www.atrium-research.eu <https://atrium-research.eu/travel-grants/> for more information.
PhD Title: Information Rating and Analysis of Knowledge Dynamics.
Application to the Temporal Monitoring of the Reliability of
Bibliographic Information on Insects as Vectors of Plant Pathogens.
We are looking for candidates who have proven knowledge of NLP and
Machine Learning. The deadline is June 30th.
Detailed description of the doctoral project HERE
<https://maiage.inrae.fr/sites/default/files/2024-04/ADUM%20INRAE%202024-eng…>
-----------
## Research subject
The PhD is set within the framework of the research project on NLP for
insect monitoring in plant health. The central aim of the PhD project is
to develop original approaches for assessing the reliability of textual
information, by integrating linguistic dimensions and knowledge graphs
(NLP, language models), and dynamic dimensions (time series). The quality
and relevance of the extracted information will be derived from the
documents collected over time and from the existing knowledge base.
For plant health and risk management, the biological interaction between
insect vectors, pathogens, and host plants is of primary interest for
anticipating contamination and reducing pesticide use.
You will be affiliated with the Computer Science Graduate School at
Paris-Saclay University. You will be employed by INRAE.
## Requirements
A successful candidate will have an MSc or equivalent in Artificial
Intelligence, together with:
* Proven experience with applying natural language processing.
* Interest in learning about biology or bioinformatics.
* High level of academic English or French, both written and spoken.
* Good programming skills in Python or Java (and preferably experience
  with deep learning tools).
* Capacity to work as part of a team in a multidisciplinary framework.
* Experience of applied research in Life Science is an asset.
## Offer
We offer a motivating research environment with many opportunities for
in-house, national and international collaborations and access to
computing GPU resources and state-of-the-art research equipment. The
gross salary per month for the three-year contract is 2100 (in 2024) to
2300 (in 2026) including the social security package (healthcare,
pensions, unemployment benefits).
## Start date: October 2024
Location: Paris-Saclay University campus, mainly at the MaIAGE lab [1]
and MIA-Paris-Saclay [2]. You will also visit PHIM [3] at the INRAE
research center in Montpellier (South of France).
## Application
The closing date is June 30th 2024
Interested candidates should send their application files to Claire
Nédellec (claire.nedellec(a)inrae.fr), Vincent Guigue, Nicolas Sauvion
(Nicolas.sauvion(a)inrae.fr), and Robert Bossy (Robert.bossy(a)inrae.fr).
It should comprise:
* a CV (max 5 pages) with transcripts (Master), diplomas, internships
* a cover letter
* the names and contact details of two referees for reference letters
[1] https://maiage.inrae.fr/en/bibliome
[2] https://vguigue.github.io/
[3] https://umr-phim.cirad.fr/en/recherche/comprendre-les-epidemies-dans-les-ch…
*Apologies for cross-posting*
First Workshop on Knowledge-Enhanced Machine Translation
Sheffield, United Kingdom, on June 27, 2024
https://kemt2024.wixsite.com/home
First Call for Papers
The 1st edition of the Workshop on Knowledge-Enhanced Machine Translation
will be held in Sheffield, United Kingdom, on June 27, 2024, co-located
with EAMT 2024. We welcome submissions either of research papers or
extended abstracts/industry reports. Full research papers should describe
original, unpublished content, while extended abstracts are open to
reporting preliminary results of ongoing research. Industry reports should
demonstrate the impact of conceptual modelling in a real-world setting,
arguing for generalisability of methods and lessons learned. Potential
submission topics encompass, but are not restricted to:
- Integration of external terminology and constrained decoding
- Integration of translation memories and similar translations from
  external sources
- Leveraging any kind of linguistic information
- Data augmentation techniques
- Using large language models to integrate external resources
- Knowledge graphs
- Integration of translation quality indicators for improving final MT
  output
- Quality assessment of knowledge-enhanced MT systems
- Utilising quality estimation systems for improving MT performance
Submission information
The workshop accepts submissions in two different modalities:
- Full research papers: Submissions will be accepted as papers of 4 to
  10 pages (plus unlimited pages for references and appendices).
- Extended abstracts/industry reports: Submissions will be accepted as
  papers of up to 2 pages. The references are not included in the 2-page
  limit.
Accepted submissions will be presented either as posters or oral
communications, as decided by the program committee. Accepted submissions
will be published online as proceedings included in the ACL Anthology,
unless the authors specify otherwise.
Submissions should be formatted according to the EAMT 2024 guidelines (PDF,
LaTeX, Word)
<https://eamt2024.sheffield.ac.uk/conference-calls/2nd-call-for-papers#h.45w…>
and submitted in PDF through the OpenReview page (
https://openreview.net/group?id=EAMT.org/2024/Workshop/KEMT ).
Important dates
- Workshop paper/abstracts due: 15th April 2024
- Notification of acceptance: 15th May 2024
- Camera-ready papers/extended abstracts due: 27th May 2024
- Workshop date: 27th June 2024
Follow us on Twitter: https://twitter.com/kemt2024
--
Miquel Esplà-Gomis
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
Carretera de Sant Vicent del Raspeig s/n
03690 Sant Vicent del Raspeig, Alacant (Spain)
Tel: +34 965903400 ext. 2424
The next meeting of the Edge Hill Corpus Research Group will take place online (via MS Teams) on Thursday 25 April 2024, 2:00-3:30 pm (UK time).
Attendance is free. You can register here:
https://store.edgehill.ac.uk/conferences-and-events/conferences/events/edge…
Topics: Corpus Methodology, Large Language Models
Speakers: Sylvia Jaworska<https://www.reading.ac.uk/elal/staff/dr-sylvia-jaworska> (University of Reading, UK) & Mathew Gillings<https://www.wu.ac.at/ebc/about-us/team/mathew-gillings/> (Vienna University of Economics and Business, Austria)
Title: How humans vs. machines identify discourse topics: an exploratory triangulation
Abstract
Identifying discourses and discursive topics in a set of texts has been of interest not only to linguists, but also to researchers working across the social sciences. Traditionally, these analyses have been conducted based on small-scale interpretive analyses of discourse which involve some form of close reading. Naturally, however, that close reading is only possible when the dataset is small, and it leaves the analyst open to accusations of bias and/or cherry-picking.
Designed to avoid these issues, other methods have emerged which involve larger datasets and have some form of quantitative component. Within linguistics, this has typically been through the use of corpus-assisted methods, whilst outside of linguistics, topic modelling is one of the most widely-used approaches. Increasingly, researchers are also exploring the utility of LLMs (such as ChatGPT) to assist analyses and identification of topics. This talk reports on a study assessing the effect that analytical method has on the interpretation of texts, specifically in relation to the identification of the main topics. Using a corpus of corporate sustainability reports, totalling 98,277 words, we asked 6 different researchers, along with ChatGPT, to interrogate the corpus and decide on its main ‘topics’ via four different methods. Each method gradually increases in the amount of context available.
• Method A: ChatGPT is used to categorise the topic model output and assign topic labels;
• Method B: Two researchers were asked to view a topic model output and assign topic labels based purely on eyeballing the co-occurring words;
• Method C: Two researchers were asked to assign topic labels based on a concordance analysis of 100 randomised lines of each co-occurring word;
• Method D: Two researchers were asked to reverse-engineer a topic model output by creating topic labels based on a close reading.
The talk explores how the identified topics differed both between researchers in the same condition and between researchers in different conditions, shedding light on some of the mechanisms underlying topic identification by machines vs humans or machines assisted by humans. We conclude with a series of tentative observations regarding the benefits and limitations of each method, along with suggestions for researchers in selecting an analytical approach for discourse topic identification. While this study is exploratory and limited in scope, it opens up a way for further methodological and larger-scale triangulations of corpus-based analyses with other computational methods, including AI.
Dear all,
We are excited to announce the upcoming "Celtic Languages in the Digital Age" workshop, scheduled for April 9, 2024, at Lancaster University.
This hybrid event of talks and a panel discussion, organised by the UCREL NLP Group and funded by the Faculty of Science and Technology's Research Catalyst Fund, aims to address the critical need for linguistic resources supporting Celtic languages.
Event Details:
- Date: April 9, 2024
- Location: Lancaster University
- Format: Hybrid with online broadcast (free registration)
- Register to attend online: https://bit.ly/clida2024
The workshop will gather experts in Celtic languages, linguistics, corpus linguistics, computer science, and computational linguistics to explore the development of language models for under-resourced languages, including Welsh, Irish, Scottish Gaelic, Cornish, Breton, and Manx. We also have a talk on the use of transfer learning to create language models for low-resourced languages, taking Luxembourgish as a use case.
The programme, list of speakers and talk details can be found on the event's website: https://wp.lancs.ac.uk/celtic/
If you are in or near Lancaster and would like to attend in person, do get in touch with me, as we have a few places left. Attending in person is free, and lunch and refreshments will be provided on the day.
Best wishes,
Mo
--------------------------------
Dr Mo El-Haj
Senior Lecturer in NLP
Director of Admissions (SCC)
Co-Director of UCREL NLP Group<https://ucrel.lancs.ac.uk/>
Natural Language Engineering (NLE) Journal Editorial Board
https://www.cambridge.org/core/journals/natural-language-engineering
Advisory Board of the Natural Language Processing Book Series
https://benjamins.com/catalog/nlp
School of Computing and Communications, Lancaster University
https://www.lancaster.ac.uk/staff/elhaj
@DocElhaj<https://twitter.com/DocElhaj>
*** Third Call for Papers ***
12th IEEE International Conference on Intelligent Systems (IS'24)
Invited Session on Intelligent Tools for e/m/d Learning
August 29-31, 2024, Golden Sands Resort, Varna, Bulgaria
https://www.ieee-is.org
(*** Submission Deadline: April 30, 2024 AoE ***)
This invited session aims to explore state-of-the-art advancements in intelligent tools
designed to revolutionize the landscape of e-learning (electronic learning), m-learning
(mobile learning) and d-learning (digital learning). With the rapid integration of technology
into education, there is a growing need to investigate and showcase innovative intelligent
solutions that can enhance the effectiveness of online, mobile and digital learning
environments. Intelligent solutions for the educational context can have a groundbreaking and
revolutionary impact on learners and faculty alike, with integrated educational tools offering
enhanced engagement, support for different learning levels, styles and abilities, and
personalized learning experiences with immediate feedback at their core.
Papers submitted to this session are expected to present results on various aspects of
intelligent tools for e-learning, m-learning and d-learning. Topics may include but are not
limited to adaptive learning systems, personalized learning experiences, artificial
intelligence-driven content creation, data analytics for educational insights, and the integration
of emerging technologies such as augmented reality and virtual reality in educational settings.
Authors are encouraged to share empirical research, case studies, and practical
implementations that demonstrate the impact of intelligent tools on learner/faculty
engagement, knowledge retention, and overall educational outcomes. The session aims to
provide a comprehensive overview of the current state-of-the-art in intelligent tools in
e-learning, m-learning and d-learning, fostering discussions on their implications for the
future of education and their potential to address challenges in diverse learning
environments and educational levels, as well as diverse learner ability.
PAPER SUBMISSION INSTRUCTIONS
Submitted papers should be in IEEE 2-column format and should adhere to the template
available here: https://www.ieee-is.org/wp-content/uploads/2024/02/IS_A4_format-AAu.docx
The expected paper length in camera-ready format should not exceed 6 pages.
Submissions should be made in PDF using EasyChair; the submission link is:
https://easychair.org/conferences/?conf=itl24 .
PUBLICATION
All accepted papers will be included in the IS'24 proceedings to be published by IEEE. The
proceedings of the previous editions of IS can be found here:
https://ieeexplore.ieee.org/xpl/conhome/1000395/all-proceedings .
Traditionally, extended versions of conference-selected papers appear within 1-2 years after
the conference dates in well-known international journals and/or post-conference books.
More information can be found on the conference web site
(https://www.ieee-is.org/publication-information/ ).
CONTACT POINT
For any additional information or clarification please contact the Invited Session Chair,
George A. Papadopoulos at george(a)ucy.ac.cy .
IMPORTANT DATES
• Paper submission: April 30, 2024, AoE
• Notification: May 20, 2024
• Camera Ready: June 6, 2024
• Author Registration: June 15, 2024
ORGANISATION
Committees
https://www.ieee-is.org/program-committee/
Invited Session Chair
• George A. Papadopoulos, University of Cyprus, Cyprus (george(a)ucy.ac.cy)
****We apologize for multiple postings of this e-mail****
IberLEF 2024 Task - HOPE: Approaching Hope Speech Detection in Social
Media from Two Perspectives, for Equality, Diversity and Inclusion and as
Expectations
Held as part of the evaluation forum IberLEF 2024
<https://sites.google.com/view/iberlef-2024/home?authuser=0> in the 40th
edition of the International Conference of the Spanish Society for Natural
Language Processing (SEPLN 2024
<http://sepln2024.infor.uva.es/en/front-page-english/>)
Valladolid, Spain, 24-27 September 2024
Codalab link: https://codalab.lisn.upsaclay.fr/competitions/17714
Dear All,
Hope, a crucial aspect of human psychology, profoundly shapes emotions,
behavior, and mood, influencing how individuals perceive and navigate
challenges (Bruininks and Malle, 2005; Snyder, 1994, 2000). High levels of
hope correlate with positive outcomes such as academic success and lower
depression rates, while low hope is associated with diminished well-being
(Snyder, 2002; Snyder et al., 1997; Diener, 2009). Despite its
significance, hope has been underexplored in Natural Language Processing
(NLP) until recent years. Efforts have been made to integrate NLP
techniques into the analysis of hope through shared tasks, like those
organized in ACL 2022, RANLP 2023, and IberLEF 2023 (Chakravarthi et al.,
2022; Kumaresan et al., 2023; Jiménez-Zafra et al., 2023). The upcoming
IberLEF 2024 edition aims to delve deeper into hope from two angles: hope
for equality, diversity, and inclusion, and hope as expectations. This
edition promises to expand understanding by examining hope across different
domains and languages, thus addressing crucial questions in hope speech
detection research. Two tasks are outlined in this description, each
focusing on different aspects of hope.
- Task 1 centers on "Hope for Equality, Diversity, and Inclusion,"
  emphasizing the importance of hope speech in mitigating hostility and
  supporting individuals facing challenges like illness, stress, or
  loneliness, particularly within vulnerable groups such as the LGBT
  community and racial minorities. Given a Spanish tweet, the task
  consists of identifying whether it contains hope speech or not. The
  possible categories for each text are:
  - hs: hope speech.
  - nhs: non hope speech.
- Task 2 delves into "Hope as Expectations," highlighting hope's role
  as an anticipatory mindset shaping human emotions and behaviors,
  especially in the context of social media where expressions are
  abundant. This task aims to analyze the presence of hope speech in
  English and Spanish texts, focusing on binary hope speech detection and
  multiclass hope speech detection. The subtasks are as follows:
  - Subtask 2a - Binary hope speech detection: A given text in
    English/Spanish will be classified as:
    - Hope
    - Not Hope
  - Subtask 2b - Multiclass hope speech detection: A given text in
    English/Spanish will be classified as:
    - Generalized Hope
    - Realistic Hope
    - Unrealistic Hope
    - Not Hope
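As a rough illustration of the two label sets used in Task 2, the sketch
below encodes them as Python enums. The class and function names are
hypothetical, and the mapping from the multiclass labels to the binary
ones is an assumption made here for illustration (that every hope
subtype counts as "Hope"); it is not something the organizers specify in
this announcement.

```python
from enum import Enum


class BinaryLabel(str, Enum):
    """Labels for Subtask 2a (binary hope speech detection)."""
    HOPE = "Hope"
    NOT_HOPE = "Not Hope"


class MulticlassLabel(str, Enum):
    """Labels for Subtask 2b (multiclass hope speech detection)."""
    GENERALIZED = "Generalized Hope"
    REALISTIC = "Realistic Hope"
    UNREALISTIC = "Unrealistic Hope"
    NOT_HOPE = "Not Hope"


def to_binary(label: MulticlassLabel) -> BinaryLabel:
    """Collapse a 2b label to a 2a label.

    Assumption: the three hope subtypes all map to "Hope".
    """
    if label is MulticlassLabel.NOT_HOPE:
        return BinaryLabel.NOT_HOPE
    return BinaryLabel.HOPE


# Collapse a few multiclass predictions to binary ones.
predictions = [MulticlassLabel.REALISTIC, MulticlassLabel.NOT_HOPE]
binary = [to_binary(p) for p in predictions]
```
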
In both tasks, there will be a real-time leaderboard and the participants
will be allowed to make a maximum of 10 submissions through CodaLab, from
which each team will have to select the best one for ranking.
The dataset details and registration are available at:
https://codalab.lisn.upsaclay.fr/competitions/17714
Best regards,
The HOPE 2024 organizing committee
Important dates
- Release of training + development corpora: Feb 16, 2024.
- Release of test corpora and start of evaluation campaign: April 1, 2024.
- End of evaluation campaign (deadline for runs submission): Apr 16, 2024.
- Publication of official results: Apr 18, 2024.
- Paper submission: May 14, 2024.
- Review notification: Jun 11, 2024.
- Camera-ready submission: Jun 28, 2024.
- IberLEF Workshop (SEPLN 2024): Sep 27, 2024.
- Publication of proceedings: Sep ??, 2024.
Organizing Committee
- Daniel García-Baena, SINAI, Universidad de Jaén, Spain.
- Fazlourrahman Balouchzahi, CIC IPN, Mexico.
- Salud María Jiménez-Zafra, SINAI, Universidad de Jaén, Spain.
- Sabur Butt, Institute for the Future of Education (IFE) at Tecnológico
  de Monterrey, Mexico.
- Miguel Ángel García-Cumbreras, SINAI, Universidad de Jaén, Spain.
- Atnafu Lambebo Tonja, Centro de Investigación en Computación, Instituto
  Politécnico Nacional (IPN), Mexico.
- José Antonio García-Díaz, UMUTeam, Universidad de Murcia, Spain.
- Selen Bozkurt, Department of Biomedical Informatics, School of Medicine,
  Emory University.
- Bharathi Raja Chakravarthi, University of Galway, Ireland.
- Hector G. Ceballos, Institute for the Future of Education (IFE) at
  Tecnológico de Monterrey, Mexico.
- Rafael Valencia-García, UMUTeam, Universidad de Murcia, Spain.
- Grigori Sidorov, CIC IPN, Mexico.
- L. Alfonso Ureña-López, SINAI, Universidad de Jaén, Spain.
- Alexander Gelbukh, CIC IPN, Mexico.
*Sabur Butt, Ph.D. *(He/Him)
Institute for the Future of Education (IFE)
*Tecnológico de Monterrey, Mexico*
Address: Av. Eugenio Garza Sada 2501 Sur Tecnológico, 64849 Monterrey, N.L.
LinkedIn <https://www.linkedin.com/in/saburb> - GitHub
<https://github.com/saburbutt> - Scholar
<https://scholar.google.com/citations?user=re7md-0AAAAJ&hl=en> - Website
<https://saburbutt.github.io/>
The Department of Computer Science at the IT University of Copenhagen is
offering a PhD position in Natural Language Processing/Computational
Linguistics, with a start date of *1 September 2024*. The *application
deadline is 15 April 2024.* Applications for the position can be
submitted via the ITU job portal
<https://candidate.hr-manager.net/ApplicationInit.aspx?cid=119&ProjectId=181…>.
*Proposed project title: *Linguistic competence of language models in
prompt interpretation and text generation
*Proposed project description.* Recent generative systems based on
pre-trained language models are remarkably fluent when generating even
relatively exotic kinds of text, such as limericks or texts in early middle
English. At the same time, they remain sensitive to slight variation in the
wording of the prompts.
The proposed project will investigate this difference in competence for
different linguistic phenomena when the model interprets its instructions
(prompts) and generates text in response to prompts. *The specific focus of
the project is negotiable*; it could be syntactic constructions, variations
in lexical semantics, text registers, etc. Model competence will be assessed
with respect to analysis of its pre-training data (hence, an open-source
model will be used).
The ideal candidate would have a strong background in computational
linguistics, as well as core skills in programming in Python and machine
learning. To enter the PhD program in Denmark, an M.Sc. or equivalent
degree is required. For this position, it is also possible to start as a
Master's student if extra ECTS credits are needed (for students who
currently have 60-115 MA ECTS).
The successful candidate will be a member of the national Pioneer Centre
for Artificial Intelligence <https://aicentre.dk/>, a 5-university Danish
research endeavor, and of the NLPnorth <https://nlpnorth.github.io/> research
group at the IT University’s Computer Science Department. Both the centre
and research group are highly international and well-funded, working on a
broad range of research topics.
The project will be co-supervised by Associate Professors Anna Rogers
<https://annargrs.github.io/> (arog(a)itu.dk) and Rob van der Goot
<https://robvanderg.github.io/> (robv(a)itu.dk), to whom inquiries about the
project can be directed.
--
Anna Rogers
Associate Professor
IT University of Copenhagen
http://annargrs.github.io/
I have a postdoc opening in my lab with *applications due April 10th*. See
Bullard Research Fellow (BRF) area 7 (“*BRF7*”) in the job ad here:
https://apply.interfolio.com/142711.
"BRF7) We seek applicants in natural language processing (NLP), information
retrieval (IR), and human computation & crowdsourcing (HCOMP). Our work on
responsible AI develops methods for model explanations and fairness. We
build automated and human-in-the-loop models. We develop general methods to
advance the state-of-the-art, grounded in social challenges like curbing
disinformation and hate speech. A variety of our ongoing work touches on
large language models (LLMs). This position will be mentored by Matt Lease
<https://www.ischool.utexas.edu/~ml/>, as part of his lab for Artificial
Intelligence and Human-Centered Computing <http://ai.ischool.utexas.edu/>
(AI&HCC), and provide collaboration opportunities in UT Austin’s
campus-wide Good Systems <http://goodsystems.utexas.edu/> grand challenge
for responsible AI."
Please see the job ad for full details about the opening:
https://apply.interfolio.com/142711.
--
Matt Lease
Professor
Information & Computer Science
University of Texas at Austin
Voice: (512) 471-9350 · Fax: (512) 471-3971 · Office: UTA 5.536
http://www.ischool.utexas.edu/~ml
Apologies for cross-posting.
---------------------------------------------------------------------------
There is still time to participate in this shared task---evaluation phase from April 17 to 24.
**Social Media Mining For Health 2024**
https://healthlanguageprocessing.org/smm4h-2024/
The Social Media Mining for Health (SMM4H) workshop and shared tasks have been running successfully since 2016. They now go into the 9th round, with the workshop being co-located at ACL 2024 in Bangkok.
https://2024.aclweb.org/
Bangkok, Thailand, August 12–17, 2024
**Important Dates for all SMM4H Shared Tasks**
Training data available: January 10, 2024
CodaLab Available: January 17, 2024
Evaluation Phase: April 17 - 24, 2024
System description paper due: May 17, 2024
Paper acceptance notification: June 17, 2024
Camera-ready papers due: July 1, 2024
Workshop in Bangkok, Thailand: August 15, 2024
**Task 2: Task Description**
Adverse Drug Events (ADEs) are negative medical side effects related to a drug. Mining ADEs from user-generated text has become a popular topic and is an important use case for research, as it could help detect crowd signals from users online. Being able to make use of information across languages by applying multi-lingual methods further supports this endeavor.
Our task targets the languages *German, French and Japanese* and is split into two subtasks. Subtask 2a focuses on Named Entity Recognition (NER) of medication, disorder, and function mentions from user-generated texts. Subtask 2b performs joint NER and Relation Extraction (RE) to determine if these disorders are ADEs by finding the correct relations between medications, disorders and functions. We distinguish two types of relations between medication mentions and disorder/function mentions:
- "caused": the disorder/function was caused by a medication, i.e., the disorder/function is an ADE
- "treatment_for": the disorder/function is the reason for the medication, i.e., the medication is supposed to treat the disorder/function
*Tasks*:
Participants can choose between participating in subtask 2a, or subtask 2b, or both.
~~ We explicitly encourage the submission of new and creative approaches! ~~
- Subtask 2a) Named entity recognition of the entities "drug", "disorder" and "function" from user-generated texts.
- Subtask 2b) Joint named entity and relation extraction of the entities "drug", "disorder" and "function" and the relations "caused" and "treatment_for".
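To make the entity and relation types above concrete, here is a minimal Python sketch of how a system's output for subtask 2b could be represented in memory. It is only an illustration: the class names, the character-offset fields and the helper functions are hypothetical, the example sentence is invented, and this is not the official submission format (see the task website for that).

```python
from dataclasses import dataclass

# Entity and relation labels as named in the task description.
ENTITY_TYPES = {"drug", "disorder", "function"}
RELATION_TYPES = {"caused", "treatment_for"}


@dataclass(frozen=True)
class Entity:
    start: int   # character offset where the mention begins
    end: int     # character offset just past the mention
    label: str   # one of ENTITY_TYPES
    text: str    # surface form of the mention

    def __post_init__(self):
        if self.label not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.label}")


@dataclass(frozen=True)
class Relation:
    drug: Entity     # the medication mention
    target: Entity   # the disorder/function mention
    label: str       # one of RELATION_TYPES

    def __post_init__(self):
        if self.label not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.label}")
        if self.drug.label != "drug" or self.target.label == "drug":
            raise ValueError("relations link a drug to a disorder/function")


def find_entity(text: str, surface: str, label: str) -> Entity:
    """Hypothetical helper: locate the first occurrence of a mention."""
    start = text.index(surface)
    return Entity(start, start + len(surface), label, surface)


def adverse_drug_events(relations):
    """Disorders/functions linked by "caused" are the ADEs."""
    return [r.target for r in relations if r.label == "caused"]


# Invented example sentence, for illustration only.
text = "Aspirin helped my headache but gave me nausea."
aspirin = find_entity(text, "Aspirin", "drug")
headache = find_entity(text, "headache", "disorder")
nausea = find_entity(text, "nausea", "disorder")
relations = [
    Relation(aspirin, headache, "treatment_for"),
    Relation(aspirin, nausea, "caused"),
]
ades = adverse_drug_events(relations)
print([e.text for e in ades])
```

In this sketch, subtask 2a corresponds to producing the `Entity` objects only, while subtask 2b additionally produces the `Relation` objects.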
*Data*:
The data originates from social media platforms, e.g., patient fora and X (Twitter). We provide data in German and Japanese, and a few examples in French. The submitted systems will be evaluated on German, French and Japanese data. Please find more information here: https://healthlanguageprocessing.org/smm4h-2024/
Please use this form to register: https://forms.gle/7w4si27uJrCMiTyL8
Organizers of Subtask 2:
Pierre Zweigenbaum, Université Paris-Saclay, CNRS, LISN, France
Sebastian Möller, Technische Universität Berlin, DFKI GmbH, Germany
Roland Roller, DFKI GmbH, Germany
Philippe Thomas, DFKI GmbH, Germany
Eiji Aramaki, NAIST, Japan
Shoko Wakamiya, NAIST, Japan
Shuntaro Yada, NAIST, Japan
Katherine Yeh, Université Paris-Saclay, CNRS, LISN, France
Lisa Raithel, Technische Universität Berlin, Germany & Université Paris-Saclay, CNRS, LISN, France
--
Pierre Zweigenbaum
Senior Researcher
Université Paris-Saclay, CNRS, LISN