Third Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC 2026
https://bionlp.nlm.nih.gov/cl4health2026/
LREC 2026
Palma, Mallorca (Spain)
SCOPE
CL4Health fills the gap among the different biomedical language processing workshops by providing a general venue for a broad spectrum of patient-oriented language processing research. The third workshop on patient-oriented language processing follows the successful CL4Health workshops (co-located with LREC-COLING 2024 and NAACL 2025), which clearly demonstrated the need for a computational linguistics venue focused on language related to public health.
CL4Health is concerned with the resources, computational approaches, and behavioral and socio-economic aspects of the public's interactions with digital resources in search of health-related information that satisfies their information needs and guides their actions. The workshop invites papers on all areas of language processing focused on patients' health and health-related issues concerning the public. These issues include, but are not limited to: accessibility and trustworthiness of health information provided to the public; explainable and evidence-supported answers to consumer health questions; accurate summarization of patients' health records at their health literacy level; understanding patients' non-informational needs through their language; and accurate and accessible interpretations of biomedical research. The topics of interest for the workshop include, but are not limited to, the following:
* Health-related information needs and online behaviors of the public;
* Quality assurance and ethics considerations in language technologies and approaches applied to text and other modalities for public consumption;
* Summarization of data from electronic health records for patients;
* Detection of misinformation in consumer health-related resources and mitigation of potential harms;
* Consumer health question answering, including community question answering (CQA);
* Biomedical text simplification/adaptation;
* Dialogue systems to support patients' interactions with clinicians, healthcare systems, and online resources;
* Linguistic resources, data, and tools for language technologies focusing on consumer health;
* Infrastructures and pre-trained language models for consumer health;
Resubmissions from the LREC Main Conference
We welcome submissions of topically relevant papers that were rejected from the main LREC conference. The scores and reviews from the main conference will be taken into consideration. Please ensure that you paste the original reviews and scores into the indicated text box on the submission page.
IMPORTANT DATES
February 25, 2026 - Workshop paper due date
March 20, 2026 - Notification of acceptance
March 28, 2026 - Camera-ready papers due
April 10, 2026 - Pre-recorded video due (hard deadline)
May 12, 2026 - Workshop
KEYNOTE TALK
Kailai Yang, Department of Computer Science, University of Manchester
SHARED TASKS
Detecting Dosing Errors from Clinical Trials (CT-DEB'26).
The Clinical Trials Dosing Errors Benchmark 2026 (CT-DEB'26) is a machine-learning challenge dedicated to the automated detection of the risk of medication dosing errors within clinical trial protocols. Leveraging a curated dataset of over 29K trial records derived from the ClinicalTrials.gov registry, participants are challenged to predict the risk probabilities of protocols likely to manifest dosing errors. The dataset consists of various fields with numerical, categorical, and textual data types. Once the shared task is concluded and the leaderboard is published, participants are invited to submit a paper to the CL4Health workshop.
Website: https://www.codabench.org/competitions/11891/
Automatic Case Report Form (CRF) Filling from Clinical Notes.
Case Report Forms (CRFs) are standardized instruments in medical research used to collect patient data in a consistent and reliable way. They consist of a predefined list of items to be filled with patient information. Each item aims to collect a portion of information relevant for a specific clinical goal (e.g., allergies, chronicity of disease, test results). Automating CRF filling from clinical notes would accelerate clinical research, reduce manual burden on healthcare professionals, and create structured representations that can be directly leveraged to produce accessible, patient- and practitioner-friendly summaries. Even though the healthcare community has long used CRFs as a basic tool in day-to-day clinical practice, publicly available CRF datasets are scarce, limiting the development of robust NLP systems for this task. We present this shared task on CRF filling to advance research on systems that can be applied in real clinical settings.
Website: https://sites.google.com/fbk.eu/crf/
ArchEHR-QA 2026: Grounded Question Answering from Electronic Health Records.
The ArchEHR-QA (“Archer”) shared task focuses on answering patients’ health-related questions using their own electronic health records (EHRs). While prior work has explored general health question answering, far less attention has been paid to leveraging patient-specific records and to grounding model outputs in explicit clinical evidence, i.e., linking answers to specific supporting content in the clinical notes. The shared task dataset consists of patient-authored questions, corresponding clinician-interpreted counterparts, clinical note excerpts with sentence-level relevance annotations, and reference clinician-authored answers grounded in the notes. ArchEHR-QA targets the problem of producing answers to patient questions that are supported by and explicitly linked to the underlying clinical notes. This second iteration builds on the 2025 challenge (which was co-located with the ACL 2025 BioNLP Workshop) by expanding the dataset and introducing four complementary subtasks spanning question interpretation, clinical evidence identification, answer generation, and answer–evidence alignment. Teams may participate in any subset of subtasks and will be invited to submit system description papers detailing their approaches and results.
Website: https://archehr-qa.github.io/
FoodBench-QA 2026: Grounded Food & Nutrition Question Answering.
FoodBench-QA 2026 is a shared task challenging systems to answer food and nutrition questions using evidence from nutrient databases and food ontologies. The dataset includes realistic dietary queries, ingredient lists with quantities, and recipe descriptions, requiring models to perform nutrient estimation, FSA traffic-light prediction, and food entity recognition/linking across three food semantic models. Participants must generate accurate, evidence-based answers across these subtasks (or at least one of them). After the shared task concludes and the leaderboard is released, participants will be invited to submit their work to the Shared Tasks track of the CL4Health workshop at LREC 2026.
Website: https://www.codabench.org/competitions/12112/
SUBMISSIONS
Two types of submissions are invited:
- Full papers: should not exceed eight (8) pages of text, plus unlimited references. These are intended to be reports of original research.
- Short papers: may consist of up to four (4) pages of content, plus unlimited references. Appropriate short paper topics include preliminary results, application notes, descriptions of work in progress, etc.
Electronic Submission: Submissions must be electronic and in PDF format, using the Softconf START conference management system. Submissions need to be anonymous.
Papers should follow LREC 2026 formatting.
LREC provides style files for LaTeX and Microsoft Word at https://lrec2026.info/authors-kit/.
Submission site: https://softconf.com/lrec2026/CL4Health/
Dual submission policy: papers may NOT be submitted to the workshop if they are or will be concurrently submitted to another meeting or publication.
Share your LRs: When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones).
MEETING
The workshop will be hybrid. Virtual attendees must be registered for the workshop to access the online environment.
Accepted papers will be presented as posters or oral presentations based on the reviewers’ recommendations.
ORGANIZERS
- Deepak Gupta, US National Library of Medicine
- Paul Thompson, National Centre for Text Mining and University of Manchester, UK
- Dina Demner-Fushman, US National Library of Medicine
- Sophia Ananiadou, National Centre for Text Mining and University of Manchester, UK
--
Paul Thompson
Research Fellow
Department of Computer Science
National Centre for Text Mining
Manchester Institute of Biotechnology
University of Manchester
131 Princess Street
Manchester
M1 7DN
UK
http://personalpages.manchester.ac.uk/staff/Paul.Thompson/
We are arranging a workshop / training seminar aimed at advanced MA students and Ph.D. scholarship students.
May 4th to May 6th, 2026.
Bergen University.
There will be a reasonable conference fee covering snacks and lunches.
Calls for poster presentations will open soon.
Registration will open soon.
See
https://www.uib.no/en/lle/181518/workshop-tracking-eyes-and-mind
We welcome you to the next Natural Language Processing and Vision (NLPV) seminars at the University of Exeter.
Zoom: Thursday 19 Feb 2026, 15:30 to 17:00 GMT (expected to finish around 16:30)
Location: https://Universityofexeter.zoom.us/j/92623498359?pwd=zPDKXy6XqUqEFKkQG2xPlE… (Meeting ID: 926 2349 8359 Password: 542291)
Title: Across the Magic Circle: AI Agents as a Creative Medium for Interactive Storytelling
Abstract: As an artist and researcher, I use AI chatbots as my primary expressive medium. My work explores the intersection of LLM applications, interactive storytelling, and computational art. In this talk, I will present research projects that extend generative AI’s storytelling capabilities. These include the game 1001 Nights, in which players must survive an AI King by telling stories; an art installation based on an emergent language system inspired by Nüshu (a historical women’s script created and used exclusively by women in China); and Dramamancer, an AI game-making system built with Midjourney.
Speaker's bio: Dr Yuqian (Uchan) Sun, also known as CheeseTalk, is an AI research artist based in London. She's now working at Midjourney. Yuqian aims to create 'alive' narrative experiences that extend beyond video games and into our daily lives through conversational AI agents. She explores linguistic interactions through chatbots, games and interactive installations. Yuqian's interdisciplinary arts and research have been featured in galleries and tech conferences including SIGGRAPH, CVPR, NeurIPS, GDC, Gamescom, AMaze and New York Times Square. She won the Reddot Design Award and Lumen Prize in 2024.
Personal website https://fakecheese.me/
X @cheesetalk1997
Check past and upcoming seminars at the following url: https://sites.google.com/view/neurocognit-lang-viz-group/seminars
Join our *Google group* for future seminars and research information: https://groups.google.com/g/neurocognition-language-and-vision-processing-g…
FINAL CALL FOR PAPERS: The 1st Workshop on Computational Affective Science (Deadline Extended)
--------------------------------------------------------------------------------------------------
Third and Final Call for Papers: The 1st Workshop on Computational
Affective Science (CAS 2026), co-located with the Language Resources and
Evaluation Conference (LREC) 2026 in Palma de Mallorca, Spain, May
11-16. (Submission deadline extended to 20 Feb 2026).
Website: https://casworkshop.github.io/
Contact: cas-workshop@googlegroups.com
We invite submissions to the first Workshop on Computational Affective
Science (CAS 2026), co-located with LREC 2026, on research related to
the understanding of affect and emotions through language and
computation. CAS will accept archival long and short paper submissions,
featuring substantial, original, and unpublished research. We also
encourage submissions of extended abstracts from researchers in the
broader Affective Science community, with up to two pages of content
featuring the research background/hypotheses and a description of
methods/results. Extended abstracts are non-archival, leaving open the
option of publication and presentation at other venues.
------------
MOTIVATION
------------
Affect refers to the fundamental neural processes that generate and
regulate emotions, moods, and feeling states. Affect and emotions are
central to how we organize meaning, to our behavior, to our health and
well-being, and to our very survival. Despite this, and even though most
of us are intimately familiar with emotions in everyday life, there is
much we do not know about how emotions work and how they impact our
lives. Affective Science is a broad interdisciplinary field that
explores these and related questions about affect and emotions.
Since language is a powerful mechanism of emotion expression, there is a
growing use of language data and advanced natural language processing
(NLP) algorithms to shed light on fundamental questions about emotions.
The Workshop on Computational Affective Science (CAS) aims to be a
dedicated venue for work focused specifically on the link between NLP
and affective science.
Interdisciplinary Scope: The workshop takes an interdisciplinary
approach to affective science and aims at bringing together NLP
researchers, scientists, and theorists from many research areas,
including psychology, sociology, neuroscience, and philosophy. Although
work in sentiment analysis is decades old, this work often proceeds
separately and in different fields from research and theory in affective
science. Meanwhile, affective scientists in psychology, sociology,
neuroscience and philosophy increasingly seek to use linguistic tools to
shed light on the nature of emotions, moods, and feeling states. CAS is
therefore co-organized by an interdisciplinary group of researchers
(spanning NLP and Affective Science) to foster collaboration at this
exciting frontier of research.
------------
SUBMISSIONS
------------
We invite long and short archival paper submissions, as well as
non-archival extended abstracts on a broad range of topics at the
intersection of affective science and natural language processing,
including but not limited to:
1. The Nature of Affect and Computational Modeling of Emotions
Computational experiments that add to our understanding of affect and
emotions, including findings relevant to:
- theories and nature of emotion
- the biology or neuroscience of emotions
- appraisal models
- dimensional models (valence / arousal / dominance)
- models of constructed emotion
- cognitive-affective architectures
- emotion dynamics (emergence, intensification, decay, transitions)
- emotion granularity
- emotion regulation
- affective embodiment
- evolutionary and developmental affect
- emotion–cognition interactions
These areas are relevant not just to human affect, but may also apply to
animals and artificial agents.
2. Affective Data and Resources
Work on compiling and annotating affect-related information in text,
speech, facial and bodily expression, and physiological signals (ECG,
EEG, GSR, multimodal biosensing), with a focus on text data (monolingual
or multilingual) and multimodal data suitable for an NLP venue. Data
from underserved languages is especially encouraged.
3. Emotion Recognition, Prediction, and Inference
At the instance level:
- emotion classification (discrete emotions, dimensional ratings)
- emotion intensity estimation
- emotion cause detection
- context-aware affect inference (culture, situation, social setting)
- structured emotion analysis
At the aggregate level:
- creating emotion arcs
- determining broad trends in emotions over time or across locations
- tracking emotional responses toward entities of interest (e.g.,
climate change)
- document-level and cross-document emotion analysis
- labeling social networks
4. Applications
Including but not limited to:
- Affect and health, psychopathology, and mental disorders
- Affect and behavior/social science (e.g., interpersonal affect,
empathy, group-level affect, affect contagion, computational emotion
regulation)
- Affect and education
- Affect and literature/narratives/digital humanities
- Affect and commerce
5. Explainability and Interpretability in Computational Affective Models
Work aimed at improving the transparency and interpretability of
affective systems. This includes understanding how models represent and
infer emotions and identifying key cues driving predictions.
6. Ethics, Fairness, Theory Integration, Philosophical Implications
- Bias and generalizability of affective systems across demographics
- Privacy and ethics in affective data collection
- Examining whether automatic NLP systems rely on current and valid
theories of affect and emotion
- The implications of machines modeling or simulating affect
- Societal considerations surrounding affective artificial agents
------------
IMPORTANT DATES
------------
Submission deadline (**Extended**): 20 Feb 2026
Notification of acceptance: 16 March 2026
Camera Ready Paper due: 30 March 2026
Workshop date: 16 May 2026
------------
SUBMISSION DETAILS
------------
We invite submissions for archival long and short papers, as well as
non-archival extended abstracts.
Archival long and short papers should feature novel and unpublished work
relating to the topics detailed above.
We also invite submissions of extended abstracts from researchers in the
broader Affective Science community, with up to two pages of content
featuring the research background/hypotheses and a description of
methods/results. Extended abstracts are non-archival, leaving open the
option of publication and presentation at other venues.
Archival Track:
Long Paper: Consists of up to 8 pages of content, with additional pages
for references, limitations, ethical considerations, and appendices.
Short Paper: Consists of up to 4 pages of content, with additional pages
for references, limitations, ethical considerations, and appendices.
(When preparing camera ready papers, you will be allowed one extra page
to address comments by the reviewers.)
Non-Archival Track:
Extended Abstract: Up to 2 pages.
------------
SUBMISSION FORMAT
------------
All submissions must use the LREC 2026 template and follow the
guidelines found at: https://lrec2026.info/authors-kit/ (Note: extended
abstracts may be 1-2 pages in length).
Mandatory Ethics Section: We ask all authors to include a section on
Ethical Considerations in their submission, touching on the ethical
concerns and broader societal impacts of the work. This discussion
section will not count towards the page limit.
------------
SUBMISSION SITE
------------
All submissions must be made through the SoftConf portal:
https://softconf.com/lrec2026/CAS
------------
ADDITIONAL DETAILS
------------
Website: https://casworkshop.github.io/
Attendance: The workshop will follow the attendance policy of the main
conference (https://lrec2026.info/registration-policy/ ).
------------
ORGANIZERS
------------
- Christopher Bagdon, University of Bamberg, Germany
- Krishnapriya Vishnubhotla, National Research Council Canada
- Kristen A. Lindquist, The Ohio State University, USA
- Lyle Ungar, University of Pennsylvania, USA
- Roman Klinger, University of Bamberg, Germany
- Saif M. Mohammad, National Research Council Canada
***Contact us at cas-workshop@googlegroups.com with any questions.***
***APOLOGIES FOR CROSS-POSTING***
*****************************************************************
Dear Colleagues,
Due to multiple requests, we are pleased to announce a deadline extension for submissions to the Workshop on Dialects in NLP: A Resource Perspective (DialRes-LREC26), which will be held at LREC 2026 in Palma de Mallorca, Spain. The new submission deadline is now February 27, 2026.
Key Information
Workshop Title: Dialects in NLP: A Resource Perspective (DialRes-LREC26)
Event: Workshop at LREC 2026 (hybrid: in person and online)
Location: Palma de Mallorca, Spain (and online)
Workshop Date: May 16, 2026
Website: https://dialres.github.io/dialres/
Contact: dialres-lrec26@googlegroups.com
Overview
Dialectal and non-standard varieties pose persistent challenges for linguistic resource development. While in-depth study and large-scale resource creation for dominant or standard varieties have driven major advances in language technology, linguistic resources that adequately represent dialectal variation remain scarce. It therefore remains an open question whether standard-centric practices address dialectal variation or instead create new problems for dialects.
DialRes-LREC26 invites submissions on the creation, analysis, and evaluation of dialectal resources, including—but not limited to—work that critically examines how standard-centric methodologies impact dialects in the development of linguistic resources and models. We especially encourage contributions addressing the consequences of such practices for speech and morphosyntactic modelling, OCR of dialectal and historical texts, orthographic normalisation and homogenisation, annotation practices and lemmatisation strategies that abstract away or suppress dialectal forms, as well as analyses of how these choices affect dialects and their communities methodologically, economically, and socially.
The workshop focuses on problems, limitations, and trade-offs in developing dialectal resources from a linguistic perspective, while encouraging the creation and evaluation of resources in formats that enable reuse by the NLP community.
Workshop Topics
* Development and evaluation of dialectal oral and textual resources
* Orthographic normalisation and homogenisation, including their impact on dialectal variation
* Dialects vs. standard language varieties in annotation frameworks
* Cross-lingual and cross-dialectal transfer and model adaptation
* Resource scalability issues and techniques
* Use and limitations of large language models (LLMs) in dialectal resource development
* OCR for dialectal, non-standard, and historical texts: challenges, errors, and downstream effects
* Resources for, and applications supporting, dialect revitalisation and preservation
* Dialectal studies and teaching from a resource-oriented perspective
* Working on dialectal resources: academic, financial, legal, and societal issues
* Enabling and empowering dialect communities to develop their own resources
Submission Information
Instructions for Authors: Submissions are electronic, using the Softconf START conference management system via the link https://softconf.com/lrec2026/DialRes. They must be 4 to 8 pages long (excluding references and any ethics statements) and follow the LREC stylesheet, available from the Author's Kit page on the conference website (https://lrec2026.info/authors-kit/). All templates are also available from the second call for papers page (https://lrec2026.info/calls/second-call-for-papers/).
Invited Speaker
Prof. Barbara Plank, LMU Munich (https://bplank.github.io/)
Important Dates [updated]
Submission Deadline: February 27, 2026 [updated]
Notification of Acceptance: March 18, 2026 [updated]
Camera-ready Papers Due: March 28, 2026
Resubmissions from the LREC Main Conference
It will also be possible to submit papers that were rejected from the LREC 2026 main conference to DialRes 2026. Such submissions must be revised to fit the scope and format of the workshop and must comply with the same anonymization requirements.
Endorsements: The workshop is endorsed by the UniDive COST Action CA21167 and the Archimedes Athena R.C.
Organizing Committee
* Antonios Anastasopoulos, George Mason University / Archimedes–Athena RC
* Stella Markantonatou, ILSP / Archimedes–Athena RC
* Angela Ralli, University of Patras / Archimedes–Athena RC
* Marcos Zampieri, George Mason University
* Stavros Bompolas, Archimedes–Athena RC
* Vivian Stamou, Archimedes–Athena RC
We look forward to receiving your contributions!
Sincerely,
Stavros Bompolas
On behalf of the Organizing Committee of DialRes-LREC26
[Apologies for cross-postings]
Call for Papers
First International Workshop on Extraction from Triplet
Text-Table-Knowledge Graph and associated Challenge
https://ecladatta.github.io/triplet2026/
in conjunction with the 23rd European Semantic Web Conference (ESWC 2026)
https://2026.eswc-conferences.org/, Dubrovnik, Croatia
Important dates:
- **Submission deadline**: 3 March, 2026 (11:59pm, AoE)
- **Notifications**: 31 March, 2026
- **Camera-ready deadline**: 15 April, 2026 (11:59pm, AoE)
- **Workshop**: Sunday 10 May OR Monday 11 May 2026
Motivation:
Understanding information spread across texts and tables is essential
for tasks such as question answering and fact checking. Existing benchmarks
primarily deal with semantic table interpretation or reasoning over
tables for question answering, leaving a gap in evaluating models that
integrate tabular and textual information, perform joint information
extraction across modalities, or can automatically detect
inconsistencies between modalities.
This workshop aims to provide a forum for exchanging ideas between the
NLP community working on open information extraction and the vibrant
Semantic Web community working on the core challenge of matching tabular
data to Knowledge Graphs, on populating knowledge graphs using texts and
on reasoning across text, tabular data and knowledge graphs. The
workshop also targets researchers focusing on the intersection of
learning over structured data and information retrieval, for example, in
retrieval augmented generation (RAG) and question answering (QA)
systems. Hence, the goal of the workshop is to connect researchers and
trigger collaboration opportunities by bringing together views from the
Semantic Web, NLP, database, and IR disciplines.
Scope:
The topics of interest include but are not limited to:
- Semantic Table Interpretation
- Automated Tabular Data Understanding
- Using Large Language Models (LLMs) for Information Extraction
- Generative Models and LLMs for Structured Data
- Knowledge Graph Construction and Completion with Tabular Data and Texts
- Analysis of Tabular Data on the Web (Web Tables)
- Benchmarking and Evaluation Frameworks for Joint Text-Table Data Analysis
- Applications (e.g. data search, fact-checking, Question-Answering, KG
alignment)
Submission Guidelines:
We invite two types of submissions:
1. Full research papers (12-15 pages) including references and appendices
2. Challenge papers (6-8 pages) including references and appendices
All submissions should be formatted in the CEUR layout format,
https://www.overleaf.com/latex/templates/template-for-submissions-to-ceur-w…
This workshop is double-blind and non-archival. Submissions are managed
through EasyChair at
https://easychair.org/conferences/?conf=triplet2026. All accepted papers
will be presented as posters or as oral talks.
**TRIPLET Challenge:**
In recent years, the research community has shown increasing interest in
the joint understanding of text and tabular data, often for tasks such
as question answering or fact checking, where evidence can be found in
both texts and tables. Hence, various benchmarks have been developed
for jointly querying tabular data and textual documents in domains such
as finance, scientific publications, and open domain. While benchmarks
for triple extraction from text for Knowledge Graph construction and
semantic annotation of tabular data exist in the community, there
remains a gap in benchmarks and tasks that specifically address the
joint extraction of triples from text and tables by leveraging
complementary clues across these different modalities.
The TRIPLET 2026 challenge proposes three sub-tasks and benchmarks for
understanding the complementarity between tables, texts, and knowledge
graphs, with a particular focus on a joint knowledge extraction and
reconciliation process.
#Sub-Task 1: Assessing the Relatedness Between Tables and Textual Passages
The goal of this task is to assess the relatedness between tables and
textual passages (within documents and across documents). For this
purpose, we have constructed LATTE (Linking Across Table and Text for
Relatedness Evaluation), a human annotated dataset comprising table–text
pairs with relatedness labels. LATTE consists of 7,674 unique tables and
41,880 unique textual paragraphs originating from 3,826 distinct
Wikipedia pages. Each text paragraph is drawn from the same or
contextually linked pages as the corresponding table, rather than being
artificially generated. LATTE provides a challenging benchmark for
cross-modal reasoning by requiring classification of related and
unrelated table–text pairs. Unlike prior resources centered on
table-to-text generation or text retrieval, LATTE emphasizes
fine-grained semantic relatedness between structured and unstructured data.
The Figure below provides an example, using a web annotation tool we
developed, of how we identify the relatedness between the sentence
containing the entity AirPort Extreme 802.11n (highlighted in orange)
and the data table providing information about output power and
frequency for this entity. Participants are provided with tables and
textual passages that need to be ranked. The evaluation will use
metrics such as P@k, R@k, and F1@k.
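As a rough, generic illustration of these cutoff-based ranking metrics (the official evaluation script is defined by the organizers; the function and the IDs in the example are hypothetical):

```python
def ranking_metrics_at_k(ranked, relevant, k):
    """Generic precision, recall, and F1 at cutoff k for a ranked list.

    ranked   -- list of candidate IDs, best-scoring first
    relevant -- set of IDs judged related in the ground truth
    k        -- rank cutoff
    """
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    p_at_k = hits / k
    r_at_k = hits / len(relevant) if relevant else 0.0
    denom = p_at_k + r_at_k
    f1_at_k = (2 * p_at_k * r_at_k / denom) if denom else 0.0
    return p_at_k, r_at_k, f1_at_k

# Hypothetical example: 3 of the top-5 ranked passages are truly related,
# out of 4 relevant passages overall.
p, r, f1 = ranking_metrics_at_k(
    ["t1", "t2", "t3", "t4", "t5"], {"t1", "t3", "t5", "t9"}, k=5
)
# p = 0.6, r = 0.75
```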
Go to https://www.codabench.org/competitions/12776/ and enroll to
participate in this Task.
#Sub-Task 2: Joint Relation Extraction Between Texts and Tables
The goal of this task is to automatically extract knowledge jointly from
tables and related texts. For this purpose, we created ReTaT, a dataset
that can be used to train and evaluate systems for extracting such
relations. This dataset is composed of (table, surrounding text) pairs
extracted from Wikipedia pages and has been manually annotated with
relation triples. ReTaT is organized in three subsets with distinct
characteristics: domain (business, telecommunication and female
celebrities), size (from 50 to 255 pairs), language (English vs French),
type of relations (data vs object properties), close vs open list of
relation, size of the surrounding text (paragraph vs full page). We then
assessed its quality and suitability for the joint table-text relation
extraction task using Large Language Models (LLMs).
Given a Wikipedia page containing texts and tables and a list of
predicates defined in Wikidata, a participant system should extract
triples composed of mentions located partly in the text and partly in
the table and disambiguated with entities and predicates identified in
the Wikidata reference knowledge graph. For example, in the Figure
below, an annotation triple <Q13567390, P2109, 24.57> is associated with
mentions highlighted in orange (subject), blue (predicate) and green
(object) to annotate the document available at
https://en.wikipedia.org/wiki/AirPort_Extreme. Similar to the
Text2KGBench evaluation
(https://link.springer.com/chapter/10.1007/978-3-031-47243-5_14), and
because the set of triples is not exhaustive for a given sentence, we
follow a locally closed approach to avoid false negatives, considering
only the relations that are part of the ground truth. The evaluation
then uses metrics such as P, R, and F1.
Go to https://www.codabench.org/competitions/12936/ and enroll to
participate in this Task.
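The locally closed scoring described above can be sketched like this. It is a simplified illustration with made-up predicted triples; the exact filtering and matching rules of the official evaluation script may differ.

```python
def locally_closed_prf(predicted, gold):
    """Score predicted triples against gold under a locally closed assumption:
    predictions whose relation (predicate) is absent from the gold set are
    ignored, so systems are not penalised for triples outside the annotated
    relations. Triples are (subject, predicate, object) tuples, e.g. built
    from Wikidata identifiers and literal values.
    """
    gold = set(gold)
    gold_relations = {p for (_, p, _) in gold}
    # Keep only predictions over relations that are part of the ground truth.
    considered = {t for t in set(predicted) if t[1] in gold_relations}
    tp = len(considered & gold)
    precision = tp / len(considered) if considered else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Toy example based on the AirPort Extreme annotation mentioned above;
# the second and third predicted triples are invented for illustration.
gold = {("Q13567390", "P2109", "24.57"), ("Q13567390", "P176", "Q312")}
pred = [("Q13567390", "P2109", "24.57"),  # correct
        ("Q13567390", "P576", "2018"),    # relation not in gold: ignored
        ("Q13567390", "P176", "Q95")]     # wrong object: false positive
p, r, f1 = locally_closed_prf(pred, gold)
```

With one true positive out of two considered predictions and two gold triples, this yields P = R = F1 = 0.5.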
# Sub-Task 3: Detecting Inconsistencies Between Texts, Tables and
Knowledge Graphs
The goal of this task is to check the consistency of knowledge extracted
from tables and texts with existing triples in the Wikidata knowledge
graph. Different kinds of inconsistencies will be considered in this
task. Participants in this task will be able to report on their
findings in their system papers.
See the Figure at
https://ecladatta.github.io/images/triplet_annotation_tool.png
# Data & Evaluation:
For the first two sub-tasks, we have released a training dataset with
ground-truth annotations, enabling participant teams to develop
machine-learning-based systems, in particular for training,
hyperparameter optimization and internal validation.
A separate blind test dataset will remain private and be used for
ranking the submissions.
Participants should register on Codabench and then enroll for each
sub-task separately (Task 1:
https://www.codabench.org/competitions/12776/ and Task 2:
https://www.codabench.org/competitions/12936/). Each team is allowed a
limited number of daily submissions, and the highest achieved accuracy
will be reported as the team's final result. We encourage participants
to develop open-source solutions, to utilise and fine-tune pre-trained
language models and to experiment with LLMs of various sizes in zero-shot
or few-shot settings.
# Challenge Important Dates:
- Release of training set: 13 February 2026
- Deadline for registering to the challenge: 15 March 2026
- Release of test set: 24 March 2026
- Submission of results: 10 April 2026
- System Results & Notification of Acceptance: 17 April 2026
- Submission of System Papers: 28 April 2026
- Presentations @ TRIPLET Workshop: May 2026
Workshop Organizers
- Raphael Troncy (EURECOM, France)
- Yoan Chabot (Orange, France)
- Véronique Moriceau (IRIT, France)
- Nathalie Aussenac-Gilles (IRIT, France)
- Mouna Kamel (IRIT, France)
Contact:
For discussions, please use our Google Group,
https://groups.google.com/g/triplet-challenge
The workshop is supported by the ECLADATTA project funded by the French
National Funding Agency ANR under the grant ANR-22-CE23-0020.
--
Raphaël Troncy
EURECOM, Campus SophiaTech
Data Science Department
450 route des Chappes, 06410 Biot, France.
e-mail: raphael.troncy(a)eurecom.fr & raphael.troncy(a)gmail.com
Tel: +33 (0)4 - 9300 8242
Fax: +33 (0)4 - 9000 8200
Web: http://www.eurecom.fr/~troncy/
Dear colleagues,
We are looking for a postdoctoral researcher with experience in explainable AI and human-computer interaction who will pursue an independent research agenda, acquire third-party funding, and eventually establish their own research group. More details can be found in our application portal https://jobs.dfki.de/en/vacancy/en-senior-researcher*in-fur-erklarbare-spra…
*The application deadline is March 1st.*
The position is initially assigned to the Efficient and Explainable NLP group in the Multilinguality and Language Technology research department at DFKI in Saarbrücken, which is headed by Dr. Simon Ostermann (https://www.dfki.de/en/web/research/research-departments/multilinguality-an…). The group works on national and international research and development projects in the field of explainable and efficient language processing. The environment offers a very active publication culture, high methodological expertise, a focus on basic research, and close ties to current developments in NLP and AI research.
The focus of the position is on the research and development of explainable AI systems with a special emphasis on user perspectives and interaction. Relevant topics include explainable multimodal models, user-centred explanation approaches, and the generation of rationales for complex model decisions. The specific design of the research agenda leaves room for your own ideas and new research directions.
A central goal of the position is to establish or expand an independent scientific profile. This includes, in particular, the development of your own project ideas, the acquisition of third-party funding, and the preparation and establishment of your own research group in the aforementioned field within the MLT research department.
The position is ideally to be filled on 1 April 2026; later employment is possible. The position is initially limited to three years, but may be made permanent if the candidate successfully assists in acquiring third-party funding.
Your tasks
- You will work on independent research questions in the field of explainable AI and interaction as part of a BMFTR-funded project.
- You will develop and evaluate user-centred explanatory approaches for complex AI models.
- You will publish your research at leading international conferences.
- You will design and apply for independent third-party funded projects in collaboration with other members of the group.
- You will actively participate in developing your own research agenda and take on scientific leadership tasks in the future.
- You will supervise master's theses and doctoral dissertations and participate in teaching.
Your qualifications
- Completed doctorate in computer science, AI, computational linguistics, human-computer interaction or a related field.
- Very good knowledge of explainable AI, interpretability and language processing.
- High motivation for independent research and profile building.
- Initial experience in acquiring third-party funding.
- A convincing track record of publications in the field of NLP and explainable AI, ideally at leading international conferences such as ACL, EMNLP, NAACL, EACL, COLING, CHI, FAccT, NeurIPS, ICLR or ICML.
- Very good publication and communication skills.
Your benefits
- An excellent research environment with high international visibility.
- Great scientific freedom while being part of an established research group.
- Active support in acquiring third-party funding and setting up your own working group.
- An interdisciplinary network at the interface of AI research, language processing and human-computer interaction.
- A young, motivated and collegial team.
- A working environment in which we place great value on positive, respectful and constructive collaboration.
The German Research Center for Artificial Intelligence (DFKI) has operated as a non-profit, Public-Private-Partnership (PPP) since 1988. DFKI combines scientific excellence and commercially-oriented value creation with social awareness and is recognized as a major "Center of Excellence" by the international scientific community. In the field of artificial intelligence, DFKI has focused on the goal of human-centric AI for more than 35 years. Research is committed to essential, future-oriented areas of application and socially relevant topics.
DFKI encourages applications from people with disability; DFKI intends to increase the proportion of female employees in the field of science and encourages women to apply for this position.
12th Workshop on the Challenges in the Management of Large Corpora
The next meeting of CMLC (see also http://corpora.ids-mannheim.de/cmlc.html) will be held as part of the LREC-2026 conference [3] in Palma, Mallorca.
3rd Call for Papers and deadline extension
Important dates
* Deadline for paper submission: extended from the 16th to the 25th of February 2026 (23:59 UTC)
* Notification of acceptance: the 12th of March 2026 (Thursday)
* Deadline for the submission of camera-ready papers: the 30th of March 2026 (Monday)
* Meeting: the 11th of May, morning slot
Paper submission
* We invite anonymised extended abstracts for oral presentations on the
topics listed below, as PDF files created according to the LREC-2026
templates [1].
Length and content: 4 to 8 pages, excluding acknowledgements, references,
potential Ethics Statements and discussion of Limitations. Appendices or
supplementary material are not permitted during the initial submission
phase, as papers should be self-contained and reviewable on their own.
However, appendices and supplementary material will be allowed in the
final, camera-ready version of the paper.
* CMLC has always reserved a track for national corpus project reports, and to
this end, we invite poster proposals of 500-750 words. National project
reports need not be anonymised.
* Submissions are accepted solely through the LREC START system [2].
* A volume of proceedings will be published online by ELRA. Oral and poster
contributions will have equal status.
Workshop description
As in the previous CMLC meetings, we wish to explore common areas of interest across a range of issues in language resource management, corpus linguistics, natural language processing, natural language generation, and data science.
Large textual datasets require careful design, collection, cleaning, encoding, annotation, storage, retrieval, and curation to be of use for a wide range of research questions and to users across a
number of disciplines. A growing number of national and other very large corpora are being made available, many historical archives are being digitised, numerous publishing houses are opening their
textual assets for text mining, and many billions of words can be quickly sourced from the web and online social media.
A mixed blessing of the times is that much of this text, in mono- and multilingual arrangements, can now be created automatically by exploiting Large Language Models at various scales. That, on the
one hand, makes it possible to inflate the amounts of data where normally data would be scarce: in under-resourced languages or language varieties, in specific genres or for intricate and rarely
attested constructions. On the other hand, such procedures immediately raise concerns regarding the authenticity and quality of such data, casting doubt on the possibility of adequately (truthfully,
verifiably, reproducibly) addressing the kind of research questions that provoked the rapid but tainted increase of the available data volumes in the first place. Similar doubts may be directed at
mass creation of secondary and tertiary data ordinarily crucial for linguistic research: apart from potential legal constraints on the use of the initial amounts of human-created data, new questions
arise as to the legal status of the derived data, the ways to create e.g. provenance metadata of the derived resources, and the level of trust regarding mass-produced grammatical (and other)
annotation layers.
These new as well as more traditional questions lie at the base of the list of topics that management of large corpora (for any currently suitable definition of “large”) invokes or at least strongly
brushes against.
Topics of interest
This year's event adds new items to the standard range of CMLC themes and addresses some of the LREC-2026 focus topics:
* Interoperability and accessibility
  - How to make corpora as accessible as possible
  - Interoperable APIs for query and analysis software
  - Provision of multiple levels of access for different tasks
* Machine/Deep Learning
  - Data preparation for machine learning input
  - Creation, curation, maintenance and dissemination of language models based on machine learning (e.g. word embeddings and entire deep learning networks)
  - Legal issues concerning language model distribution
* Linguistic content challenges
  - Dealing with the variety of language: multilinguality, minority and/or underrepresented languages, historical texts, noisy OCR texts, user-generated content, etc.
  - Diversity and inclusion in language resources
  - Integration of human computation (crowdsourcing) and automatic annotation
  - Quality management of annotations
  - Ensuring linguistic integrity of data through deduplication, correction of typos and errors, removal of incomplete or malformed sentences, filtering of harmful, offensive and toxic content, etc.
  - Integrating different linguistic data types (text, audio, video, facsimiles, experimental data, neuroimaging data, …)
* Technical challenges
  - Storage and retrieval solutions for large text corpora: primary data (potentially including facsimiles, etc.), metadata, and annotation data
  - Corpus versioning and release management
  - Scalable and efficient NLP tooling for annotating and analysing large datasets: distributed and GPGPU computing; using big data analysis frameworks for language processing
  - Dealing with streaming data (e.g. Social Media) and rapidly changing corpora
  - Environmental impact of big language data computing
  - Engineering and management of research software
* Exploitation challenges
  - Legal and privacy issues
  - Query languages, data models, and standardisation
  - Licensing models of open and closed data, coping with intellectual property restrictions
  - Innovative approaches for aggregation and visualisation of text analytics
  - Repurposing or extending application areas of existing corpora and tools
National corpus initiatives
In the tradition of CMLC, we invite reports on national corpus initiatives; submitters of these reports should be prepared to present a poster. Given that it has been a while since the last round, we would be happy to have a little "What's the news?" session, and we cordially invite both our veteran presenters and colleagues who have not yet introduced their national corpus projects.
Our poster sessions are usually scheduled to overlap with the coffee break, to ensure an informal atmosphere and to make the most of the time slot available to us. A flash presentation session is planned for just before the poster session: ca. 3 minutes for the highlights.
LRE 2026 Map and the "Share your LRs!" initiative
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that
have been used for the work described in the paper or are a new result of your research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable
their reuse and replicability of experiments (including evaluation ones).
Programme Committee
* Laurence Anthony (Waseda University, Japan)
* Vladimír Benko (Slovak Academy of Sciences)
* Felix Bildhauer (IDS Mannheim)
* Mark Davies (English-Corpora.org)
* Nils Diewald (IDS Mannheim)
* Kaja Dobrovoljc (University of Ljubljana / Jožef Stefan Institute)
* Jarle Ebeling (University of Oslo)
* Tomaž Erjavec (Jožef Stefan Institute, Ljubljana)
* Andrew Hardie (Lancaster University, UK)
* Serge Heiden (ENS de Lyon)
* Ulrich Heid (University of Hildesheim)
* Nancy Ide (Vassar College / Brandeis University)
* Olha Kanishcheva (Heidelberg University)
* Gražina Korvel (Vilnius University)
* Natalia Kocyba (Samsung Poland)
* Michal Křen (Charles University, Prague)
* Anna Latusek (ICS PAS, Warsaw)
* Paul Rayson (Lancaster University)
* Laurent Romary (INRIA)
* Thomas Schmidt (University of Duisburg-Essen)
* Serge Sharoff (University of Leeds)
* Maria Shvedova (Kharkiv Polytechnic Institute / University of Jena)
* Irena Spasić (Cardiff University)
* Martin Wynne (University of Oxford)
Organising Committee
* Piotr Bański (IDS Mannheim)
* Dawn Knight (Cardiff University)
* Marc Kupietz (IDS Mannheim)
* Andreas Witt (IDS Mannheim)
* Alina Wróblewska (ICS PAS, Warsaw)
[1] LREC-2026 templates https://lrec2026.info/authors-kit/
[2] LREC START system https://softconf.com/lrec2026/CMLC2026/
[3] LREC-2026 conference https://lrec2026.info/
**Final Call for Papers (with extended deadline)**
Gaze4NLP - The Second Workshop on Gaze Data and Natural Language Processing
12 May 2026, Palma de Mallorca, Spain (co-located with LREC 2026)
https://gaze4nlp.github.io/Gaze4NLP2026/
The Second Workshop on Gaze Data and Natural Language Processing
(Gaze4NLP) invites papers of a theoretical or experimental nature
describing research methodologies that employ interdisciplinary
perspectives, including computer science and engineering perspectives
and the cognitive sciences, and identifying challenges to resolve at
the intersection of the two domains: eye tracking and NLP. Gaze4NLP
aims to bring together researchers working on eyes on text and on NLP,
and to establish bridges between them for identifying future avenues
of research.
Important Dates
Workshop paper submission deadline: 23 February 2026
Workshop paper acceptance notification: 16 March 2026
Workshop paper camera-ready versions: 30 March 2026
Workshop date: 12 May 2026
All deadlines are 11:59PM UTC-12:00 (anywhere on Earth)
Topics for the workshop will include, but are not limited to:
- Investigating the pillars for bridging the gap between research on
eyes on text and NLP. Studying how to expand research methodologies
by employing interdisciplinary perspectives, including computer
science and engineering perspectives and the cognitive sciences, and
identifying challenges and issues to resolve.
- Exploring new areas so that both fields benefit from each other
better than in the past, identifying novel domains of exploration for
further research.
- Discussing how to develop cognitively inspired models that align
human reading data with LLMs.
Submissions
We solicit regular workshop papers, which will be included in the
proceedings as archival publications. The length of the papers should
be between 4 and 8 pages (excluding references). The submissions
should not include any appendices. Accepted papers will be presented
in the form of either oral or poster presentations.
Please note that camera-ready papers are allowed an additional page of
content to address reviewer comments, and unlimited pages for
appendices. The workshop proceedings will be part of the ACL
Anthology. Authors of accepted papers will also be given the
opportunity to publish an extended version as part of an edited book.
Submissions will be handled via the START Conference Manager.
- Submission link: https://softconf.com/lrec2026/Gaze4NLP/
All submissions should follow the LREC style guidelines. We strongly
recommend the use of the LaTeX style files, OpenDocument, or Microsoft
Word templates created for LREC: <https://lrec2026.info/authors-kit/>.
All papers must be anonymous, i.e., not reveal author(s) on the title
page or through self-references. So, e.g., “We previously showed
(Smith, 2020)”, should be avoided. Instead, use citations such as
“Smith (2020) previously showed”.
LRE-Map and Sharing Language Resources
When submitting a paper from the START page, authors will be asked to
provide essential information about resources (in a broad sense, i.e.
also technologies, standards, evaluation kits, etc.) that have been
used for the work described in the paper or are a new result of your
research. Moreover, ELRA encourages all LREC authors to share the
described LRs (data, tools, services, etc.) to enable their reuse and
replicability of experiments (including evaluation ones).
Organization Committee:
Cengiz Acarturk, Jagiellonian University, Poland
Jamal Nasir, University of Galway, Ireland
Burcu Can, University of Stirling, Scotland, UK
Cagri Coltekin, University of Tubingen, Germany