We are excited to announce the Call for Participation for NTCIR-19 Tip-of-the-Tongue (ToT) Shared Task. ToT known-item retrieval is defined as “an item identification task in which the searcher has previously experienced an item but cannot recall a reliable identifier”—i.e., “It’s on the tip of my tongue…”. After 3 successful years as a TREC Track, the ToT shared task is expanding to NTCIR for 2026. The NTCIR-19 ToT Shared Task will focus on open-domain ToT information needs in multiple languages (English, Chinese, Japanese, and Korean). You can participate in the shared task in any subset of these languages, and you are also welcome to present your work remotely at the NTCIR conference in Tokyo in December 2026.
Please visit the following websites for further information.
Task guidelines: https://ntcir-tot.github.io/guidelines
Registration: https://research.nii.ac.jp/ntcir/ntcir-19/howto.html (Deadline: June 1)
Important dates
March 27: Release corpus and training queries
May: Release test queries
June 1st: Deadline for registration
July (tentative): Deadline for submitting runs
Please consider participating and help us spread the word!
Best regards,
Fernando Diaz
On behalf of the NTCIR-19 ToT Shared Task organizers
10th Workshop on Online Abuse and Harms (WOAH) @EMNLP: 2nd CFP
*** Second Call for Papers ***
We invite paper submissions to the 10th Workshop on Online Abuse and Harms (WOAH), which will take place on 24-29 October at EMNLP 2026.
Website: https://www.workshopononlineabuse.com/cfp.html
Important Dates
* Registration deadline for mentorship programme: April 10, 2026
* Notification of mentor/mentee match: April 25, 2026
* Submission due: June 26, 2026
* ARR reviewed submission due: August 3, 2026
* Notification of acceptance: August 15, 2026
* Camera-ready papers due: September 10, 2026
* Workshop: 24-29 October 2026
Overview
Digital technologies have brought significant benefits to society, transforming how people connect, communicate, and interact. However, these same technologies have also enabled the widespread dissemination and amplification of abusive and harmful content, such as hate speech, harassment, and misinformation. Given the sheer volume of content shared online, addressing abuse and harm at scale requires the use of computational tools. Yet, detecting and moderating online abuse remains a complex task, fraught with technical, social, legal, and ethical challenges.
The 10th Workshop on Online Abuse and Harms (WOAH) invites paper submissions from a diverse range of fields, including but not limited to natural language processing, machine learning, computational social science, law, political science, psychology, sociology, and cultural studies. We explicitly encourage interdisciplinary research, technical and non-technical contributions, and submissions that focus on under-resourced languages. Non-archival papers and civil society reports are also welcome.
Topics covered by WOAH include, but are not limited to:
* New models or methods for detecting abusive and harmful online content, including misinformation;
* Biases and limitations in existing detection models or datasets for abusive and harmful content, especially those in commercial use;
* Development of new datasets and taxonomies for online abuse and harms;
* Novel evaluation metrics and procedures for detecting harmful content;
* Analyses of the dynamics of online abuse, its propagation, and its impact on different communities;
* Social, legal, and ethical considerations in detecting, monitoring, and moderating online abuse.
Special Theme: “Ten Years of WOAH: Reflecting on Progress and New Frontiers”
In its 10th edition, WOAH highlights the theme “Ten Years of WOAH: Reflecting on Progress and New Frontiers”. Over the past decade, WOAH has become a central interdisciplinary venue for online harms research. As harms and enabling technologies have evolved, the field has moved beyond an early focus on textual hate speech and harassment to address more complex phenomena. Advances in AI and online ecosystems have expanded the scale and diversity of harms. Transformer models, multimodal platforms, and recommendation systems have contributed to the escalation of issues like misinformation, radicalisation, child sexual exploitation, identity-based abuse, algorithmic bias, privacy violations, and AI-mediated harms. Methods for tackling these harms have evolved from monolingual lexicon-based approaches to deep learning, multilinguality, multimodality, interpretability, and interdisciplinarity.
Despite this progress, fundamental challenges remain. There is limited consensus on what constitutes “harm”, how context and thresholds should be defined, or how harms vary across cultures and modalities. These ambiguities affect datasets and models, constrain comparability, and often marginalise affected communities. The past decade also calls for critical self-reflection. Research has frequently prioritised detection, high-resource languages, and narrowly defined phenomena over intervention, global perspectives, and systemic or structural harms, with insufficient attention to user agency, platform incentives, lived experience, and participatory approaches. Finally, ten years of work have underscored that interdisciplinarity is essential for addressing the sociotechnical nature of the phenomenon. Addressing future online harms will require deeper integration across NLP, ML, social sciences, law, policy, and HCI. WOAH 10 seeks to consolidate lessons from the past decade, identify enduring gaps, and connect research, practice, and policy to guide the next generation of work on online harms.
Submission
Submission is electronic, using the Softconf START conference management system.
Submission link: TBA
The workshop will accept three types of papers.
1) Academic Papers (long and short): Long papers of up to 8 pages, excluding references, and short papers of up to 4 pages, excluding references. Unlimited pages for references and appendices. Accepted papers will be given an additional page of content to address reviewer comments. Previously published papers cannot be accepted.
2) Non-Archival Submissions: Up to 2 pages, excluding references, to summarise and showcase in-progress work and work published elsewhere.
3) Civil Society Reports: Non-archival submissions, with a minimum of 2 pages and no upper limit. Can include work published elsewhere.
All submissions must use the official ACL style files (https://github.com/acl-org/acl-style-files). Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review. All submissions should adhere to the workshop policies (https://www.workshopononlineabuse.com/policies.html).
WOAH Community
We are excited to share the WOAH community Slack channel — a workspace for researchers interested in or working on understanding and addressing online abuse and harms!
Join us here: https://join.slack.com/t/hatespeechdet-47d7560/shared_invite/zt-2a8d96j4z-g…
Contact Info
Please send any questions about the workshop to organizers@workshopononlineabuse.com
Organisers
Agostina Calabrese, Cohere
Thomas Davidson, Rutgers University-New Brunswick
Christine de Kock, University of Melbourne
Urja Khurana, Delft University of Technology
Marta Marchiori Manerba, University of Turin
Paloma Piot, Universidade da Coruña
Zeerak Talat, University of Edinburgh
*** Second Call for Participation for HAHA at IberLEF 2026
<https://sites.google.com/view/iberlef-2026> ***
Humor Analysis based on Human Annotation and Automatic Humor Generation
https://www.fing.edu.uy/inco/grupos/pln/haha/
Codabench page: https://www.codabench.org/competitions/14700/
NEWS:
The trial and development data have been released. You can now submit your
systems for the development phase!
Can computers be funny? Can humans identify computer-generated humor?
While humor has been studied historically from psychological, cognitive,
and linguistic perspectives, its computational study is an active area of
research in Machine Learning and Computational Linguistics that has gained
traction in recent years. There has been significant development mainly in
the field of automatic humor detection and classification, but a
characterization of humor that enables its automatic recognition and
generation is far from being solved.
This task aims to gain better insight into what is humorous and what causes
laughter, and to take some steps forward by assessing the capabilities of
current LLMs to generate actual humorous content in Spanish and attempting
to see whether it’s possible to automatically distinguish between
computer-generated humor and humor written by humans. The target audience
is NLP researchers interested in advancing the understanding of highly
subjective and creative tasks, though anyone is welcome to participate.
Task description
This year, the HAHA evaluation campaign proposes three different subtasks
related to automatic humor detection and generation, with the aim of
deepening our understanding of computational humor.
Subtask 1 - Humor Detection: determining if a news headline is satirical or
real. The main performance metric for this subtask will be the F1 score of
the 'humorous' class. This subtask is similar to the first subtask proposed
in previous editions of the HAHA shared task, but this time it's applied to
a particular domain where humorous and non-humorous content might sometimes
be difficult to tell apart.
Subtask 2 - LLM-generated humor detection: determining if a joke inspired
by a news headline was generated by an LLM or written by a human. The main
performance metric for this subtask will be the F1 score of the 'automatic'
class.
Subtask 3 - Humor Generation: generating jokes from a news headline using
computational methods. This subtask will be evaluated through human
preference judgments, employing LLM arena-style battles between pairs of
generated jokes, and ranking the systems using an Elo-based leaderboard.
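For readers unfamiliar with Elo-style rankings, the following is a minimal sketch of how pairwise preference judgments could be turned into a leaderboard. It is not the organizers' implementation: the call only states that the ranking is Elo-based, so the K-factor of 32, the initial rating of 1000, the system names, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_leaderboard(battles, k=32.0, initial=1000.0):
    """Rank systems from pairwise battles.

    battles: iterable of (system_a, system_b, outcome), where outcome is
    1.0 if the judge preferred A's joke, 0.0 if B's, and 0.5 for a tie.
    k and initial are assumed values, not taken from the task description.
    """
    ratings = defaultdict(lambda: initial)
    for sys_a, sys_b, outcome in battles:
        r_a, r_b = ratings[sys_a], ratings[sys_b]
        exp_a = expected_score(r_a, r_b)
        # Each rating moves by K times (actual result - expected result).
        ratings[sys_a] = r_a + k * (outcome - exp_a)
        ratings[sys_b] = r_b + k * ((1.0 - outcome) - (1.0 - exp_a))
    # Highest rating first.
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical judgments between three hypothetical systems.
    judgments = [
        ("sysA", "sysB", 1.0),
        ("sysB", "sysC", 1.0),
        ("sysA", "sysC", 0.5),
    ]
    for name, rating in elo_leaderboard(judgments):
        print(f"{name}: {rating:.1f}")
```

Sequential Elo updates depend on the order in which battles are processed, so a real leaderboard may shuffle or average over orderings; that detail is left out of this sketch.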
How to Participate
The CodaBench page for the competition is available:
https://www.codabench.org/competitions/14700/
Important Dates
March 18th, 2026: team registration page opens.
April 8th, 2026: development sets released and open for dev submissions.
May 27th, 2026: test sets released and open for test submissions.
June 3rd, 2026: end of test submissions, publication of results of subtasks
1 and 2.
June 10th, 2026: publication of results of subtask 3.
June 12th, 2026: paper submission.
June 23rd, 2026: notification of acceptance.
July 1st, 2026: camera-ready paper submission.
September 2026: IberLEF 2026 Workshop.
<Apologies for cross-postings>
------------------------------------------------
Release of test data and registration still open !!
PROFE 2026: Language Proficiency Evaluation
IberLEF 2026 Shared Task
Website URL: https://sites.google.com/view/profe2026
Codabench site: https://www.codabench.org/competitions/15902/
PROFE 2026 reuses the exams for Spanish proficiency evaluation that Instituto Cervantes has developed over many years to evaluate human students. Automatic systems will therefore be evaluated under the same conditions as humans were. Systems will receive a set of exercises with their corresponding instructions, without specific training material. We therefore expect participants to rely on transfer learning approaches or on generative large language models.
The previous edition proposed exams based only on text. In this new edition, we will include exams with images, which sometimes require interpretation to answer the exercise correctly. We propose evaluating systems on their ability to perform multimodal reasoning, moving beyond text-only comprehension.
We will provide a limited set of new image-based exercises while retaining the dataset from the previous edition. This setup encourages participants to develop strategies for handling the scarcity of specific training data.
Subtasks
PROFE 2026 has three subtasks, one per exercise type. Teams can participate in any combination of them. Each subtask contains several exercises of the same type. The subtasks are:
1. Multiple choice subtask: each exercise includes a text and a set of multiple-choice questions about the text where only one answer is correct. Given a multiple-choice question, systems must select the correct answer among the candidates.
2. Matching subtask: each exercise contains two sets of texts. Systems must find the text in the second set that best matches the first set. There is only one possible matching per text, but the first set can contain extra unnecessary texts.
3. Filling the gap subtask: each exercise contains a text with several gaps corresponding to textual fragments that have been removed and presented in shuffled order as options. Systems must determine the correct position for each fragment. There is only one correct text per gap, but there could be more candidates than gaps.
The different exercise types open up research questions on how best to approach each of them, for example by adapting prompts when using generative models.
As the main novelty in this edition, some exercises will contain images. While some of these images will be the candidate answers (rather than text excerpts), others might provide visual information needed to answer the exercise correctly. Conversely, some images will not provide essential information. Consequently, systems participating in this edition must adopt a multimodal approach, capable of discerning when to integrate visual cues and when to disregard them. This necessity to filter visual relevance introduces significant new challenges compared to the previous edition.
Dataset
We will use the IC-UNED-RC-ES dataset created from real examinations at Instituto Cervantes. These exams were created by human experts to assess language proficiency in Spanish. We have already collected the exams and converted them to a digital format, which is ready to be used in the task. The dataset contains exams at different levels (from A1 to C2). The description of the full dataset was published in the following paper:
* Anselmo Peñas, Álvaro Rodrigo, Javier Fruns-Jiménez, Inés Soria-Pastor, Sergio Moreno-Álvarez, Alberto Pérez García-Plaza, and Julio Reyes-Montesinos. A Spanish Language Proficiency Dataset for AI Evaluation (https://www.mdpi.com/2078-2489/17/2/159). Information 17, no. 2: 159. DOI: 10.3390/info17020159 (https://doi.org/10.3390/info17020159). 2026.
The complete dataset contains 282 exams with 855 exercises. The total number of evaluation points is 6146 (among 16570 options), distributed by exercise type as follows:
multiple-choice: 3544 responses
matching: 2309 responses
fill-the-gap: 293 responses
In PROFE 2026, we plan to use around 50% of the exams; the other 50% was already used for the PROFE 2025 edition.
We do not intend to distribute the gold standard, in order to prevent overfitting in post-campaign experiments and data contamination in LLMs.
Evaluation measures and baseline
We will use traditional accuracy (proportion of correct answers) as the main evaluation measure. Systems will receive evaluation scores from two different perspectives:
* At the question level, where correct answers are counted individually without grouping them.
* At the exam level, where scores for each exam are considered. Each exam contains several exercises of different types. An exam is considered passed if its accuracy score (computed as the proportion of correct answers) is above 0.5. The proportion of passed exams is then given as a global score. This perspective will only apply to teams participating in all three subtasks.
In more detail, the exact evaluation per subtask is as follows:
* Multiple choice subtask: we will measure accuracy as the proportion of questions correctly answered.
* Matching subtask: we will measure accuracy as the proportion of correct texts matched.
* Filling the gap subtask: we will measure accuracy as the proportion of correctly filled gaps.
We will use accuracy as the evaluation measure because there is only one correct option among candidates and because it is the measure applied to humans doing the same exams. Thus, we can compare the performance of automatic systems and humans under the same conditions.
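As an illustration of the two perspectives described above, here is a minimal sketch of question-level accuracy and the exam-level pass rate. It is not the official scorer: the input layout (a list of exam identifier and correctness pairs), the exam identifiers, and the function names are hypothetical; only the definitions (proportion of correct answers, and an exam passing when its accuracy is above 0.5) come from the task description.

```python
from collections import defaultdict


def question_accuracy(results):
    """Proportion of correct answers, counted individually.

    results: list of (exam_id, is_correct) pairs, one per evaluation point.
    This layout is an assumption made for the example.
    """
    if not results:
        return 0.0
    return sum(1 for _, ok in results if ok) / len(results)


def exam_pass_rate(results, threshold=0.5):
    """Proportion of exams whose accuracy is strictly above the threshold."""
    per_exam = defaultdict(list)
    for exam_id, ok in results:
        per_exam[exam_id].append(ok)
    if not per_exam:
        return 0.0
    passed = sum(
        1
        for answers in per_exam.values()
        if sum(answers) / len(answers) > threshold
    )
    return passed / len(per_exam)


if __name__ == "__main__":
    # Hypothetical system output on two hypothetical exams.
    demo = [
        ("exam_1", True), ("exam_1", True), ("exam_1", False),
        ("exam_2", False), ("exam_2", True),
    ]
    print("question-level accuracy:", question_accuracy(demo))
    print("exam-level pass rate:", exam_pass_rate(demo))
```

Note that with a strict "above 0.5" reading, an exam answered exactly half correctly does not count as passed, which is why the two scores can diverge on the same output.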
A preliminary baseline using ChatGPT obtains the following results for each exercise type (bearing in mind that different prompts can produce slightly different results):
* Multiple choice accuracy: 0.64
* Filling the gap accuracy: 0.43
* Matching accuracy: 0.51
Schedule
April 10, 2026: Development data released
April 27, 2026: Test set release
May 11, 2026: Deadline for submitting runs
May 18, 2026: Release of evaluation results
June 3, 2026: Paper submission deadline
Organizers
Alvaro Rodrigo<https://www.uned.es/universidad/docentes/informatica/alvaro-rodrigo-yuste.h…>, UNED NLP & IR Group (Universidad Nacional de Educación a Distancia)
Anselmo Peñas<https://www.uned.es/universidad/docentes/informatica/anselmo-penas-padilla.…>, UNED NLP & IR Group (Universidad Nacional de Educación a Distancia)
Alberto Pérez<https://www.uned.es/universidad/docentes/informatica/alberto-perez-garcia-p…>, UNED NLP & IR Group (Universidad Nacional de Educación a Distancia)
Sergio Moreno<https://www.uned.es/universidad/docentes/en/informatica/sergio-moreno-alvar…>, UNED NLP & IR Group (Universidad Nacional de Educación a Distancia)
Javier Fruns, Instituto Cervantes
Inés Soria, Instituto Cervantes
Rodrigo Agerri<https://ragerri.github.io/>, HiTz (Universidad del País Vasco, UPV/EHU)
(apologies for cross-posting; please redistribute)
KONVENS 2026 FINAL Call for Conference Papers & Deadline Extension!
https://konvens2026.uni-hamburg.de/
We are delighted to share the second call for papers for the Konferenz zur Verarbeitung natürlicher Sprache (KONVENS) 2026, organized under the auspices of the GSCL, the DGfS-CL, the ÖGAI, and SwissNLP. This year’s KONVENS will take place in Hamburg, September 14 – 17, under the special theme “Context Matters: NLP Beyond Text”. The conference will feature a diverse program, including talks by our two keynote speakers:
* Dr. Valentin Hoffmann, Allen Institute for AI
* Prof. Dr. Barbara Plank, LMU Munich.
We invite the submission of long and short papers featuring substantial, original, and unpublished research on Natural Language Processing and Computational Linguistics, to be archived in the ACL Anthology, as well as abstract submissions that describe research in progress or published elsewhere. Beyond standard research contributions, submissions are welcome that present negative results, survey an area, introduce new resources, articulate a position, report novel linguistic insights obtained using existing computational methods, or reproduce (successfully or not) previous findings.
We welcome the following types of paper submissions:
* Long papers (up to 8 pages plus references), describing original research with substantial new results.
* Short papers and demos (up to 4 pages plus references), including small and focused contributions, work in progress, as well as descriptions of projects, systems and resources.
* Abstracts (1 page, non-archival), which will be presented at the poster session and printed in the proceedings. In this category we especially invite submissions on ongoing projects, student projects, past or ongoing bachelor and master theses, ongoing or recently completed PhD theses, and opinion pieces, to foster interaction and discussion in our community.
Papers can be submitted either to the main conference track or to the special track “Context Matters”.
Context Matters Track
The widespread use of large language models (LLMs) and other types of language technology in research and real-world applications has fundamentally reshaped how natural language processing (NLP) systems interact with people and their environments. As NLP systems increasingly operate in socially embedded, high-impact settings like search, conversational agents and recommendation systems in business, education, medicine, law, and beyond, it becomes crucial to move beyond text in isolation and to account for the many forms of context that shape language use and interpretation. These include user-related factors (e.g., identity aspects like socio-demographic characteristics and the resulting perspectival differences), cultural and societal context, interaction history, application constraints, and signals from other modalities.
The “Context Matters” track focuses on how different forms of context influence NLP systems, their design, their behavior, and their use. We invite work that studies NLP not as decontextualized text processing, but as situated technology embedded in human, social, disciplinary, and multimodal environments. Here, disciplines and application domains are important not only as areas of use, but as sources of structured contextual knowledge, perspectives, and methodological traditions — particularly from the social sciences and humanities, but also law, education, psychology, economics, and the natural sciences.
In particular, the special theme includes:
* Research that models user- and group-related context, such as identity aspects, socio-demographic variables, cultural background, or perspectival differences, and examines how these factors affect language use, system behavior, or system impact
* Work that draws on or operationalizes concepts from other disciplines like the social sciences and related fields (e.g., social theory, cultural analysis, behavioral perspectives) to better understand linguistic phenomena, system outputs, or evaluation settings
* Research analyzing social, societal, and institutional context, including norms, power structures, and real-world deployment environments, especially with respect to ethics, bias, and societal consequences
* Studies of application context, where domain-specific constraints (e.g., in education, law, public administration, or the natural sciences) shape both language use and system requirements
* Approaches that move beyond text-only processing and integrate multiple modalities (e.g., vision, audio, video, sensor data), with attention to the distinct contextual signals these modalities introduce
* Work incorporating interactional context, such as dialogue history, user intent, and evolving human–AI interaction dynamics
While the modelling component should include language, we especially encourage contributions that treat language as part of a broader contextual ecosystem, aiming toward more grounded, adaptive, and socially aware NLP systems.
Papers must be in English and formatted in accordance with the ACL style sheet https://github.com/acl-org/acl-style-files and submitted via the submission link: https://openreview.net/group?id=GSCL.org/KONVENS/2026/Conference
Please consider the OpenReview policy for new accounts:
* New profiles created without an institutional email will go through a moderation process that can take up to two weeks.
* New profiles created with an institutional email will be activated automatically.
KONVENS also adopts the ACL policies for submission, review, and citation, the ACL privacy policy, and the ACL code of ethics.
Further information can be found on the conference website:
https://konvens2026.uni-hamburg.de/
Submissions need to be anonymized to ensure double-blind review. However, we allow for pre-prints to be posted any time before or during the review period. We strongly encourage authors to use LaTeX in preparing their document.
Important dates:
May 12, 2026 (NEW): Paper Submission Deadline
July 12, 2026: Notification of Acceptance
August 1, 2026: Camera-Ready Deadline
September 14 to 17, 2026: KONVENS in Hamburg
See you in Hamburg!
Your conference chairs,
Heike Zinsmeister, Chris Biemann, and Anne Lauscher
Dear community!
We are delighted to invite submissions to the 5th Workshop on NLP
for Positive Impact, co-located with EMNLP 2026!
Workshop website: https://sites.google.com/view/nlp4positiveimpact
Call for papers: https://sites.google.com/view/nlp4positiveimpact/call-for-papers-2026
Submission methods: via OpenReview, either as direct submissions or as ARR May Cycle commitments.
We also accept non-archival submissions.
Important dates:
ARR May Cycle Submission Due: May 25th, 2026
Direct Submissions Due: June 26th, 2026 via https://openreview.net/group?id=EMNLP/2026/Workshop/NLP4PI
ARR Reviewed Submissions Commitment Due: July 26th, 2026 (tentative)
Notification of Acceptance (both channels): August 15th, 2026
Camera-Ready Papers Due: September 10th, 2026
Workshop Date: October 24th-29th 2026 (co-located with EMNLP 2026)
All deadlines are 11:59 PM (Anywhere on Earth)
Workshop Summary
The increasing adoption of language-oriented AI systems offers unprecedented opportunities for positive societal impact. NLP technologies have matured to the point where they can meaningfully contribute to addressing global challenges like poverty, hunger, healthcare, education, inequality, COVID-19, and climate change, aligning with the UN sustainability goals.
This workshop aims to advance innovative NLP research that benefits society, emphasizing responsible methods and impactful applications. We welcome submissions in areas including, but not limited to:
* Grounding NLP in Real-World Impact: Beyond improving model performance, how can NLP systems be directly tied to social outcomes? This could include case studies of real-world deployments or strategies for better deployment and maintenance practices.
* Underexplored Applications: While NLP for healthcare and mental well-being is well-established, we encourage research tackling overlooked areas such as poverty, hunger, energy, and climate change.
* Interdisciplinary Collaborations: We highly value work that integrates insights from other fields, such as social science, political science, economics, philanthropy, and HCI, and we encourage submissions of case studies or examples that highlight such collaborations.
Special Theme: Measuring the Societal Impact of AI and NLP
This year we would like to address the question: how can we measure the social impact of AI and NLP? As the opportunities offered by AI and language technologies continue to grow, we would like to understand how they influence society and whether they do so in positive ways. Position papers, philosophically grounded papers, and papers proposing new evaluation frameworks are very welcome to enrich the discussion!
Submission Types
We encourage diverse contributions, including:
* Identifying social needs and affected demographics.
* Proposing new tasks or directions through position papers.
* Conducting literature reviews or philosophical discussions on NLP’s societal impact.
* Designing user studies, surveys, or ethical frameworks.
* Exploring interdisciplinary methods and collaboration strategies.
Submissions must address the ethical and societal implications of the work, with a clear focus on defining and achieving positive impact. We look forward to fostering discussions that inspire actionable, responsible advancements in NLP for the greater good.
Papers Format
Both long and short paper submissions should follow all of the ARR
submission requirements
(https://aclrollingreview.org/cfp#paper-submission-information), including the page limits for Long Papers (8 pages; https://aclrollingreview.org/cfp#long-papers) and Short Papers (4 pages).
Organizers
Katherine Atwell (Northeastern University)
Angana Borah (University of Michigan)
Dr. Daryna Dementieva (Technical University of Munich)
Prof Elisa Kreiss (University of California)
Dr. Neema Kotonya (Dataminr)
Jiarui Liu (Carnegie Mellon University)
Liz Olson (Dataminr)
Ruyuan Wan (Pennsylvania State University)
Prof Jieyu Zhao (University of Southern California)
Steering Committee
Prof Rada Mihalcea (University of Michigan)
Dr. Joel Tetreault (Dataminr)
Dr. Zhijing Jin (University of Toronto)
Contact Email: nlp4pi.workshop@gmail.com
All positive regards,
Daryna Dementieva
On behalf of NLP4PI Workshop Organizers
* TAL@Santé 2026 Workshop * @ CORIA-TALN 2026 -- June 29, 2026, Nantes
Website: https://atelier-tal-sante.github.io/
CALL FOR PAPERS
As part of the joint CORIA-TALN 2026 conferences, the TAL@Santé 2026 workshop aims to bring together the French-speaking community working on Natural Language Processing (NLP; in French, Traitement Automatique des Langues, TAL) applied to health. It seeks to bring together tools, methods, resources, lessons learned, and perspectives on clinical and biomedical texts.
IMPORTANT DATES
* Paper submission: April 29, 2026
* Notification to authors: May 13, 2026
* Final version: May 21, 2026
* Workshop: June 29, 2026, from 9:00 to 17:30
SUBMISSION TYPES AND FORMAT
The accepted paper types are:
* Abstract papers (3 pages max + references):
- Preliminary work in progress
- Description of a research project
- Translation of a paper recently accepted (or under submission) at an international conference
* Regular papers (between 6 and 10 pages + references):
- New contribution
- State of the art
- Negative result offering a new perspective on a scientific problem
- Position paper presenting a viewpoint on the state of research in NLP and health
Accepted papers will be presented during the day as an oral presentation or a poster.
They will also be published in the proceedings of the CORIA-TALN 2026 conference.
The official language of the conference is French. If all authors are French speakers, papers must be written in French. If one of the authors is not a French speaker, papers may be written in English.
Submission site: https://openreview.net/group?id=ls2n.fr/CORIA-TALN/2026/Workshop/TAL-Sante
WORKSHOP TOPICS
Topics of interest include, but are not limited to:
- Extraction of entities, relations, and complex events
- Information extraction and classification of clinical or biomedical texts
- Accessibility: simplification of medical texts, health literacy, patient-caregiver communication
- Misinformation and quality of health information
- Bias detection and mitigation
- Ethical issues of NLP for health
- Frugal approaches
- Evaluation frameworks, reproducibility, and usage-oriented metrics
- Generative approaches: factuality, traceability (RAG, citations, verification), hallucination detection
- Annotation schemes and methodologies for building annotated resources
- LLM-assisted annotation
- Creation of specialized language models, domain adaptation, transfer learning, federated learning, weakly supervised learning
- Automatic analysis of the scientific literature for health
CONTACT: atelier-tal-sante(a)univ-nantes.fr
ORGANIZERS
Richard Dufour (LS2N, Nantes Université)
Yanis Labrak (IDIAP)
Emmanuel Morin (LS2N, Nantes Université)
Aurélie Névéol (LISN, Université Paris-Saclay)
Aman Sinha (ATILF, Université de Lorraine)
Laura Zanella (Doctolib)
Pierre Zweigenbaum (LISN, Université Paris-Saclay)
Xavi
----------------------------------------------------------------------
Message: 1
Date: Thu, 23 Apr 2026 21:12:54 +0100
From: Thierry Poibeau <thierry.poibeau(a)ens.psl.eu>
Subject: [Corpora-List] MIAI–PRAIRIE Online Seminar: LLMs and the
Study of Language, Mind, and Society
To: corpora(a)list.elra.info
LLMs and the Study of Language, Mind, and Society
As part of the MIAI–PRAIRIE seminar series, organized by Caroline Rossi (Université Grenoble Alpes / MIAI) and Thierry Poibeau (ENS–PSL / PRAIRIE–PSAI).
Online, with no registration
LLMs have profoundly transformed how research is conducted across a wide range of disciplines, including linguistics, philosophy, psychology, and the social sciences. Beyond their technical performance, these systems raise new questions about language, cognition, interpretation, and the production of knowledge itself.
This new online seminar, jointly organized by Caroline Rossi (U. Grenoble Alpes / MIAI) and Thierry Poibeau (ENS-PSL / Prairie-PSAI), aims to explore recent research in these areas. It will provide a forum for discussing both empirical and theoretical work, bringing together perspectives from different fields to better understand the implications of LLMs for the study of language and mind. The seminar also seeks to foster dialogue between researchers who use these models in practice and those who critically examine their assumptions, limitations, and broader impact.
The first speaker will be Steven Piantadosi, from Berkeley. The next speakers will include Adele Goldberg (Princeton), Eloïse Boisseau (AMU, Marseille), and Dallas Card (U. Michigan).
The seminar will take place approximately once a month. The full schedule for the coming months will be announced shortly.
----
*** Monday 27 April, 5pm (French time), online (free access, no registration) ***
Connection link: https://webinaire.numerique.gouv.fr/meeting/signin/invite/78275/creator/433…
*** Neuroscience, behavior, and what's in-between ***
Steven T. Piantadosi, UC Berkeley (Psychology) & Helen Wills Neuroscience Institute
I'll present an overview of a forthcoming book about how we can link neuroscience to cognition and behavior. Drawing on several little-known results in early computer science, I'll describe how patterns in behavior can rigorously imply the existence of particular unobserved states and structures. This provides a foundation for linking behavioral regularities to what must be present in neural implementations. The resulting states are often re-describable in abstract terms more familiar to cognitive science, like "sets", "numbers", "stacks", etc. I'll highlight the implementation of "stacks", commonly used for grammars, and show how to characterize the space of possible neural implementations, including with subsystems/circuits operating in serial and parallel. The approach provides a set of concrete hypotheses, a guide for neural data analysis, and points towards a method for understanding structure in modern AI systems, including LLMs. I'll conclude by suggesting a Marr-like framework in which the bridges between levels can be made rigorous, connecting behavior, high-level theorizing, and neural implementation.
Steven T. Piantadosi is a professor at UC Berkeley in Psychology and the Helen Wills Neuroscience Institute, where he heads the Computation and Language Lab. He has a PhD from MIT in Brain and Cognitive Sciences and undergraduate degrees in mathematics and linguistics. His work spans neural and cognitive research, with a focus on understanding how children come to know language, math, and abstract concepts. He often uses computational methods, including machine learning, cognitive modeling, mathematical analysis, and Bayesian data analysis. His research methods also include anthropological fieldwork, experimental work with children, and collaboration to study non-human primates and human neuroscience.
PhD Position: Structural Generalization in Transformer-based LLMs
*Start date:* as soon as possible
*Application deadline:* 11 May 2026
The research groups of Michael Hahn <https://www.mhahn.info/> and
Alexander Koller <https://www.coli.uni-saarland.de/~koller/> are jointly
looking for a PhD student. The student will be jointly advised by both
professors.
The position is funded through the new project “Structural
generalization in transformer-based LLMs”, which is part of the DFG
Special Priority Program Robust Assessment and Safe Applicability of
Language Modeling: Foundations for a New Field of Language Science and
Technology <https://www.lasting-spp.org/>. The goal of the SPP is to
bring together research in linguistics and LLMs so they can mutually
inform each other.
The starting point of the project is the observation that transformers
struggle with /structural generalization:/ they do not perform well on
test instances that are more complex than the training instances. We see
this e.g. when parsing complex sentences or when managing complex
reasoning chains. In the project, we want to develop theory that
explains this limitation, carry out empirical research to pinpoint the
transformer’s capabilities, understand them through mechanistic
interpretability, and find ways to improve structural generalization in
transformers. Methods range from formal language theory to training and
prompting-based experiments to circuit analysis. We will carry out the
project in collaboration with Will Merrill <https://lambdaviking.com/>
and Yuekun Yao <https://ykyaol7.github.io/>.
This is a position on the German TV-L E13 scale
<https://oeffentlicher-dienst.info/c/t/rechner/tv-l/west?id=tv-l-2023>
(100%). The starting salary of a 100% TV-L E13 position is a bit over
50,000 Euros per year and increases with experience. The position is
funded for three years; we expect to be able to extend it to four. We
can be flexible with the start date (within the year 2026).
*Requirements*
We are looking for candidates who have finished, or are about to
complete, an excellent Master’s degree in computer science,
computational linguistics, or a related discipline. The ideal candidate
will have outstanding programming skills and algorithmic understanding;
a strong understanding of current methods in machine learning; and
strong communication skills in English (spoken and written).
*About the group*
Saarland University is one of the leading centers for computational
linguistics in Europe, and offers a dynamic and stimulating research
environment. The Department of Language Science and Technology
<https://www.lst.uni-saarland.de/en/> consists of about 100 research
staff in ten research groups in the fields of computational linguistics,
psycholinguistics, and language science. It hosts the SFB 1102
“Information Density and Linguistic Encoding”
<https://sfb1102.uni-saarland.de/>.
Michael Hahn and Alexander Koller are members of the Research Training
Group “Neuroexplicit Models of Language, Vision, and Action”
<https://www.neuroexplicit.org/>, one of the largest centers for
research on neurosymbolic models in the world. They actively collaborate
with colleagues at the university’s computer science department, the Max
Planck Institute for Informatics <https://www.mpi-inf.mpg.de/home>, the
Max Planck Institute for Software Systems <https://www.mpi-sws.org>, and
the CISPA Helmholtz Center for Information Security
<https://cispa.de/en>. The Saarland Informatics Campus
<https://saarland-informatics-campus.de/en> brings together 1000
researchers and 2600 students from 81 countries; SIC faculty have won
roughly 50 ERC grants.
Saarland University is located in Saarbrücken
<https://en.wikipedia.org/wiki/Saarbr%C3%BCcken>, a city of roughly 180k
people in the tri-border area of Germany, France, and Luxembourg
<https://quattropole.org/en>. Saarbrücken combines a lively culture
scene with a relaxed atmosphere, and is quite an affordable place to
live in. Our department maintains an international and diverse work
environment. The primary working language is English; learning German
while you are here will make it easier to connect with the local
culture, but is not necessary for your work.
*How to apply*
Please submit your application by email to apply-ak(a)coli.uni-saarland.de
and include the reference number W2846
<https://www.uni-saarland.de/fileadmin/upload/verwaltung/stellen/Wissenschaf…>.
Preference will be given to applications received by 11 May 2026.
Include a single PDF file with the following information:
1. a statement of research interests that motivates why you are
applying for this position and outlines your research agenda;
2. a full CV including your list of publications;
3. scans of transcripts and academic degree certificates;
4. the names, affiliations, and e-mail addresses of two people who can
provide letters of reference for you.
Saarland University especially welcomes applications from women and
people with disabilities.
The legally binding version of this job ad is here: W2846
<https://www.uni-saarland.de/fileadmin/upload/verwaltung/stellen/Wissenschaf…>.
<Apologies for cross-postings>
--------------------------------------
*SECOND CALL FOR PARTICIPATION *
--------------------------------------
MIRROR@IberLEF 2026: Motivational Interviewing Response & Rating via
Synthetic cOnversational tuRns
https://mirror-iberlef.vercel.app/
Challenge platform: https://www.mirror-iberlef.lat/
The development phase ends on April 30th; make submissions now to get feedback.
Final phase starts on May 1st.
-------------------------------
****Task description****
-------------------------------
We invite the community to develop Generative AI (GenAI) methods for
creating synthetic conversation turns that can substantially improve the
performance of models trained to recognize behavior codes (BCs) in the
context of motivational interviews. A BC is a discrete, observable
clinician action (e.g., asking a question, giving information) that is
counted during coding of a motivational interviewing session to quantify
specific techniques used. These codes allow raters to tally how often
particular clinician behaviours occur, which helps assess adherence to
MI-consistent versus MI-inconsistent practice. Our ultimate goal is to
generate valuable data for training models for the automatic assessment
of clinicians’ motivational-interviewing skills. These skills — crucial
for promoting behavior change among patients — can be evaluated by using
the “Motivational Interviewing Treatment Integrity (MITI)” rubric
(https://tinyurl.com/38byjrwy).
*This is a data-centric competition:* participants are expected to
produce high-quality datasets representing a wide range of clinical
conversations (rather than training a model) to enhance the performance
of a frozen baseline model used for BC classification. We encourage
participants to include samples featuring clients from diverse
backgrounds, covering varied conversation topics, and involving
different types of health professionals.
Participants in this competition should provide three datasets (one per
pair of considered BCs) of at most 100 labeled conversation turns that
will be used to fine-tune pretrained models; the fine-tuned models will
then be used to make predictions for a hold-out dataset. The performance
of the fine-tuned model will be used as the leading evaluation metric to
rank participants. The considered pairs of BCs are:
(1) Simple reflection vs. Complex reflection;
(2) Open question vs. Closed question;
(3) Persuasion vs. Giving Information.
Sample submissions, and detailed instructions on the formatting,
evaluation criteria and competition platform will be available at the
MIRROR website.
-------------------------------
****Important dates****
-------------------------------
* Mar 9th: Start of the development phase (platform starts receiving
submissions for the validation set)
* May 1st: Start of the final phase (platform starts receiving
submissions for the test set)
* May 11th: End of evaluation campaign (deadline for submission of runs)
* May 22nd: Publication of official results
* Jun 8th: Deadline for paper submission
* Jun 23rd: Acceptance notification
* Jun 30th: Camera-ready submission deadline
* Sep, TBD: Publication of proceedings
* Sep, TBD: Workshop with SEPLN 2026
-------------------------------
****Organizing team****
-------------------------------
* Luis J. Arellano, INAOE, Mexico
* Carlos Olachea, INAOE, Mexico
* John Piette, University of Michigan, USA
* Hugo Jair Escalante, INAOE, Mexico
* Delia Irazú Hernández, INAOE, Mexico
* Luis Villaseñor, INAOE, Mexico
* Manuel Montes, INAOE, Mexico
Contact: Hugo Jair Escalante (hugo.jair(a)gmail.com)