In this newsletter:
New publications:
DEFT Chinese and English Light and Rich ERE Parallel Annotation<https://catalog.ldc.upenn.edu/LDC2026T04>
MATERIAL Tagalog-English Language Pack<https://catalog.ldc.upenn.edu/LDC2026S05>
LORELEI Somali Representative Language Pack<https://catalog.ldc.upenn.edu/LDC2026T03>
________________________________
New publications:
DEFT Chinese and English Light and Rich ERE Parallel Annotation<https://catalog.ldc.upenn.edu/LDC2026T04> was developed by LDC and consists of 179 Chinese discussion forum documents and their English translations annotated for entities, relations, and events (ERE). Light ERE annotation labels entity mentions for the target set of entity, relation, and event types between and among those entities including coreference. Rich ERE annotation expands types and tagging in the entities, relations, and events annotation tasks and replaces strict event coreference with a more loosely defined event hopper annotation. 179 Chinese-English document pairs were annotated following Light ERE annotation guidelines; a subset of 171 Chinese-English document pairs were also labeled with Rich ERE annotation. The source data and English translations were drawn from BOLT Chinese Discussion Forum Parallel Training Data (LDC2017T05)<https://catalog.ldc.upenn.edu/LDC2017T05>, originally collected and translated by LDC under the DARPA BOLT program.
DARPA's Deep Exploration and Filtering of Text (DEFT) program aimed to address remaining capability gaps in state-of-the-art natural language processing technologies related to inference, causal relationships, and anomaly detection. LDC supported the DEFT program by collecting, creating, and annotating a variety of data sources.
2026 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
*
MATERIAL Tagalog-English Language Pack<https://catalog.ldc.upenn.edu/LDC2026S05> was developed by Appen<http://www.appen.com/> for the IARPA MATERIAL<https://www.iarpa.gov/index.php/research-programs/material> program and contains 100 hours of Tagalog conversational telephone speech, transcripts, English translations, annotations, and queries. Calls were made using different telephones (e.g., mobile, landline) from a variety of environments. Transcripts cover approximately 30% of the speech files, 2% of which were translated into English. This release also includes domain annotations, English queries, and their relevance annotations.
The MATERIAL program focused on underserved languages, with the ultimate goal of building cross-language information retrieval systems that find speech and text content using English search queries.
2026 members can access this corpus through their LDC accounts provided they have submitted a completed copy of the special license agreement. Non-members may license this data for a fee.
*
LORELEI Somali Representative Language Pack<https://catalog.ldc.upenn.edu/LDC2026T03> contains over 13 million words of Somali monolingual text, 800,00 words of which were translated into English, and 106,000 Somali words translated from English data. Approximately 73,000 words were annotated for simple named entities, around 23,000 words were annotated for full entity (including nominals and pronouns), and over 10,000 words were covered by noun phrase chunking annotation. Data was collected from discussion forum, news, reference, social network, and weblogs.
The LORELEI (Low Resource Languages for Emergent Incidents) program was concerned with building human language technology for low resource languages in the context of emergent situations. Representative languages were selected to provide broad typological coverage.
The knowledge base for entity linking annotation is available separately as LORELEI Entity Detection and Linking Knowledge Base (LDC2020T10)<https://catalog.ldc.upenn.edu/LDC2020T10>.
2026 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
Dear all,
We are pleased to announce the second round of the Model Compression Shared
Task <https://www2.statmt.org/wmt26/model-compression.html> at WMT 2026
<https://www2.statmt.org/wmt26/>.
This shared task aims to evaluate the potential of model compression
techniques in reducing the size of general-purpose large language models,
with the goal of achieving an optimal balance between practical
deployability and high translation quality in specific machine translation
(MT) scenarios. The task’s broader objectives include fostering research
into the efficient, accessible, and sustainable deployment of LLMs for MT,
establishing a common evaluation framework to monitor progress in model
compression across a wide range of languages, and enabling meaningful
comparisons with state-of-the-art MT systems through standardized
evaluation protocols designed to assess not only translation quality but
also computational efficiency.
Although the focus is on model compression, the task is closely aligned
with the General MT shared task
<https://www2.statmt.org/wmt26/translation-task.html>, sharing test data
from a subset of its language directions, as well as protocols for
automatic MT quality evaluation. Additionally, the task follows the same
timeline as the flagship WMT task.
We warmly invite participation from academic teams and industry players
interested in applying existing compression methods to MT or exploring
innovative, cutting-edge approaches.
THE TASK IN A NUTSHELL
Goal: Reduce the size of a general-purpose LLM while maintaining a balance
between model compactness and MT performance.
Languages: The second round of the task will focus on a subset of the
languages covered by the General MT task, namely: Czech to German, English
to Chinese (Simplified), and English to Arabic (Egyptian).
Conditions:
- Constrained: Participants will compress a specific model, using a
predefined pool of data for calibration and fine-tuning (if needed) to
ensure directly comparable results.
- Unconstrained: Participants are free to compress any model, provided its
original size is below 20B parameters, and use any additional data for
calibration and fine-tuning.
Participation format: Participants will share their compressed models to be
run on a standardized hardware environment provided by the organizers.
Evaluation Criteria:
- Translation quality: Automatically assessed using multiple metrics, e.g.
Comet, MetricX, and an LLM-as-a-judge framework.
- Model size: Defined by memory usage.
- Inference speed: Measured by total processing time over the test set.
IMPORTANT DATES
- Test data released: June 18, 2026
- Model submission deadline: July 2, 2026
- System description paper submission: in line with WMT26
<https://www2.statmt.org/wmt26/index.html>
- Camera-ready submission: in line with WMT26
<https://www2.statmt.org/wmt26/index.html>
- WMT 2026 Conference (co-located with EMNLP2026 <https://2026.emnlp.org/>
in Budapest, Hungary): November 2026
WEBSITE: https://www2.statmt.org/wmt26/model-compression.html
ORGANIZERS:
Marco Gaido, Fondazione Bruno Kessler
Matteo Negri, Fondazione Bruno Kessler
Roman Grundkiewicz, Microsoft Translator
TG Gowda, Microsoft Translator
CONTACTS:
Marco Gaido - mgaido(a)fbk.eu
Matteo Negri - negri(a)fbk.eu
Final Call For Participation HIPE 2026 – CLEF Shared Task on Person-Place Relation Extraction from Multilingual Historical Texts
(apologies for cross-postings)
________________________________
HIPE: Identifying Historical People, Places and other Entities.
Website: https://hipe-eval.github.io/HIPE-2026/
Tasks: Person-Location Relation Extraction from Multilingual Historical Texts.
Registration: https://clef-labs-registration.dipintra.it/ (until 23 April 2026)
Training data releases: 19 Dec 2025 (partial); 19 Jan 2026 (full)
Evaluation period: 5–7 May 2026
Workshop venue: during CLEF conference, 21–24 September 2026, Jena, Germany.
LinkedIn: @ImpressoProject / #HIPE2026 / @clef_initiative / #clef2026
________________________________
"Who was where when?"
We invite participation in the third edition of the HIPE shared task, dedicated to the extraction of person–place relations in multilingual historical documents. Building on the success of HIPE-2020 and HIPE-2022, which focused on entity recognition and linking, HIPE-2026 aims to enable finer-grained analysis of entities and support the accurate reconstruction of individuals’ geographical and temporal trajectories.
The objective of HIPE-2026 is to build systems capable of determining whether a relation holds between a person and a location (place) mentioned in a document, and of classifying its temporal scope. Participants are asked to develop systems that determine, for each (person, location) pair associated with a historical document, whether the text implies that the person is at that location within the document’s temporal horizon (isAt relation), that the person was there at some earlier moment in their life (a more general At relation), or that no such link can be established.
Can large language models take up the challenge? Simple co-occurrence of entity mentions in a text is not sufficient to uncover the implicit and explicit, temporally anchored relations between persons and locations. Addressing this challenge requires temporal reasoning, geographical inference, and the interpretation of noisy historical texts (often with only fragmentary contextual cues) to classify person–location relations with varying degrees of certainty.
The task is designed to be tackled by generative AI systems/LLMs as well as by more traditional classification approaches.
HIPE-2026 features three evaluation profiles:
* Accuracy Profile: Focusing on system performance in relation classification.
* Efficiency Profile: Rewarding scalable, lightweight approaches considering model size and compute cost.
* Generalization Profile: An unseen dataset from a different domain will be included to evaluate systems’ ability to generalise beyond the newspaper domain data.
For the accuracy and efficiency profiles, training and test data originate from historical newspapers in English, German, French and Luxembourgish.
Entity pairs will be provided.
For further information on data, tasks, and evaluation settings:
* HIPE-2026 website: https://hipe-eval.github.io/HIPE-2026/
* Participation Guidelines: https://doi.org/10.5281/zenodo.17800136
* HIPE-2026-data GitHub repository: https://github.com/hipe-eval/HIPE-2026-data
On HIPE shared tasks
The HIPE evaluation lab series is part of the ongoing efforts of the natural language processing and digital humanities communities to adapt and develop technologies for efficiently retrieving and exploring information from historical texts.
Important upcoming dates
* 23 Apr 2026: Lab registration closes.
* 05 May 2026: Test data release (10:00 CEST).
* 07 May 2026: Participant run submission deadline.
* 13 May 2026: Publication of results and release of test data.
* 28 May 2026: Submission of participant notebook paper.
* 10 Jul 2026 / 31 Aug 2026: CLEF conference regular/late registration deadline.
* 21 Sep 2026: CLEF 2026 Conference.
Best regards,
HIPE-2026 Shared Task Organizers
https://hipe-eval.github.io/HIPE-2026/
[Apologies for cross-postings]
*********************************************************
TSD 2026 - LAST CALL FOR PAPERS
*********************************************************
Twenty-ninth International Conference on TEXT, SPEECH and DIALOGUE (TSD 2026)
Brno, Czech Republic, 1-4 September 2026
http://www.tsdconference.org/
THE SUBMISSION DEADLINE has been EXTENDED to:
April 25 2026 ............ Submission of full papers
You may get updates by following the new LinkedIn page: https://www.linkedin.com/company/tsdconference/
The conference is organized by the Faculty of Informatics, Masaryk
University, Brno, and the Faculty of Applied Sciences, University of
West Bohemia, Pilsen. The conference is supported by the International
Speech Communication Association.
Venue: Brno, Czech Republic
SUBMISSION OF PAPERS
Authors are invited to submit a full paper not exceeding 12 pages
formatted in the LNCS style (including references). Those accepted
will be presented either orally or as posters. The decision about the
presentation format will be based on the recommendation of the
reviewers. Proceedings papers do not differentiate the presentation
format. The authors are asked to submit their papers using the
on-line form accessible from the conference website.
Papers submitted to TSD 2026 must not be under review by any other
conference or publication during the TSD review cycle, and must not be
previously published or accepted for publication elsewhere.
Publishing on preprint servers is not forbidden, but authors are
warned that doing so might compromise the blind reviewing
conditions.
As reviewing will be blind, the paper should not include the authors'
names and affiliations. Furthermore, self-references that reveal the
author's identity, e.g., "We previously showed (Smith, 1991) ...",
should be avoided. Instead, use citations such as "Smith previously
showed (Smith, 1991) ...". Papers that do not conform to the
requirements above may be rejected without review.
The authors are strongly encouraged to write their papers in TeX or
LaTeX formats. These formats are necessary for the final versions of
the papers that will be published in the Springer Lecture Notes.
Papers for review must be submitted in PDF format with all
required fonts included. Upon notification of acceptance, presenters
will receive further information on submitting their camera-ready and
electronic sources (for detailed instructions on the final paper
format see https://www.tsdconference.org/tsd2026/paper_instr.html).
Authors are also invited to present actual projects, developed
software or interesting material relevant to the topics of the
conference. The presenters of demonstrations should provide an
abstract not exceeding one page. The demonstration abstracts will not
appear in the conference proceedings.
KEYNOTE SPEAKERS
Anders Soegaard, University of Copenhagen, Denmark
TSD SERIES
The TSD series has evolved into a prime forum for interaction between researchers in
both spoken and written language processing from all over the world.
Proceedings of TSD form a book published by Springer-Verlag in their
Lecture Notes in Artificial Intelligence (LNAI) series. TSD Proceedings
are regularly indexed by Thomson Reuters Conference Proceedings Citation
Index/Web of Science. Moreover, the LNAI series is listed in all major
citation databases such as DBLP, SCOPUS, EI, INSPEC or COMPENDEX.
CALL for SATELLITE WORKSHOP PROPOSALS
https://www.tsdconference.org/tsd2026/conf_workshop_proposals.html
The TSD 2026 conference will be accompanied by one-day satellite workshops
or project meetings with organizational support from the TSD organizing
committee. The organizing committee can arrange for a meeting room at the
conference venue and prepare workshop proceedings as a book with an ISBN from
a local publisher. Workshop papers that also pass the standard TSD
review process will appear in the Springer proceedings. Each workshop is
subject to a proposal, which should be sent via the proposal submission form
or discussed via the contact e-mail tsd2026(a)tsdconference.org ahead of the
respective deadline.
TOPICS
Topics of the conference will include (but are not limited to):
Corpora and Language Resources (monolingual, multilingual,
text and spoken corpora, large web corpora, large language models,
disambiguation, specialized lexicons, dictionaries)
Speech Recognition (multilingual, continuous, emotional
speech, handicapped speaker, out-of-vocabulary words,
alternative ways of feature extraction, new models for
acoustic and language modelling)
Tagging, Classification and Parsing of Text and Speech
(morphological and syntactic analysis, synthesis and
disambiguation, multilingual processing, sentiment analysis,
credibility analysis, automatic text labeling, summarization,
authorship attribution)
Speech and Spoken Language Generation (multilingual, high
fidelity speech synthesis, computer singing)
Semantic Processing of Text and Speech (information
extraction, information retrieval, data mining, semantic web,
knowledge representation, inference, ontologies, sense
disambiguation, plagiarism detection, fake news detection)
Integrating Applications of Text and Speech Processing
(machine translation, natural language understanding,
question-answering strategies, assistive technologies)
Automatic Dialogue Systems (self-learning, multilingual,
question-answering systems, dialogue strategies, prosody in
dialogues)
Multimodal Techniques and Modelling (video processing, facial
animation, visual speech synthesis, user modelling, emotions
and personality modelling)
Papers on processing of languages other than English are strongly
encouraged.
PROGRAM COMMITTEE
Elmar Noeth, Germany (general chair)
Rodrigo Agerri, Spain
Tomas Arias-Vergara, Germany
Vladimir Benko, Slovakia
Archna Bhatia, USA
Jan Cernocky, Czech Republic
Simon Dobrisek, Slovenia
Kamil Ekstein, Czech Republic
Karina Evgrafova, Russia
Yevhen Fedorov, Ukraine
Volker Fischer, Germany
Darja Fiser, Slovenia
Lucie Flek, Germany
Bjorn Gamback, Norway
Radovan Garabik, Slovakia
Alexander Gelbukh, Mexico
Louise Guthrie, USA
Jan Hajic, Czech Republic
Eva Hajicova, Czech Republic
Yannis Haralambous, France
Hynek Hermansky, USA
Daniel Hládek, Slovakia
Ales Horak, Czech Republic
Eduard Hovy, USA
Milos Jakubicek, Czech Republic
Maria Khokhlova, Russia
Aidar Khusainov, Russia
Daniil Kocharov, Russia
Miloslav Konopik, Czech Republic
Valia Kordoni, Germany
Evgeny Kotelnikov, Russia
Pavel Kral, Czech Republic
Siegfried Kunzmann, USA
Oier Lopez de Lacalle, Spain
Nikola Ljubesic, Croatia
Natalija Loukachevitch, Russia
Bernardo Magnini, Italy
David Mareček, Czech Republic
Jindrich Matousek, Czech Republic
Vaclav Matousek, Czech Republic
Roman Moucek, Czech Republic
Daša Munková, Slovakia
Agnieszka Mykowiecka, Poland
Hermann Ney, Germany
Joakim Nivre, Sweden
Juan Rafael Orozco-Arroyave, Colombia
Paula Andrea Perez-toro, Germany
Maciej Piasecki, Poland
Josef Psutka, Czech Republic
James Pustejovsky, USA
German Rigau, Spain
Paolo Rosso, Spain
Leon Rothkrantz, The Netherlands
Anna Rumshisky, USA
Milan Rusko, Slovakia
Pavel Rychly, Czechia
Mykola Sazhok, Ukraine
Pavel Skrelin, Russia
Petr Sojka, Czech Republic
Ján Staš, Slovakia
Georg Stemmer, Germany
Marko Robnik Sikonja, Slovenia
Marko Tadic, Croatia
Jan Trmal, Czechia
Tamas Varadi, Hungary
Zygmunt Vetulani, Poland
Aleksander Wawer, Poland
Alina Wroblewska, Poland
Jerneja Zganec Gros, Slovenia
FORMAT OF THE CONFERENCE
The conference program will include presentation of invited papers,
oral presentations, and poster/demonstration sessions. Papers will be
presented in plenary or topic oriented sessions.
Social events including a trip in the vicinity of Brno will allow
for additional informal interactions.
IMPORTANT DATES
April 25 2026 ............ Submission of full papers
June 5 2026 .............. Notification of acceptance
June 15 2026 ............. Final papers (camera ready) and registration
August 8 2026 ............ Submission of demonstration abstracts
August 15 2026 ........... Notification of acceptance for
demonstrations sent to the authors
September 1-4 2026 ....... Conference date
Submission of abstracts serves for better organization of the review
process only - please submit your abstract as soon as possible.
For the actual review a full paper submission is necessary.
The accepted conference contributions will be published in Springer
proceedings that will be made available to participants at the time
of the conference.
OFFICIAL LANGUAGE
The official language of the conference is English.
ACCOMMODATION
The organizing committee will arrange discounts on accommodation in
the 4-star hotel at the conference venue. The current prices of the
accommodation will be available at the conference website.
ADDRESS
All correspondence regarding the conference should be
addressed to
Ales Horak, TSD 2026
Faculty of Informatics, Masaryk University
Botanicka 68a, 602 00 Brno, Czech Republic
phone: +420-5-49 49 18 63
fax: +420-5-49 49 18 20
email: tsd2026(a)tsdconference.org
The official TSD 2026 homepage is: http://www.tsdconference.org/tsd2026
LinkedIn: https://www.linkedin.com/company/tsdconference/
LOCATION
Brno is the second largest city in the Czech Republic with a
population of almost 400,000 and is the country's judiciary and
trade-fair center. Brno is the capital of South Moravia, which is
located in the south-eastern part of the Czech Republic and is known
for its wide range of cultural, natural, and technical attractions.
South Moravia is a traditional wine region. Brno has been a Royal
City since 1347 and, with its six universities, forms a cultural
center of the region.
Brno can be reached easily by direct flights from London, Milan, and
Malaga, and by trains or buses from Vienna (150 km) or Prague (230 km).
For participants with some extra time, nearby places may
also be of interest. Local ones include: Brno Castle (now called
Spilberk), Veveri Castle, the Old and New City Halls, the
Augustine Monastery with St. Thomas Church and the crypt of the Moravian
Margraves, the Church of St. James, the Cathedral of St. Peter & Paul,
the Carthusian Monastery in Kralovo Pole, and the famous Villa Tugendhat
(UNESCO) designed by Mies van der Rohe, along with other important
buildings of interwar Czech architecture.
For those willing to venture out of Brno, the Moravian Karst with
Macocha Chasm and Punkva caves, the battlefield of the Battle of the
Three Emperors (Napoleon, Alexander of Russia and Franz of Austria;
the Battle of Austerlitz), the Chateau of Slavkov (Austerlitz),
Pernstejn Castle, Buchlov Castle, Lednice Chateau, Buchlovice
Chateau, Letovice Chateau, Mikulov with one of the largest Jewish
cemeteries in Central Europe, Telc (a town on the UNESCO
heritage list), and many others are all within easy reach.
The project: Position Paper: MeMo: Towards Language Models with Associative
Memory Mechanisms (ACL Anthology)
<https://aclanthology.org/2025.findings-acl.785/>
Do YOU have a strong mathematical background, good programming skills,
passion and curiosity, and the ability to think outside the box? This is the
Ph.D. Program for YOU:
* Position 1) Mechanistic Interpretability by-design in
Neural-Network-based Large Language Models: This theme is in the context of
the MeMo project, which aims to develop alternative ways to build transparent
Large Language Models by explicitly using Associative Memories.
* Position 2) Injecting Neuro-symbolic Natural Language Processing
Capabilities in Neural-Network-based Large Language Models: This theme is in
the context of the MeMo project, which aims to develop alternative ways to build
transparent Large Language Models by explicitly using Associative Memories.
The positions will formally open in mid-April 2026 and will start on
November 1st, 2026.
If you want to see if you fit the position, take this test
<https://forms.office.com/e/sSCGhP9MgF> (Open Fully-funded Ph.D. Positions in
the Data Science Ph.D. School) and book an informal chat to find out whether
the positions fit you.
The positions are based in the Ph.D. program in Data Science at the University
of Rome Tor Vergata (Rome, Italy), in the HumanCentricART Group
<https://humancentricart.github.io/>.
Fourth International Workshop on Gender-Inclusive Translation Technologies (GITT) at EAMT 2026
15 June 2026, Tilburg, The Netherlands
https://sites.google.com/view/gitt2026/
@gitt-workshop.bsky.social
Important Dates (Time zone: Anywhere on Earth)
(NEW) Submission deadline: 26 April, 2026
Notification of Acceptance: 13 May, 2026
Camera Ready Copy due: 20 May, 2026
Workshop: 15 June, 2026
**Aim and scope**
The Gender-Inclusive Translation Technologies Workshop (GITT) sets out to be the dedicated workshop on gender-inclusive language in translation and cross-lingual scenarios. The workshop aims to bring together researchers from diverse areas, including industry partners, MT practitioners, and language professionals. GITT aims to encourage multidisciplinary research that develops and interrogates both solutions and challenges for addressing bias and promoting gender inclusivity in MT and translation tools, including LM applications for the translation task.
**Topics**
GITT invites technical as well as non-technical submissions, which consist of experimental, theoretical or methodological contributions. We explicitly welcome interdisciplinary submissions and submissions that focus on innovative, non-binary linguistic strategies and/or with sociolinguistically-informed perspectives. The topics of interest include, but are not limited to:
- Models or methods for assessing and mitigating gender bias
- New resources for inclusive language and gender translation (e.g., datasets, translation memories, dictionaries)
- Social, cross-lingual, and ethical implications of gender bias
- Qualitative and quantitative analyses on the potential limits of current approaches to gender bias in translation and MT, error taxonomies as well as best practices and guidelines
- User-centric case studies on the impact of biased language and/or mitigating approaches which can include translators, post-editors, or monolingual MT users
GITT is also open to other non-listed topics aligned with the scope of the workshop and to work focusing on non-textual modalities (e.g., audiovisual translation).
**Submission**
We welcome four types of submissions, two archival and two non-archival.
ARCHIVAL
- Research papers: at least 4 and up to 10 pages (excluding references)
- Extended Abstracts: up to 2 pages (including references)
Accepted papers and extended abstracts consisting of novel work will be published online as proceedings in the ACL Anthology.
NON-ARCHIVAL
- Research Communications: up to 2 pages (including references).
We include a parallel submission policy in the form of Research Communications for papers related to the topic of GITT that were accepted in other venues in 2025 and 2026.
- Potluck Communications: short abstract up to 500 words (including references).
Potluck Communications offer a space for anyone—especially students and early career researchers—to discuss bold new ideas for collaboration, brainstorm about ongoing work, and explore future research directions.
The communications will not be included in the proceedings, but will serve to promote the dissemination of research aligned with the scope of the workshop.
All submissions should adhere to the EAMT 2026 guidelines and style templates (PDF, LaTeX, Word) and be uploaded on Easychair (https://easychair.org/conferences?conf=eamt2026).
**Workshop organizers**
Manuel Lardelli, University of Padova
Janiça Hackenbuchner, University of Ghent
Luisa Bentivogli, Fondazione Bruno Kessler
Joke Daems, University of Ghent
Beatrice Savoldi, Fondazione Bruno Kessler
Eleni Gkovedarou, University of Ghent
Call for Participation for the 7th RAIL WORKSHOP
RAIL workshop
https://sadilar.org/en/seventh-workshop-on-resources-for-african-indigenous…
Co-located with LREC 2026 https://www.elra.info/lrec2026
RAIL workshop date: 12 May 2026
Conference venue: Palau de Congressos de Palma, Palma de Mallorca
(Spain)
Theme: Creating resources for less-resourced African languages
The seventh Resources for African Indigenous Languages (RAIL) workshop
will be co-located with the LREC 2026 conference in Palma, Mallorca
(Spain), on 12 May 2026. The RAIL workshop is an interdisciplinary
platform for researchers working on African indigenous language
resources such as natural language processing (NLP) tools, Human
Language Technologies (HLT), data collections, and annotations. This
workshop aims to foster a scientific community of practice that focuses
on computational linguistic tools and data that are designed for or
applied to the indigenous languages of Africa.
Many African languages are under-resourced, while only a few are
considered to be somewhat better resourced. These languages often share
interesting properties such as writing systems, making them different
from most high-resourced languages. From a computational perspective,
these languages lack enough corpora to undertake high-level development
of NLP and HLT tools, which in turn impedes the development of African
languages in these areas. During previous workshops, it was noted that
the problems and solutions presented were not only applicable to
African languages but were also relevant to many other low-resource
languages across the world. Because these languages share similar
challenges, this workshop provides researchers with opportunities to
work collaboratively on issues of language resource development and
learn from each other.
The RAIL workshop has several aims. First, the workshop brings together
researchers who work on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances the sharing of knowledge regarding the
development of low-resource languages. Finally, it enables discussions
on how to improve the quality as well as the availability of the
resources.
Organising Committee
Muzi Matfunjwa, South African Centre for Digital Language Resources
Mmasibidi Setaka, South African Centre for Digital Language Resources
Rooweither Mabuya, South African Centre for Digital Language Resources
Menno van Zaanen, South African Centre for Digital Language Resources
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
Registration open!!
########################################################
GRACE@IberLEF2026: https://www.codabench.org/competitions/13280/
########################################################
****We apologize for multiple postings of this e-mail****
GRACE@IberLEF2026 announces the first edition of a novel Argument Mining shared task in Spanish, connecting Explainable AI and Evidence-Based Medicine across clinical trials and medical licensing examinations.
⚗️ Argument Mining
Argument Mining automatically extracts claims and evidence from clinical text and reveals how they support or challenge each other, enabling transparent, traceable clinical reasoning.
🌍 Spanish, First
GRACE is the first Argument Mining task in Spanish for the clinical domain, filling a key gap in multilingual biomedical NLP with fine-grained, entity-level annotations.
Track 01
🔬 Clinical Trial Evidence & Argumentation
This track focuses on abstracts of Randomized Controlled Trials (RCTs). Their standardized design, contrasting an intervention with a control group, provides a transparent path from data to conclusions, making argumentative components more accessible to automated systems.
Goal: Identify argumentative components (claims and premises) and detect support/attack relations at the sentence level.
Track 02
🩺 Clinical Case Reasoning (MIR)
This track uses cases from the MIR (Médico Interno Residente) exam, Spain's national medical specialization test. Each instance pairs a dense clinical narrative with five competing diagnostic or treatment options, only one of which is correct.
Goal: Extract fine-grained evidence spans that justify the correct option while refuting the incorrect alternatives.
📅 Important Dates
📂 Release of Training & Dev Sets March 18
🚀 Official Test Set Release April 22
⏰ Deadline for Result Submission May 3
📊 Publication of Results May 8
📄 System Paper Submission May 24
✅ Notification of Acceptance June 17
🎤 IberLEF Workshop (at SEPLN) September 22
(apologies for cross-postings)
Joint CODI CRAC 2026 Workshop: call for fast-track papers
July 2026 - ACL 2026 - San Diego, USA
Deadline for CODI CRAC fast-track papers: May 1st 2026
CODI-CRAC is officially endorsed by SIGDial, the ACL Special Interest Group on Discourse and Dialogue. More information about the workshop: https://sites.google.com/view/codi-crac2026/home
CODI-CRAC 2026 invites you to submit your accepted or rejected conference submission as a fast track paper in either the archival or non-archival track:
* We invite presentations of papers accepted at another main conference, workshop or journal. They will be included in the workshop program and handbook, but will not appear in the workshop proceedings (non-archival).
* We also invite submissions of papers rejected at another main conference, workshop or journal. The reviews should be submitted along with the paper. If accepted, the papers will be presented during the workshop and included in the proceedings (archival).
Submission website
All submissions must be anonymous and follow the ACL 2026 formatting instructions described here: https://aclrollingreview.org/cfp
Please indicate the type “non-archival” if your paper has been accepted at another conference.
Use the following link: https://softconf.com/acl2026/codi-crac2026/
Topics of interest
We welcome papers on symbolic and probabilistic approaches, corpus development and analysis, as well as machine and deep learning approaches to discourse. We appreciate theoretical contributions as well as practical applications, including demos of systems and tools. The goal of the workshop is to provide a forum for the community of NLP researchers working on all aspects of discourse.
Topics of interest include, but are not limited to:
- discourse structure
- discourse connectives
- discourse relations
- long-form question answering
- annotation tools and schemes for discourse phenomena
- corpora annotated with discourse phenomena
- discourse parsing
- cross-lingual discourse processing
- cross-domain discourse processing
- anaphora and coreference resolution
- event coreference
- argument mining
- coherence modeling
- discourse and semantics
- discourse in applications such as machine translation, summarization, etc.
- evaluation methodology for discourse processing
- discourse pretraining tasks
- long-text modeling and generation
Schedule
Important dates for the workshop are listed below:
* Pre-reviewed fast-track (with reviews, can be accepted or rejected): May 1, 2026
* Notification of acceptance: May 8, 2026
* Student D&I Grant application: May 19, 2026
* Camera-ready paper due: May 19, 2026
* Pre-recorded video due: June 4, 2026
* Workshop dates: July 3 or 4, 2026
All deadlines are 11:59 pm UTC-12 ("anywhere on Earth").
Organizers
- Chloé Braud, CNRS-IRIT
- Christian Hardmeier, IT University of Copenhagen
- Chuyuan Li, University of British Columbia
- Jessy Li, University of Texas, Austin
- Sharid Loáiciga, University of Gothenburg
- Vincent Ng, University of Texas at Dallas
- Michal Novák, Charles University, Prague
- Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences
- Massimo Poesio, Queen Mary University of London and University of Utrecht
- Michael Strube, Heidelberg Institute for Theoretical Studies
- Amir Zeldes, Georgetown University, Washington DC
To contact the organizers, please send an email to: codi-crac-workshop(a)googlegroups.com
(apologies for cross-posting; please circulate)
KONVENS 2026 Second Call for Conference Papers
https://konvens2026.uni-hamburg.de/
We are delighted to share with you the second call for papers for the Konferenz zur Verarbeitung natürlicher Sprache (KONVENS) 2026, organized under the auspices of the GSCL, the DGfS-CL, the ÖGAI, and SwissNLP. This year’s KONVENS will take place in Hamburg, September 14 – 17, under the special theme “Context Matters: NLP Beyond Text”. The conference will feature a diverse program, including talks by our two keynote speakers:
* Dr. Valentin Hoffmann, Allen Institute for AI
* Prof. Dr. Barbara Plank, LMU Munich.
We invite the submission of long and short papers featuring substantial, original, and unpublished research on Natural Language Processing and Computational Linguistics, to be archived in the ACL Anthology, as well as abstract submissions that describe research in progress or published elsewhere. Beyond standard research contributions, submissions are welcome that present negative results, survey an area, introduce new resources, articulate a position, report novel linguistic insights obtained using existing computational methods, or reproduce (successfully or not) previous findings.
We welcome the following types of paper submissions:
* Long papers (up to 8 pages plus references), describing original research with substantial new results.
* Short papers and demos (up to 4 pages plus references), including small and focused contributions, work in progress, as well as descriptions of projects, systems and resources.
* Abstracts (1 page, non-archival), which will be presented at the poster session and printed in the proceedings. To foster interaction and discussion in our community, we especially invite submissions in this category on ongoing projects, student projects, past or ongoing bachelor and master theses, ongoing or recently completed PhD theses, and opinion pieces.
Papers can be submitted either to the main conference track or to the special track “Context Matters”.
Context Matters Track
The widespread use of large language models (LLMs) and other types of language technology in research and real-world applications has fundamentally reshaped how natural language processing (NLP) systems interact with people and their environments. As NLP systems increasingly operate in socially embedded, high-impact settings like search, conversational agents and recommendation systems in business, education, medicine, law, and beyond, it becomes crucial to move beyond text in isolation and to account for the many forms of context that shape language use and interpretation. These include user-related factors (e.g., identity aspects like socio-demographic characteristics and the resulting perspectival differences), cultural and societal context, interaction history, application constraints, and signals from other modalities.
The “Context Matters” track focuses on how different forms of context influence NLP systems, their design, their behavior, and their use. We invite work that studies NLP not as decontextualized text processing, but as situated technology embedded in human, social, disciplinary, and multimodal environments. Here, disciplines and application domains are important not only as areas of use, but as sources of structured contextual knowledge, perspectives, and methodological traditions — particularly from the social sciences and humanities, but also law, education, psychology, economics, and the natural sciences.
In particular, the special theme includes:
* Research that models user- and group-related context, such as identity aspects, socio-demographic variables, cultural background, or perspectival differences, and examines how these factors affect language use, system behavior, or system impact
* Work that draws on or operationalizes concepts from other disciplines like the social sciences and related fields (e.g., social theory, cultural analysis, behavioral perspectives) to better understand linguistic phenomena, system outputs, or evaluation settings
* Research analyzing social, societal, and institutional context, including norms, power structures, and real-world deployment environments, especially with respect to ethics, bias, and societal consequences
* Studies of application context, where domain-specific constraints (e.g., in education, law, public administration, or the natural sciences) shape both language use and system requirements
* Approaches that move beyond text-only processing and integrate multiple modalities (e.g., vision, audio, video, sensor data), with attention to the distinct contextual signals these modalities introduce
* Work incorporating interactional context, such as dialogue history, user intent, and evolving human–AI interaction dynamics
While the modelling component should include language, we especially encourage contributions that treat language as part of a broader contextual ecosystem, aiming toward more grounded, adaptive, and socially aware NLP systems.
Papers must be in English, formatted in accordance with the ACL style sheet, and submitted via the submission link: https://openreview.net/group?id=GSCL.org/KONVENS/2026/Conference
Please consider the OpenReview policy for new accounts:
* New profiles created without an institutional email will go through a moderation process that can take up to two weeks.
* New profiles created with an institutional email will be activated automatically.
KONVENS also adopts the ACL policies for submission, review, and citation, the ACL privacy policy, and the ACL code of ethics.
Further information can be found on the conference website:
https://konvens2026.uni-hamburg.de/
Submissions need to be anonymized to ensure double-blind review. However, we allow for pre-prints to be posted any time before or during the review period. We strongly encourage authors to use LaTeX in preparing their document.
Important dates:
30.4.2026 Paper Submission Deadline
12.7.2026 Notification of Acceptance
01.8.2026 Camera-Ready Deadline
14.9. – 17.9.2026 KONVENS in Hamburg
See you in Hamburg!
Your conference chairs,
Chris Biemann, Anne Lauscher, and Heike Zinsmeister
---------------------------
Prof. Dr. Heike Zinsmeister (sie/ihr)
Linguistik des Deutschen / Korpuslinguistik
Universität Hamburg, Institut für Germanistik, Raum C7012
Von-Melle-Park 6, Postfach #15, D-20146 Hamburg
Tel.: +49 (40) 2395-27119
heike.zinsmeister(a)uni-hamburg.de
http://www.slm.uni-hamburg.de/germanistik/personen/zinsmeister.html