The 2nd Workshop on DHOW: Diffusion of Harmful Content on Online Web
Workshop
The workshop will be conducted in a *hybrid* format to ensure maximum
participation, accommodating attendees both *online* and in person.
Submission deadline: *July 11, 2025 AoE*
*Workshop site*: https://dhow-workshop.github.io/2025/
*Co-located with ACMMM 2025*
https://acmmm2025.org/
Dublin, Ireland, 27-31 October 2025
*Important Dates*
Submission deadline: extended to *July 11, 2025*
Notification of acceptance: August 01, 2025
Camera-ready papers due: August 11, 2025
Workshop date: October 27/28, 2025
*Workshop Description*
With the advancement of digital technologies and devices, online content
is easily accessible. At the same time, harmful content also spreads
widely. Different kinds of harmful content appear on different
platforms and in multiple languages. The topic is broad and covers
multiple research directions, but from the user’s perspective all of
them have an effect. Harmful content is often studied in isolation, for
example as misinformation or hate speech, and much research covers a
single platform, a single language, and a single issue. As a result,
spreaders of harmful content switch platforms and languages to reach
their user base. Harmful content is not limited to social media but also
appears in news media. Spreaders share it in posts, news articles,
comments, and hyperlinks. There is therefore a need to study harmful
content across platforms, languages, modalities, and topics.
We will bring the research on harmful content under one umbrella so that
research on different topics (hate speech, misinformation,
disinformation, self-harm, offensive content, etc.) can bring some novel
methods and recommendations for users, leveraging text analysis with
image, audio, and video recognition to detect harmful content in diverse
formats. The workshop will also cover ongoing issues in 2025, such as
wars and elections.
We believe this workshop will provide a unique opportunity for
researchers and practitioners to exchange ideas, share the latest
developments, and collaborate on addressing the challenges associated
with harmful content spread across the Web. We expect that the workshop
will generate insights and discussions that will help advance the field
of societal artificial intelligence (AI) for the development of a safer
internet. In addition to attracting high-quality research contributions,
one of the aims of the workshop is to mobilise researchers working in
related areas to form a community.
*Submission Topics*
• Studying different types of harmful content
• Computational fact-checking & Misinformation Detection
• Role of Generative AI in Mitigating Harmful Content
• Harassment, Bullying, and Hate Speech Detection
• Explainable AI for Harmful Content Analysis
• Multimodal and Multilingual Harmful Content Detection, such as fake
news, spam, and troll detection
• Deepfake and Synthetic Media
• Ethical & Societal Implications of AI in Content Moderation
• Qualitative and quantitative studies of harmful content
• Psychological effects of harmful content, such as on mental health
• Approaches for data collection or data annotation using multimodal
large models on harmful content
• User studies on the effects of harmful content on human beings
*Submissions*
- Submission Instructions: https://dhow-workshop.github.io/2025/#call
- Submission Link:
https://openreview.net/group?id=acmmm.org/ACMMM/2025/Workshop/DHOW
*Workshop organizers*
• Thomas Mandl (University of Hildesheim, Germany)
• Haiming Liu (University of Southampton, United Kingdom)
• Gautam Kishore Shahi (University of Duisburg-Essen, Germany)
• Amit Kumar Jaiswal (University of Surrey, United Kingdom)
• Durgesh Nandini (University of Bayreuth, Germany)
DHOW 2025
Dear all,
We invite participation in our Shared Task on Vocabulary Difficulty Prediction for English Learners, which will be hosted at the 21st Workshop on Innovative Use of NLP for Building Educational Applications <https://sig-edu.org/bea/2026> (co-located with ACL 2026) both online and in person in San Diego, CA, United States.
This shared task focuses on predicting the difficulty of English vocabulary for learners with different L1 backgrounds. Evaluation will use the British Council’s Knowledge-based Vocabulary Lists (KVL), which provide psychometrically calibrated difficulty scores for English learners with Spanish, German, and Mandarin L1s. The task includes a Closed Track, limited to the provided data and standard NLP resources, and an Open Track, which allows external data and use of LLMs, to explore the full potential of current AI approaches.
Important Dates
26 January: Release of training data and baseline models <https://github.com/britishcouncil/bea2026st>
20 March: Test data release
27 March: System submissions from teams due
3 April: Announcement of evaluation results by the organisers
24 April: System papers due
1 May: Paper reviews returned
12 May: Final camera-ready submissions
2-3 July: BEA 2026 workshop at ACL
Further details can be found at our shared task website <https://www.britishcouncil.org/data-science-and-insights/bea2026st>. Please send any questions to vocabularychallenge(a)britishcouncil.org or post a new topic in our forum <https://groups.google.com/g/bea-2026-shared-task/>. We look forward to your participation!
Organisers: Mariano Felice (British Council) and Lucy Skidmore (British Council).
**Apologies for cross-posting**
Final Call for Papers: Joint Workshop on Legal and Ethical Issues in Human Language Technologies (LEGAL2026) and Computational Approaches to Language Data Pseudonymization, Anonymization, De-identification, and Data Privacy (CALD-pseudo 2026)
Website: https://legal2026.mobileds.de/
Submission: https://softconf.com/lrec2026/LEGAL2026/
We invite submissions to the Joint Workshop on Legal and Ethical Issues in Human Language Technologies (LEGAL2026) and Computational Approaches to Language Data Pseudonymization, Anonymization, De-identification, and Data Privacy (CALD-pseudo 2026), to be held at LREC 2026 on the 12th of May 2026.
Important Dates
* 22nd of February 2026, 23:59 CET: paper submission deadline
* 30th March 2026: camera ready deadline (strict)
* 12th May 2026: workshop date
Introduction
Access to text and speech data is essential for research, yet personal and sensitive information often prevents open sharing. Techniques such as pseudonymization and anonymization offer potential solutions, but their effectiveness, limitations, and impact on data utility require deeper investigation. Balancing privacy protection with meaningful scientific use remains a key challenge.
At the same time, legal and ethical requirements increasingly shape how language resources can be created, processed, and distributed. Regulatory frameworks, such as the GDPR, the Data Act, and the Artificial Intelligence Act, affect access, reuse, and documentation duties for both text and speech data, creating a complex environment that demands interdisciplinary insight.
The workshop brings these two perspectives together by addressing both the technical and practical aspects of de-identification as well as the legal and ethical obligations governing data handling. Topics include anonymization and pseudonymization methods, compliance in practical workflows, provenance and rights tracking, and emerging approaches to legal metadata. The goal is to foster responsible, legally sound, and technically robust innovation in human language technologies.
Topics of Interest
We invite contributions from all disciplines involved in the creation, processing, governance, and de-identification of text and speech data. Submissions may address theoretical, empirical, methodological, legal, or technical questions, including cross-disciplinary work. We particularly encourage research on less-represented languages and on data from under-represented communities.
1. Legal Aspects of Language Data (LEGAL2026)
* Regulatory frameworks and global governance
* Intellectual property, data protection, and LLM governance
* Ethics, fairness, trust, and transparency
* Compliance in practice
* Ethics, fairness, and trust
* Operationalizing compliance
* Emerging and grey areas
* Interdisciplinary and cross-border coordination
2. Pseudonymization, Anonymization, and De-identification: Theoretical, Methodological, and Technical Aspects (CALD-pseudo 2026)
* Detection and classification of personal information (PI)
* Replacement and transformation of PI
* Utility and bias after de-identification
* Approaches to evaluation and adversarial testing
* Dataset creation for de-identification research
* Low-resource scenarios
* Speech-specific challenges
* Cross-disciplinary applications and challenges
We invite submissions from fields where de-identification of data plays an important role, including but not limited to Computational Linguistics, Applied Linguistics, Corpus Linguistics, Digital Humanities, Social Sciences, Political Sciences, Medical Science etc., from the perspectives of researchers, public organizations, and industry.
Submission Guidelines
Authors are invited to submit original and unpublished research papers in the following categories:
* Long papers (up to 8 pages) for substantial contributions
* Short papers (up to 4 pages) for:
  * Small, focused contributions or ongoing or preliminary work
  * Extended abstracts for non-technical submissions only, such as conceptual, theoretical, legal, ethical, policy-oriented, or position papers. Extended abstract submissions are expected to be developed into regular papers by the camera-ready submission deadline.
The full papers will be published as workshop proceedings along with the LREC main conference. They should follow the LREC stylesheet, which is available on the conference website on the Author’s kit <https://lrec2026.info/authors-kit/> page. Unlike the main conference, we allow appendices of up to 10 pages already in the review phase. However, the reviewers will not be required to look in the appendices and must be able to review the paper based on everything contained within the main body of the paper (as if there were no appendices).
Submission deadline: 22nd of February 2026, 23:59 CET
Submission link: https://softconf.com/lrec2026/LEGAL2026/
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research.
Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones).
Keynote Talks
We are delighted to announce the workshop will host keynote talks from two speakers:
* Paweł Kamocki, Leibniz-Institut für Deutsche Sprache, Germany
* Ivan Habernal, Ruhr University Bochum, Germany
Workshop Organizers
LEGAL 2026:
* Ingo Siegert, Otto-von-Guericke Universität Magdeburg, Germany
* Paweł Kamocki, Leibniz-Institut für Deutsche Sprache, Germany
* Kossay Talmoudi, ELDA, France
* Khalid Choukri, ELDA, France
CALD-pseudo 2026
* Maria Irena Szawerna, University of Gothenburg, Sweden
* Simon Dobnik, University of Gothenburg, Sweden
* Therese Lindström Tiedemann, University of Helsinki, Finland
* Pierre Lison, Norwegian Computing Center & University of Oslo, Norway
* Ildikó Pilán, Norwegian Computing Center, Norway
* Ricardo Muñoz Sánchez, University of Gothenburg, Sweden
* Lisa Södergård, University of Helsinki, Finland
* Elena Volodina, University of Gothenburg, Sweden
* Xuan-Son Vu, Lund University & DeepTensor AB, Sweden
Program Committee
A list of program committee members is available on the workshop webpage.
Contact
For inquiries, please contact ingo.siegert(a)ovgu.de for questions about LEGAL2026 or mormor.karl(a)svenska.gu.se for questions about CALD-pseudo 2026.
Best regards,
Maria Irena Szawerna
____________________
PhD student
Språkbanken Text<https://spraakbanken.gu.se/>
Institutionen för svenska, flerspråkighet och språkteknologi<https://www.gu.se/svenska-spraket>
UNIVERSITY OF GOTHENBURG<https://www.gu.se/>
https://spraakbanken.gu.se/om/personal/maria-szawerna
FINAL CALL FOR PAPERS: The 1st Workshop on Computational Affective
Science (Deadline Extended)
--------------------------------------------------------------------------------------------------
Third and Final Call for Papers: The 1st Workshop on Computational
Affective Science (CAS 2026), co-located with the Language Resources and
Evaluation Conference (LREC) 2026 in Palma de Mallorca, Spain, May
11-16. (Submission deadline extended to 20 Feb 2026).
Website: https://casworkshop.github.io/
Contact: <cas-workshop(a)googlegroups.com>
We invite submissions to the first Workshop on Computational Affective
Science (CAS 2026), co-located with LREC 2026, on research related to
the understanding of affect and emotions through language and
computation. CAS will accept archival long and short paper submissions,
featuring substantial, original, and unpublished research. We also
encourage submissions of extended abstracts from researchers in the
broader Affective Science community, with up to two pages of content
featuring the research background/hypotheses and a description of
methods/results. Extended abstracts are non-archival, offering the
option for publication and presentation at other conference venues.
------------
MOTIVATION
------------
Affect refers to the fundamental neural processes that generate and
regulate emotions, moods, and feeling states. Affect and emotions are
central to how we organize meaning, to our behavior, to our health and
well-being, and to our very survival. Despite this, and even though most
of us are intimately familiar with emotions in everyday life, there is
much we do not know about how emotions work and how they impact our
lives. Affective Science is a broad interdisciplinary field that
explores these and related questions about affect and emotions.
Since language is a powerful mechanism of emotion expression, there is a
growing use of language data and advanced natural language processing
(NLP) algorithms to shed light on fundamental questions about emotions.
The Workshop on Computational Affective Science (CAS) aims to be a
dedicated venue for work focused specifically on the link between NLP
and affective science.
Interdisciplinary Scope: The workshop takes an interdisciplinary
approach to affective science and aims at bringing together NLP
researchers, scientists, and theorists from many research areas,
including psychology, sociology, neuroscience, and philosophy. Although
work in sentiment analysis is decades old, this work often proceeds
separately and in different fields from research and theory in affective
science. Meanwhile, affective scientists in psychology, sociology,
neuroscience and philosophy increasingly seek to use linguistic tools to
shed light on the nature of emotions, moods, and feeling states. CAS is
therefore co-organized by an interdisciplinary group of researchers
(spanning NLP and Affective Science) to foment collaboration at this
exciting frontier of research.
------------
SUBMISSIONS
------------
We invite long and short archival paper submissions, as well as
non-archival extended abstracts on a broad range of topics at the
intersection of affective science and natural language processing,
including but not limited to:
1. The Nature of Affect and Computational Modeling of Emotions
Computational experiments that add to our understanding of affect and
emotions, including findings relevant to:
- theories and nature of emotion
- the biology or neuroscience of emotions
- appraisal models
- dimensional models (valence / arousal / dominance)
- models of constructed emotion
- cognitive-affective architectures
- emotion dynamics (emergence, intensification, decay, transitions)
- emotion granularity
- emotion regulation
- affective embodiment
- evolutionary and developmental affect
- emotion–cognition interactions
These areas are relevant not just to human affect, but may also apply to
animals and artificial agents.
2. Affective Data and Resources
Work on compiling and annotating affect-related information in text,
speech, facial and bodily expression, and physiological signals (ECG,
EEG, GSR, multimodal biosensing), with a focus on text data (monolingual
or multilingual) and multimodal data suitable for an NLP venue. Data
from underserved languages is especially encouraged.
3. Emotion Recognition, Prediction, and Inference
At the instance level:
- emotion classification (discrete emotions, dimensional ratings)
- emotion intensity estimation
- emotion cause detection
- context-aware affect inference (culture, situation, social setting)
- structured emotion analysis
At the aggregate level:
- creating emotion arcs
- determining broad trends in emotions over time or across locations
- tracking emotional responses toward entities of interest (e.g.,
climate change)
- document-level and cross-document emotion analysis
- labeling social networks
4. Applications
Including but not limited to:
- Affect and health, psychopathology, and mental disorders
- Affect and behavior/social science (e.g., interpersonal affect,
empathy, group-level affect, affect contagion, computational emotion
regulation)
- Affect and education
- Affect and literature/narratives/digital humanities
- Affect and commerce
5. Explainability and Interpretability in Computational Affective Models
Work aimed at improving the transparency and interpretability of
affective systems. This includes understanding how models represent and
infer emotions and identifying key cues driving predictions.
6. Ethics, Fairness, Theory Integration, Philosophical Implications
- Bias and generalizability of affective systems across demographics
- Privacy and ethics in affective data collection
- Examining whether automatic NLP systems rely on current and valid
theories of affect and emotion
- The implications of machines modeling or simulating affect
- Societal considerations surrounding affective artificial agents
------------
IMPORTANT DATES
------------
*Submission deadline (**Extended**): 26 Feb 2026*
Notification of acceptance: 16 March 2026
Camera Ready Paper due: 30 March 2026
Workshop date: 16 May 2026
------------
SUBMISSION DETAILS
------------
We invite submissions for archival long and short papers, as well as
non-archival extended abstracts.
Archival long and short papers should feature novel and unpublished work
relating to the topics detailed above.
We also invite submissions of extended abstracts from researchers in the
broader Affective Science community, with up to two pages of content
featuring the research background/hypotheses and a description of
methods/results. Extended abstracts are non-archival, offering the
option for publication and presentation at other conference venues.
Archival Track:
Long Paper: Consists of up to 8 pages of content, with additional pages
for references, limitations, ethical considerations, and appendices.
Short Paper: Consists of up to 4 pages of content, with additional pages
for references, limitations, ethical considerations, and appendices.
(When preparing camera ready papers, you will be allowed one extra page
to address comments by the reviewers.)
Non-Archival Track:
Extended Abstract: Up to 2 pages.
------------
SUBMISSION FORMAT
------------
All submissions must use the LREC 2026 template and follow the
guidelines found at: https://lrec2026.info/authors-kit/ (Note: extended
abstracts may be 1-2 pages long).
Mandatory Ethics Section: We ask all authors to include a section on
Ethical Considerations in their submission, touching on the ethical
concerns and broader societal impacts of the work. This discussion
section will not count towards the page limit.
------------
SUBMISSION SITE
------------
All submissions must be made through the SoftConf portal:
https://softconf.com/lrec2026/CAS
------------
ADDITIONAL DETAILS
------------
Website: https://casworkshop.github.io/
Attendance: The workshop will follow the attendance policy of the main
conference (https://lrec2026.info/registration-policy/).
------------
ORGANIZERS
------------
- Christopher Bagdon, University of Bamberg, Germany
- Krishnapriya Vishnubhotla, National Research Council Canada
- Kristen A. Lindquist, The Ohio State University, USA
- Lyle Ungar, University of Pennsylvania, USA
- Roman Klinger, University of Bamberg, Germany
- Saif M. Mohammad, National Research Council Canada
***Contact us at <cas-workshop(a)googlegroups.com> with any questions.***
The Information Disorder Workshop
Collocated with LREC 2026 in Palma de Mallorca, Spain
https://information-disorder-workshop.github.io/
* March 3: Paper submission (extension)
* March 17: Notification of acceptance
* March 30: Camera-ready submission
* May 12, 2026: InDor at LREC!
Online disinformation is a pressing challenge for our societies. Its role in influencing elections (Allcott & Gentzkow, 2017) and behaviours (van der Linden et al., 2020) has drawn the attention of various societal actors aiming to mitigate its negative impact.
The Natural Language Processing (NLP) community is contributing to fighting this phenomenon with a growing number of datasets (Hussain et al., 2025) and technologies (VeraAI, AskVera, Bellingcat) (Lupi et al., 2023; Wuhrl et al., 2023) for the automatic recognition of fake news. However, this field of research suffers from a lack of a common theoretical framework, which causes a fragmentation of approaches. The increasing attention of the NLP community to human-label variation (Plank, 2022) raises additional challenges regarding the cross-cultural and pragmatic implications that determine the spreading of disinformation (Dabbous et al., 2022).
The goal of the Information Disorder (InDor) workshop is to promote an interdisciplinary and intersectorial discussion towards the development of NLP research on disinformation.
Information Disorder is a recent framework introduced by Wardle and Derakhshan (2017) to organize theories, definitions, and approaches for the study of disinformation.
The framework is characterized by two main pillars: 1) acknowledging the need to categorize fake news under a finer-grained taxonomy of disorders (mis-information, dis-information, and mal-information); 2) exploring the role of the contextual factors that determine the spreading of fake news.
InDor aims to:
* Define a common theoretical ground for the research on disinformation in NLP and beyond
* Discuss the cultural factors determining subjectivity to disinformation
* Promote interdisciplinarity in the development of datasets and models
* Discuss the impact of real-world applications to counter disinformation
The InDor workshop (half-day duration) will be co-located with the fifteenth biennial Language Resources and Evaluation Conference (LREC) held at the Palau de Congressos de Palma in Palma de Mallorca, Spain, on 11-16 May 2026.
Submissions
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones). In addition, authors will be required to adhere to ethical research policies on AI and may include an ethics statement in their papers.
The papers should be submitted as a PDF document, conforming to the formatting guidelines provided in the call for papers of the LREC conference. Templates are provided here https://lrec2026.info/authors-kit/
We accept three types of submissions (see the website for details):
* Regular research papers
* Non-archival submissions: like research papers, but will not be included in the proceedings
* (Non-archival) research communications: 1-page abstracts summarising relevant research published elsewhere
InDor will also accept submissions that have been rejected from ACL rolling review or other conferences (e.g., LREC), provided they are accompanied by their reviews, and they fit the topic of the workshop.
Research papers (archival or non-archival) may consist of up to 8 pages of content. Research communications may consist of up to 1 page of content. Please make the submission here: https://softconf.com/lrec2026/InDor26/
Topics
We invite original research papers specifically on the following topics, with a particular focus on resources, taxonomies, and benchmarks for the evaluation of NLP systems on Information Disorder:
* new interdisciplinary theoretical proposals and foundational aspects
* surveys on Information Disorder
* multiculturality and multilinguality in datasets and technologies
* interdisciplinary computational methods and frameworks
* community- and user-centred approaches
* real-world applications to counter false information
* experimental applications and projects for social good
* evaluation of Information Disorder-focused systems
* generative approaches to counter false information
* participatory approaches
* positions on Information Disorder
Submissions are open to all and are to be submitted anonymously (and must conform to the instructions for double-blind review). All papers will be refereed through a double-blind peer review process by at least three reviewers, with final acceptance decisions made by the workshop organisers. Scientific papers will be evaluated based on relevance, significance of contribution, impact, technical quality, scholarship, and quality of presentation.
Attendance
At least one author of each accepted paper is required to participate in the conference and present the work, in-person or online.
Workshop organisers:
Simona Frenda, Heriot-Watt University
Marco Antonio Stranisci, University of Turin
Shaina Ashraf, Philipps University of Marburg
Ada Ren, Macquarie University
Ioannis Konstas, Heriot-Watt University
Usman Naseem, Macquarie University
Contact us at s.frenda(a)hw.ac.uk if you have any questions.
Website: https://information-disorder-workshop.github.io/
Full-time permanent Research Engineer Position in NLP
Bibliome team at MaIAGE-INRAE, France
The Bibliome team<http://bibliome> (MaIAGE<https://maiage.inrae.fr/> laboratory) is offering a full-time permanent position of research engineer in Natural Language Processing at INRAE research center within Paris-Saclay University, located in the Paris area, France.
INRAE <https://www.inrae.fr/en> (France’s National Research Institute for Agriculture, Food and Environment) is a leading public research institute, internationally recognised for the scientific excellence and societal impact of its work. INRAE addresses major global challenges related to biodiversity preservation, sustainable agricultural and food systems, climate change adaptation, and environmental risk management.
Within this context, the Bibliome team develops cutting-edge NLP research at the intersection of AI and Life Sciences with the aim of advancing large-scale knowledge extraction from documents, using state-of-the-art transformer architectures, large language models (LLMs), knowledge graphs, and domain ontologies. It develops advanced methods for entity linking, relation extraction, semantic representation, and structured knowledge integration, along with robust evaluation frameworks and reusable research software contributing to next-generation knowledge infrastructures.
This position offers the opportunity to work in a dynamic interdisciplinary environment, combining fundamental research, methodological innovation, and high-impact applications.
Position description and recruitment conditions at:
https://jobs.inrae.fr/concours/concours-externes-ingenieurs-cadres-technici…
Contacts:
Robert.Bossy(a)inrae.fr
Louise.Deleger(a)inrae.fr
Key dates:
- Application deadline: 19 March 2026
- Interview: from 1 to 19 June 2026
- Starting date: September/October 2026
Call for Papers
**************************************************************
19th WORKSHOP ON BUILDING AND USING COMPARABLE CORPORA
Co-located with LREC 2026, Palma de Mallorca (in-person & online)
May 11, 2026
Paper submission deadline: February 28, 2026
Workshop website: https://comparable.lisn.upsaclay.fr/bucc2026/
Main conference website: https://lrec2026.info/
**************************************************************
MOTIVATION
In the language engineering and linguistics communities, research
in comparable corpora has been motivated by two main reasons. In
language engineering, on the one hand, it is chiefly motivated by
the need to use comparable corpora as training data for data-driven
NLP applications such as statistical and neural machine translation, or
cross-lingual retrieval. In linguistics, on the other hand, comparable
corpora are of interest because they enable cross-language discoveries
and comparisons. It is generally accepted in both communities that
comparable corpora consist of documents that are comparable in content
and form in various degrees and dimensions across several languages.
Parallel corpora are at one end of this spectrum, and unrelated
corpora are at the other. Increasingly, these resources are not only
collected, but also augmented or even created synthetically, which
raises new questions about how to define and measure comparability.
In recent years, the use of comparable corpora for pre-training Large
Language Models (LLMs) has led to their impressive multilingual and
cross-lingual abilities, which are relevant to a range of applications,
including information retrieval, machine translation, cross-lingual text
classification, etc. The linguistic definitions and observations related
to comparable corpora are crucial to improve methods to mine such corpora,
to assess and document synthetic data, and to improve cross-lingual transfer
of LLMs. Therefore, it is of great interest to bring together builders and
users of such corpora.
PANEL DISCUSSION
The panel discusses the impact of synthetic data on comparable corpora
research. Fundamental questions about how LLMs transform our understanding
and use of multilingual data are addressed.
TOPICS
We solicit contributions on all topics related to comparable (and parallel)
corpora, including but not limited to the following:
Building Comparable Corpora
- Automatic and semi-automatic methods, including generating
comparable corpora using LLMs
- Methods to mine parallel and non-parallel corpora from the web
- Tools and criteria to evaluate the comparability of corpora
- Parallel vs non-parallel corpora, monolingual corpora
- Rare and minority languages, within and across language families
- Multi-media/multi-modal comparable corpora
Synthetic Data for Comparable Corpora
- LLM generation of comparable/parallel data
- Improving comparability of synthetic data
- Incidental bilingualism & pre-training use of comparable data
- Comparability & cross-lingual consistency
- Detection & attribution of synthetic vs. human text
- English-centric effects & fairness across languages/scripts
- Evaluation & reproducibility for downstream tasks
Applications of Comparable Corpora
- Human translation
- Language learning
- Cross-language information retrieval & document categorization
- Bilingual and multilingual projections
- (Unsupervised) machine translation
- Writing assistance
- Machine learning techniques using comparable corpora
Mining from Comparable Corpora
- Cross-language distributional semantics, word embeddings and
pre-trained multilingual transformer models
- Extraction of parallel segments or paraphrases from comparable corpora
- Methods to derive parallel from non-parallel corpora (e.g. to provide
for low-resource languages in neural machine translation)
- Extraction of bilingual and multilingual translations of single words,
multi-word expressions, proper names, named entities, sentences,
paraphrases, etc. from comparable corpora
- Induction of morphological, grammatical, and translation rules from
comparable corpora
- Induction of multilingual word classes from comparable corpora
Comparable Corpora in the Humanities
- Comparing linguistic phenomena across languages in contrastive linguistics
- Analyzing properties of translated language in translation studies
- Studying language change over time in diachronic linguistics
- Assigning texts to authors via authors' corpora in forensic linguistics
- Comparing rhetorical features in discourse analysis
- Studying cultural differences in sociolinguistics
- Analyzing language universals in typological research
IMPORTANT DATES
28 Feb 2026: Paper Submission deadline
22 Mar 2026: Notification of acceptance
29 Mar 2026: Camera-ready final papers
14 Apr 2026: Workshop Programme final version
11 May 2026: Workshop date
All deadlines are 11:59PM UTC-12:00 (“anywhere on earth”).
For updates of the schedule, please see the workshop website.
PRACTICAL INFORMATION
The workshop is a hybrid event, both in-person and online. Workshop
registration is via the main conference registration site, see
https://lrec2026.info/
The workshop proceedings will be published in the ACL Anthology
(https://aclanthology.org/).
SUBMISSION GUIDELINES
Please follow the style sheet and templates (for LaTeX, Overleaf and
MS-Word) provided for the main conference at
https://lrec2026.info/authors-kit/
Papers should be submitted as a PDF file using the START conference
manager at https://softconf.com/lrec2026/BUCC2026/
Submissions must describe original and unpublished work and range from 4
to 8 pages plus unlimited references. Reviewing will be double blind, so
the papers should not reveal the authors' identity. Accepted papers will
be published in the workshop proceedings.
Double submission policy: Parallel submission to other meetings or
publications is possible, but the workshop organizers must be notified
by e-mail immediately upon submission to another venue.
For further information and updates, please see the BUCC 2026 web page
at https://comparable.lisn.upsaclay.fr/bucc2026/.
WORKSHOP ORGANIZERS
- Reinhard Rapp (University of Mainz, Germany)
- Ayla Rigouts Terryn (Université de Montréal, Mila, Canada)
- Serge Sharoff (University of Leeds, United Kingdom)
- Pierre Zweigenbaum (Université Paris-Saclay, CNRS, France)
Contact: reinhardrapp (at) gmx (dot) de
PROGRAMME COMMITTEE
- Ebrahim Ansari (Institute for Advanced Studies in Basic Sciences, Iran)
- Eleftherios Avramidis (DFKI, Germany)
- Gabriel Bernier-Colborne (National Research Council, Canada)
- Kenneth Church (VecML.com, USA)
- Patrick Drouin (Université de Montréal, Canada)
- Alex Fraser (Technical University of Munich, Germany)
- Natalia Grabar (CNRS, University of Lille, France)
- Amal Haddad Haddad (Universidad de Granada, Spain)
- Kyo Kageura (University of Tokyo, Japan)
- Natalie Kübler (Université Paris Cité, France)
- Philippe Langlais (Université de Montréal, Canada)
- Yves Lepage (Waseda University, Japan)
- Shervin Malmasi (Amazon, USA)
- Michael Mohler (Language Computer Corporation, USA)
- Emmanuel Morin (Nantes Université, France)
- Dragos Stefan Munteanu (RWS, USA)
- Preslav Nakov (Mohamed bin Zayed University of AI, United Arab Emirates)
- Ted Pedersen (University of Minnesota, Duluth, USA)
- Reinhard Rapp (University of Mainz, Germany)
- Ayla Rigouts Terryn (Université de Montréal & Mila, Canada)
- Nasredine Semmar (CEA LIST, Paris, France)
- Serge Sharoff (University of Leeds, UK)
- Richard Sproat (Sakana.ai, Tokyo, Japan)
- Marko Tadić (University of Zagreb, Croatia)
- François Yvon (CNRS & Sorbonne Université, France)
- Pierre Zweigenbaum (Université Paris-Saclay, CNRS, France)
INFORMATION ABOUT THE LRE 2026 MAP AND THE "SHARE YOUR LRs!" INITIATIVE
When submitting a paper from the START page, authors will be asked to
provide essential information about resources (in a broad sense, i.e.
also technologies, standards, evaluation kits, etc.) that have been used
for the work described in the paper or are a new result of the research.
Moreover, ELRA encourages all LREC authors to share the described LRs
(data, tools, services, etc.) to enable their reuse and replicability of
experiments (including evaluation ones).
Call for Participation
We are pleased to announce the PestCLEF2026 (https://www.imageclef.org/PestCLEF2026) shared task as part of the LifeCLEF evaluation lab (https://clef2026.clef-initiative.eu/labs/lifeclef/) at CLEF 2026 (https://clef2026.clef-initiative.eu/).
Task
-------------------------
PestCLEF2026 is a knowledge graph extraction task framed as a document-level relation extraction problem. Target relations reflect ecological interactions and events relevant to plant health monitoring, which involve entities such as Host, Pest, Disease, Vector, and Location.
Important Dates
-------------------------
- Now: Registration is open and training data is released
- 23 April 2026: Registration closes
- 7 May 2026: Competition Deadline
- 28 May 2026: Deadline for submission of working note papers at the CLEF Conference by participants (CEUR-WS proceedings)
- 30 June 2026: Notification of acceptance of working note papers
- 6 July 2026: Camera-ready deadline for working note papers
- 21–24 September 2026: CLEF 2026 Conference in Jena, Germany
Register and Participate
-------------------------
Registration: https://clef-labs-registration.dipintra.it/registrationForm.php
CLEF2026 Discord server (see #lifeclef): https://discord.gg/PEMh4a2YHV
PestCLEF2026 homepage (general info): https://www.imageclef.org/PestCLEF2026
PestCLEF2026 Kaggle page (to participate): https://www.kaggle.com/competitions/pest-clef-2026
Organizers
-------------------------
Robert Bossy (Paris-Saclay University, INRAE, France)
Claire Nédellec (Paris-Saclay University, INRAE, France)
Marine Courtin (Paris-Saclay University, INRAE, France)
Louise Deléger (Paris-Saclay University, INRAE, France)
Keep in touch via Kaggle or Discord. We are looking forward to your submission!
The PestCLEF2026 team
*** Last Call for Workshop Proposals ***
International Conference on Software and Systems Reuse, Product Lines,
and Configuration (VARIABILITY 2026)
29 September - 2 October 2026, 5* St. Raphael Resort and Marina
Limassol, Cyprus
https://conf.researchr.org/home/variability-2026
VARIABILITY is a new conference formed by merging three prominent conferences
focussing on software and systems variability, configuration and reuse: SPLC (the
International Systems and Software Product Line Conference, 29 successful editions,
ranked as a top conference), VaMoS (the International Working Conference on Variability
Modelling of Software-Intensive Systems, 19 successful editions), and ICSR (the
International Conference on Systems and Software Reuse, 22 successful editions).
We invite you to submit proposals for half-day or full-day workshops in any area related
to the field of Software and Systems Reuse, Product Lines, and Configuration, all of which
fall under the broader area of Variability. In particular, workshops on challenging,
emerging areas related to the conference topics are especially sought. We particularly
encourage workshop proposals for highly interactive and collaborative workshops, rather
than mini-conferences, e.g., apart from the traditional short and long papers, consider
allowing position papers with only one page (not included in the proceedings) and focus
on a lively discussion after the presentation, to foster new ideas and gather feedback
(rather than just defending the presented work). The workshops are expected to take
place on September 29th, 2026, before the main track of the conference.
Submissions / Publishing
VARIABILITY workshop papers will be published in a volume of the conference proceedings
published by Springer. Moreover, a one-page summary of each accepted workshop will be
published in the proceedings as well.
Workshop proposals should be authored by at least two organizers, preferably from
different institutions, and they should contain the following three sections and address
each corresponding point:
1. Organizers
• Name: organizers’ full names
• Contact information: affiliations, job titles, postal addresses, e-mail addresses, URLs,
and phone numbers
• Brief biography: 100-200 words, focusing on the organizers’ expertise in the field and
experience as workshop organizers
2. Workshop Content
• Title: workshop title and acronym
• Abstract: max 150 words describing the workshop (suitable for the conference’s website)
• Tentative Website URL
• Topics and motivation:
• What are the topics, themes, and areas of interest of the workshop?
• How is the workshop relevant to VARIABILITY?
• How does the workshop connect VARIABILITY to other research communities?
• Goals and expected results:
• Explicitly state the goals of the workshop and how you intend to reach them
• What are the expected results of the workshop?
• How will these results be disseminated?
• Format:
• What is the planned workshop format (paper presentations, working sessions, invited
talks (please note here that such talks are not financially supported by the conference),
lightning talks, demonstrations, etc.)?
• To avoid duplicated topics and cancellations, did you coordinate with or (plan to)
merge workshops on the same/similar topics from previous years (if there are any)?
• What will be done to stimulate collaborative interaction?
• What are the planned pre- and post-workshop activities?
• Participants:
• What is the expected number of submissions and participants? Provide a plan for
attracting sufficient submissions and promoting attendance
• If applicable, please provide information from previous or related workshops. Have
there been previous workshops on the same or a closely related topic? When, where
and with how many participants?
• Special room equipment (please note that the VARIABILITY conference and the workshops
are in-person events), such as flip charts, microphones, etc.
• Do you plan for a half-day or full-day workshop?
• Program Committee: list of tentative program committee members, names and
affiliations
3. Preliminary Call for Papers
This will necessarily repeat some of the information from the previous sections but should
be targeted towards prospective participants. It should address the following items:
• Overview of the motivation, topics, and goals
• Workshop format
• Deadlines of the workshop (see dates in this call for proposals)
• Submission guidelines and review process
• References to previous workshops (websites)
• Dissemination campaign to distribute the CFP
Submission Instructions
Please send your workshop proposals using EasyChair:
https://easychair.org/conferences?conf=variability2026
A workshop proposal must be at most 8 pages long. Submissions must follow the
Springer guidelines:
https://www.springer.com/gp/computer-science/lncs/conference-proceedings-gu…
Relevant supporting material, such as proceedings from previous editions of the proposed
workshop or other workshops organized by the proposal authors, should be included if
available but is not required for submission.
Acceptance Criteria
Each workshop proposal will be evaluated according to the relevance of its topic, the
expertise and experience of the workshop organizers, and the workshop’s potential for
attracting participants and generating useful results. We underline the importance of
active and creative workshops that foster a collaborative environment of interest to both
practitioners and researchers, aiming, e.g., to evolve the field of Variability and to identify
elements of joint future work. To obtain a balanced and cohesive workshop program, the
Organizing Committee will collaborate closely with workshop organizers and reserves the
right to circulate proposals to other submitters in view of possible workshop mergers. The
organizers of accepted workshops will be required to create and maintain a website in a
timely manner to serve as a workshop information center and to provide a repository for
documenting pre- and post-workshop activities.
At least one author of each accepted proposal must register and attend VARIABILITY 2026
in order for the workshop to be accepted and the summary of the workshop published.
The submission and review platform for workshop papers will be the one for the main
conference (i.e., all workshops will be set up as separate tracks under the same EasyChair
installation).
Important Dates (AoE)
• Workshop Proposals: 2 March 2026
• Notification of Acceptance: 16 March 2026
• Workshop Papers Submission: 15 June 2026
• Workshop Papers Notification: 7 July 2026
• Camera-Ready Version Submission: 14 July 2026
• Workshop Summary: 14 July 2026
• Author Registration: 14 July 2026
Organisation
General Chairs
• George A. Papadopoulos, University of Cyprus, Cyprus
• Gilles Perrouin, FNRS & University of Namur, Belgium
Research Track Chairs
• Thorsten Berger, Ruhr University Bochum, Germany
• Ina Schaefer, KIT, Germany
Industry Track Chairs
• Shaukat Ali, Simula Research Lab and Oslo Metropolitan University, Norway
• Martin Becker, Fraunhofer IESE, Germany
Journal First Track Chairs
• Mathieu Acher, University Rennes, Inria, CNRS, IRISA, France
• Xhevahire Tërnava, LTCI, Télécom Paris, Institut Polytechnique de Paris, France
Doctoral Symposium Track Chairs
• Rick Rabiser, LIT CPS, Johannes Kepler University Linz, Austria
• Iris Reinhartz-Berger, University of Haifa, Israel
Demos and Tools Track Chairs
• Sandra Greiner, University of Southern Denmark, Denmark
• Leopoldo Teixeira, Federal University of Pernambuco, Brazil
Projects Showcase Chairs
• Daniel Struber, Chalmers, University of Gothenburg, Sweden, and Radboud University, The Netherlands
• Dalila Tamzalit, Nantes Université, France
Hall of Fame Chairs
• Martin Becker, Fraunhofer IESE, Germany
• Goetz Botterweck, Lero - The Irish Software Research Centre and University of Limerick, Ireland
• Natsuko Noda, Shibaura Institute of Technology, Japan
Workshops Chairs
• Lidia Fuentes, Universidad de Malaga, Spain
• Malte Lochau, University of Siegen, Germany
Tutorials Chairs
• Loek Cleophas, Eindhoven University of Technology, The Netherlands, and Stellenbosch University, South Africa
• Mahsa Varshosaz, IT University of Copenhagen, Denmark
Proceedings Chair
• Sophie Fortz, King's College London, UK
Publicity Chairs
• Wesley Assunção, North Carolina State University, USA
• Kentaro Yoshimura, Hitachi Ltd, Japan
Local Organiser and Finance Chair
• George A. Papadopoulos, University of Cyprus, Cyprus
# CLEF 2026 - Call for Papers
CLEF 2026 Conference and Labs of the Evaluation Forum
Information Access Evaluation meets Multilinguality, Multimodality, and Visualization
21-24 September 2026, Jena, Germany
https://clef2026.clef-initiative.eu/calls/papers/
# Good to Know
The CLEF 2026 Conference welcomes papers in the Information Access domain that describe rigorous hypothesis testing regardless of whether the results are positive or negative. Each submission is reviewed in two stages, see details below.
# Aim and Scope
The CLEF Conference addresses all aspects of Information Access in any modality and language. CLEF consists of the presentation of research papers and a series of workshops presenting the results of lab-based comparative evaluation benchmarks.
CLEF 2026 is the 17th CLEF conference, continuing the popular CLEF campaigns that have run since 2000, contributing to the systematic evaluation of information access systems, primarily through experimentation on shared tasks.
The CLEF conference has a clear focus on experimental Information Access as carried out within evaluation forums (e.g., CLEF Labs, TREC, NTCIR, FIRE, MediaEval, RomIP, SemEval, and TAC) with special attention to the challenges of multimodality, multilinguality, and interactive search in different domains, also considering specific classes of users, such as children, students, or impaired users in different tasks (e.g., academic, professional, or everyday life).
We invite paper submissions on significant new insights demonstrated on information access test collections, on the analysis of test collections and evaluation measures, and on concrete proposals to push the boundaries of the Cranfield-style evaluation paradigm.
All submissions to the CLEF main conference will be reviewed on the basis of relevance, originality, importance, and clarity. CLEF welcomes papers that describe rigorous hypothesis testing regardless of whether the results are positive or negative. CLEF also welcomes past runs, results, data analyses, and new data collections. Methods are expected to be written so that they are reproducible by others, and the logic of the research design should be clearly described in the paper. Linking to additional resources, such as code or data repositories, is encouraged. The conference proceedings will be published in the Springer Lecture Notes in Computer Science (LNCS).
# Topics
Relevant topics for the CLEF 2026 Conference include, but are not limited to:
- Information access in any language or modality: information retrieval, question answering, recommender systems, image retrieval, search interfaces and design, infrastructures, etc.
- Interactive and conversational search evaluation: the interactive/conversational evaluation of retrieval-augmented generation systems, information retrieval systems using user-centered methods, evaluation of novel search interfaces, novel interactive/conversational evaluation methods, simulation of interaction/conversation, etc.
- Analytics for information access: theoretical and practical results in the analytics field specifically targeted at information access data analyses, data enrichment, etc.
- Reproducibility and replicability: analyses of past results/runs in depth.
- Fairness, Accountability, Transparency, Ethics, and Explainability (FATE) in information access.
- Language diversity in information access: work on low-resource languages.
- Models leveraging collaborative and social data, and their evaluation.
- User studies, either based on lab studies or crowdsourcing.
- Evaluation initiatives: conclusions, lessons learned, impact, and projections of any evaluation initiative upon completing its cycle.
- Evaluation: methodologies, metrics, statistical and analytical tools, component-based evaluation, user groups and use cases, ground-truth creation, impact of multilingual/multicultural/multimodal differences, etc.
- Technology transfer: economic impact/sustainability of information access approaches, deployment and exploitation of systems, use cases, etc.
- Specific application domains: information access and its evaluation in application domains such as cultural heritage, digital libraries, social media, health information, legal documents, patents, news, books, and in the form of text, audio, and/or image data.
- New data collection: presentation of new data collections with potential high impact on future research, specific collections from companies or labs, and multilingual collections.
- Reflections on past achievements and future research directions, roadmaps, outlooks for future developments, and lessons learned.
# Format
Authors are invited to electronically submit original papers, which have not been published and are not under consideration elsewhere, using the LNCS proceedings format (http://www.springer.com/it/computer-science/lncs/conference-proceedings-gui…)
Three categories of papers will be accepted:
- Long research papers: 12 pages plus references for complete research work
- Short research papers: 6 pages plus references for position/discussion papers, new evaluation proposals, developments and applications, etc.
- Past, Present, Future: Up to 12 pages for future directions and research roadmaps, reflections on past achievements, etc.
Papers should be anonymous. Sharing code and data with reviewers should be done via anonymous repositories (such as https://anonymous.4open.science/).
# Review Process for Research Papers
Research papers will be peer-reviewed by three members of the programme committee in two stages using a results-blind reviewing process. At the first stage, the members will review the paper's originality, clarity, technical & theoretical soundness, and methodology. At the second stage, the complete manuscripts that passed the first stage will be reviewed. At this stage, reviewers will also look at the reproducibility of the work. The final decision will not be based on whether results are positive or beat a baseline. Therefore, negative results and failed experiments are explicitly welcome.
Authors of long and short papers are asked to submit TWO versions of their manuscript:
1. Methodology version (restricted): This version does NOT report anything related to the results of the study. At this stage, the manuscripts will be evaluated based on the importance of the problem addressed and the soundness of the methodology. Manuscripts can include an introduction, a description of the proposed methodology, and datasets used. However, there should be no section on results and discussion. The authors should also remove any mentions of results from the included sections (e.g., the abstract and introduction).
2. Experimental version (complete): The complete manuscript that contains all the sections of the paper, including the experiments and results.
The submission deadline for both versions is 15 May 2026.
Authors of Past, Present, Future papers are asked to submit to the “Past, Present, Future” track.
The submission deadline for the Past, Present, Future track is 15 May 2026.
# Paper Submission
Papers should be submitted in PDF format at https://easychair.org/conferences/?conf=clef2026
- Submit the methodology/restricted version to the "Conference - Methodology Part" track.
- Submit the experimental/complete version to the "Conference - Experimental Part" track.
- Submit the Past, Present, Future papers to the "Conference - Past, Present, Future" track.
- Submit the best of CLEF 2025 Labs papers to the "Conference - Best of CLEF 2025 Labs" track.
# Best Paper Award
A Best Paper Award will be given to one outstanding paper accepted to the conference. This award, sponsored by Springer LNCS, includes a certificate and a 500 EUR prize.
# Organisation
Programme Chairs
- Philipp Schaer, Technische Hochschule Köln, Germany
- Eva Zangerle, University of Innsbruck, Austria
Lab Chairs
- Sean MacAvaney, University of Glasgow, United Kingdom
- Julia Maria Struß, Fachhochschule Potsdam University of Applied Sciences, Germany
General Chairs
- Matthias Hagen, Friedrich-Schiller-Universität Jena, Germany
- Martin Potthast, University of Kassel, hessian.AI, ScaDS.AI, Germany
- Benno Stein, Bauhaus-Universität Weimar, Germany