Following several requests, we have extended the submission deadline to the 30th of June 2025.
Submission Link: https://openreview.net/group?id=IWCS/2025/Workshop/CxGs_NLP
Please see details below:
We’re thrilled to see such strong interest in the second iteration of the CxG + NLP workshop, which will be held as part of IWCS. With three exciting keynote speakers confirmed (Prof Adele Goldberg, Prof Thomas Hoffmann, Prof Laura A. Michaelis), we’re looking forward to what promises to be a very engaging event.
The first workshop took place shortly after the release of ChatGPT. Now, two years on, the field has evolved dramatically with the rise of generative AI and the development of new large language models (LLMs). These developments make it all the more important to bring together researchers and practitioners to discuss the evolving landscape of CxG and NLP. In addition, since the first workshop, community interest in this intersection has grown significantly, and we believe this is the ideal moment for a second iteration in which we take stock of these recent developments.
We warmly invite your submissions to the workshop, and would like to remind you of the key dates:
30th of June 2025 (Extended) – Submission deadline
August 1 – Notification of acceptance, registration opens
August 22 – Camera-ready papers due
September 22–23 – IWCS main conference
September 24 – Workshop
September 25 – Community-building event
For more details, please visit the workshop website or get in touch with us: https://sites.google.com/view/2ndcxgsnlpworkshop/home
Bonn Talks on Research Trends in Applied Linguistics - Does AI language
processing align with human processing? (Prof. Scott Crossley,
Vanderbilt University, USA)
June 26, 12.15 pm - 1.45 pm CEST
Hybrid talk - Sign up under:
https://uni-bonn.zoom-x.de/meeting/register/SlGzaF2LTrux4HE06KmdvA
Abstract: This talk will provide an overview of the architecture that
underpins modern AI language models including n-gram language models,
word embedding models, and modern transformer models. These models will
be examined for alignment with theories of human language processing.
The talk will also focus on how AI models recreate classical language
processing pipelines associated with computational linguistics and
language processing.
Prof. Dr. Robert Fuchs | Head of Department and Professor of English
Linguistics | Department of English, American and Celtic Studies |
University of Bonn | Rabinstr. 8 53113 Bonn, Germany |
https://uni-bonn.academia.edu/RFuchs |
https://www.iaak.uni-bonn.de/bael/en/people/chair/prof-dr-robert-fuchs |
https://sites.google.com/view/rflinguistics/
*Recent publications:*
Coats, S., Basile, A., Morin, C. & Fuchs, R. (to appear). *The YouTube
Corpus of Singapore English Podcasts*. /English World-Wide/
Fuchs, R. et al. (to appear). *Non-standard morphosyntactic variation in
L2 English varieties world-wide: A corpus-based study
<https://www.sciencedirect.com/science/article/pii/S0024384125000737>*.
/Lingua/.
Fuchs, R., Wiltshire, C. & Sarmah, P. (to appear). *The role of English
in the linguistic ecology of Northeast India
<https://www.academia.edu/125365118/The_role_of_English_in_the_linguistic_ec…>*.
In P. Siemund, et al. (Eds.), /World Englishes in their Local
Multilingual Ecologies/. Amsterdam: Benjamins.
Lange, C., & Fuchs, R. (to appear). *English in India*. In R. Hickey &
K. Burridge (Eds.), /New Cambridge History of the English Language/.
Cambridge: CUP.
Fuchs, R. (2025). *Influencing people around the globe - The linguistic
expression of persuasion across varieties of English worldwide*
<https://www.academia.edu/107491904/Influencing_people_around_the_globe_The_…>.
In D. Dayter, & S. Rüdiger (Eds.), /Manipulation, Influence, and
Deception: The Changing Landscape of Persuasive Language/, 135-156.
Cambridge: CUP.
Bonn Talks on Research Trends in Applied Linguistics - Exploring the
learner lexicon through NLP Approaches (Prof. Scott Crossley, Vanderbilt
University, USA)
June 27, 2.15 pm – 3.45 pm CEST
Hybrid talk - Sign up under:
https://uni-bonn.zoom-x.de/meeting/register/nuoiB3N7Q7qx-ZNKwOm5Hw
Abstract: This talk and its subsequent workshop will explore lexical
properties in the English language and methods to automatically
calculate lexical features. The follow-up workshop will focus on
introducing natural language processing tools for lexical studies and
how they can be used to assess language learner data in a large corpus
collected in an English as a Foreign Language (EFL) setting. Data
analysis techniques and hands-on data exploration will provide practical
applications using learner corpora.
The UKP Lab at the Department of Computer Science, Technical University Darmstadt, Germany, is looking for a
*** fully funded researcher (PhD or Postdoc)***
for an interdisciplinary project on Agentic LLMs. The project’s goal is to support writing and grading complex documents in education and beyond. You will work at the intersection of Natural Language Processing and agentic AI reasoning and planning embedded in a real-life product-level user-facing platform.
🔗 More information:
https://www.informatik.tu-darmstadt.de/ukp/ukp_home/jobs_ukp/2025_phd_agent…
📩 Apply here:
https://careers.ukp.informatik.tu-darmstadt.de/ukprecruitment
📅 Application deadline: July 11th, 2025
--------------------------------------------------------------------
Prof. Dr. Iryna Gurevych
UKP Lab
Technical University Darmstadt, Germany
http://www.ukp.tu-darmstadt.de/
*apologies for cross-postings*
=== Workshop SIR ===
First Workshop on Semantics for Interdisciplinary Research
SIR@IWCS2025 - Düsseldorf - September 24, 2025
=================================
https://team.inria.fr/semagramme/first-workshop-on-semantics-for-interdisci…
https://openreview.net/group?id=inria.fr/INRIA/S%C3%A9magramme/2025/SIR01
=================================
In recent years, Natural Language Processing (NLP) has increasingly intersected with the humanities and social sciences, offering new methodologies for analyzing textual data, interpreting meaning, and modelling language-based phenomena. The potential for multi-disciplinary research using NLP methods is particularly great in computational semantics (CS), as its ability to process and represent meaning opens up innovative pathways for researchers in history, philosophy, literary studies, political science, etc. This workshop aims to explore how semantic models and tools can be leveraged to tackle traditional and emerging questions in the Humanities in a broader sense (Social Sciences, Law, Economics, Management, Literature, Languages, Art, …).
A major theme of SIR is the role of semantics in NLP applied to the humanities (both statistical and symbolic approaches).
=== Topics to Explore ===
• CS and the humanities: issues, tools and applications
• Quantitative and qualitative approaches as a breakthrough in the Humanities
• NLP transforming humanities issues
• Contributions and limitations for understanding meaning
• Links between formal semantics and neural models
• Ambiguity, polyphony and interpretation in the Humanities
• Ethics and bias in semantic modelling
• Interdisciplinary dialogue between AI, NLP and Humanities
=== Dates ===
• Deadline: July 14th (anywhere on earth)
• Notification: August 25th (anywhere on earth)
• Camera Ready: September 10th (anywhere on earth)
• Workshop: September 24th (anywhere on earth)
=== Submission Information ===
Papers should describe original research and must not exceed 4 pages (with an extra page in the camera ready version for accepted papers). Papers should be submitted no later than 14 July 2025 (anywhere on earth).
Accepted papers will be published in the conference proceedings in the ACL Anthology. For inclusion in the proceedings, at least one author must register for the conference and present the paper in person.
Submissions should be fully anonymous to ensure double-blind reviewing.
=== Submission ===
https://openreview.net/group?id=inria.fr/INRIA/S%C3%A9magramme/2025/SIR01
=== Style Files ===
The workshop follows the IWCS 2025 template; see the workshop web page.
=== Organizers ===
Maxime Amblard, Université de Lorraine
Ellen Breitholtz, Gothenburg University
=== Contact ===
maxime.amblard(a)univ-lorraine.fr and ellen.breitholtz(a)ling.gu.se
***Apologies for cross-posting ***
-------------------------------------------
CLEF 2026
Conference and Labs of the Evaluation Forum
Jena, Germany, September 21-24, 2026
https://clef2026.clef-initiative.eu/
-------------------------------------------
Call for Lab Proposals
Background
The CLEF Initiative <http://www.clef-initiative.eu/> is a self-organised
body whose main mission is to promote research, innovation, and
development of information access systems with an emphasis on
multilingual information in different modalities - including text and
multimedia - with various levels of structure. CLEF promotes research
and development by providing an infrastructure for:
1. Independent evaluation of information access systems
2. Investigation of the use of unstructured, semi-structured,
   highly-structured, and semantically enriched data in information access
3. Creation of reusable test collections for benchmarking
4. Exploration of new evaluation methodologies and innovative ways of
   using experimental data
5. Discussion of results, comparison of approaches, exchange of ideas,
   and transfer of knowledge
Scope of CLEF Labs
We invite submission of proposals for two types of labs:
1. "Campaign-style" Evaluation Labs for specific information access
   problems (during the twelve-month period preceding the conference),
   similar in nature to the traditional CLEF campaign "tracks". Topics
   covered by campaign-style labs can be inspired by any information
   access-related domain or task.
2. Labs that follow a more classical "workshop" pattern, exploring
   evaluation methodology, metrics, processes, etc. in information
   access and closely related fields, such as natural language
   processing, machine translation, and human-computer interaction.
We highly recommend organisers new to the CLEF format of shared task
evaluation campaigns to first consider organising a lab workshop to
discuss the format of their proposed task, the problem space and
practicalities of the shared task. The CLEF 2026 programme will reserve
about half of the conference schedule for lab sessions.
During the conference, the lab organisers will present their overall
results in overview presentations during the plenary scientific paper
sessions to give non-participants insights into where the research
frontiers are moving. Lab organisers are expected to organise separate
sessions for their lab with ample time for general discussion and
engagement with all participants - not just those presenting campaign
results and papers. Organisers should plan time in their sessions for
activities such as panels, demos, poster sessions, etc. as appropriate.
CLEF is always interested in receiving and facilitating innovative lab
proposals.
Potential task proposers unsure of the suitability of their task
proposal or its format for inclusion at CLEF are encouraged to contact
the CLEF 2026 Lab Organizing Committee Chairs to discuss its suitability
or design at an early stage.
Proposal Submission
Lab proposals must provide sufficient information to judge the
relevance, timeliness, scientific quality, benefits for the research
community, and the competence of the proposers to coordinate the lab.
Each lab proposal should identify one or more organisers as responsible
for ensuring the timely execution of the lab. Proposals should be 3 to 4
pages long and should provide the following information:
1. Title of the proposed lab.
2. A brief description of the lab topic and goals, its relevance to
   CLEF and the significance for the field.
3. A brief and clear statement on usage scenarios and domain to which
   the activity is intended to contribute, including the evaluation
   setup and metrics.
4. Details on the lab organiser(s), including identifying the task
   chair(s) responsible for ensuring the running of the task. This
   should include details of any previous involvement in organising or
   participating in evaluation tasks at CLEF or similar campaigns.
5. The planned format of the lab, i.e., campaign-style (“track”) or
   workshop.
6. Is the lab a continuation of an activity from previous year(s) or a
   new activity?
   1. For activities continued from previous year(s): Statistics from
      previous years (number of participants/runs for each task), a
      clear statement on why another edition is needed, an explicit
      listing of the changes proposed, and a discussion of lessons to
      be learned or insights to be made.
   2. For new activities: A statement on why a new evaluation campaign
      is needed and how the community would benefit from the activity.
7. Details of the expected target audience, i.e., who do you expect to
   participate in the task(s), and how do you propose to reach them.
8. Brief details of tasks to be carried out in the lab. The proposal
   should clearly motivate the need for each of the proposed tasks and
   provide evidence of its capability of attracting enough
   participation. The dataset which will be adopted by the Lab needs to
   be described and motivated in the perspective of the goals of the
   Labs; also indications on how the dataset will be shared are useful.
   It is fine for a lab to have a single task, but labs often contain
   multiple closely related tasks, needing a strong motivation for more
   than 3 tasks, to avoid useless fragmentation.
9. Expected length of the lab session at the conference: half-day, one
   day, two days. This should include high-level details of planned
   structure of the session, e.g. participant presentations, invited
   speaker(s), panels, etc., to justify the requested session length.
10. Arrangements for the organisation of the lab campaign: who will be
    responsible for activities within the task; how will data be
    acquired or created, what tools or methods will be used, e.g., how
    will necessary queries be created or relevance assessment carried
    out; any other information which is relevant to the conduct of your lab.
11. If the lab proposes to set up a steering committee to oversee and
    advise its activities, include names, addresses, and homepage links
    of people you propose to be involved.
Lab proposals must be submitted via EasyChair. The link will be
distributed once EasyChair is set up.
Review Process
Each proposal submitted by 14 July 2025 will be reviewed by the CLEF 2026
Lab Organising Committee. The acceptance decision will be sent by email
to the responsible organiser by 4 Aug 2025. The final length of the lab
session at the conference will be determined based on the overall
organisation of the conference and the number of participant submissions
received by a lab.
Advertising Labs at CLEF 2025 and ECIR 2026
Organisers of accepted labs are expected to advertise their labs at both
CLEF 2025 (September 9-12, 2025, Madrid, Spain) and ECIR 2026 (March 29
- April 2, 2026, Delft, Netherlands). So, at least one lab
representative should attend these events.
Advertising at CLEF 2025 will consist of displaying a poster describing
the new lab and advertising/announcing it during the closing session.
Advertising at ECIR 2026 will consist of submitting a lab description
(abstract submission deadline TBA by ECIR) to be included in ECIR 2026
proceedings and advertising the lab in a booster session during ECIR 2026.
Lab Proposals from Newcomers
If you have not organised a lab before, do not panic! The CLEF 2026 Lab
Organising Committee is willing to mentor you by offering help,
guidance, and feedback on the writing of your draft lab proposal.
If you are a newcomer interested in receiving guidance, please send an
e-mail with the following tag in the subject “[Mentorship CLEF 2026 Lab
Proposals]” to Sean.MacAvaney at glasgow.ac.uk and julia.struss at
fh-potsdam.de
We also encourage newcomers to refer to Friedberg et al. (2015)
<https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004…> for
initial guidance on preparing their proposal:
Friedberg I, Wass MN, Mooney SD, Radivojac P. Ten simple rules for a
community computational challenge. PLoS Comput Biol. 2015 Apr
23;11(4):e1004150.
Important Dates
* 14 July 2025: Hard deadline to submit proposal to EasyChair
* 4 August 2025: Notification of lab acceptance
* 9-12 September 2025: Advertising Accepted Labs at CLEF 2025, Madrid,
  Spain
* October 2025 (TBA by ECIR): Submission of short lab description for
  ECIR 2026
* April 2026: Advertising labs at ECIR 2026, Delft, Netherlands
* April-May: Lab evaluation cycle
* May-June: Review process of participant papers
* June 2026: Review of the condensed labs overviews
* July 2026: CEUR-WS Working Notes Preview for Checking by Authors and
  Lab Organisers
* 21-24 September, 2026: Labs at CLEF 2026
CLEF 2026 Lab Chairs
* Julia Maria Struß, Fachhochschule Potsdam University of Applied Sciences
* Sean MacAvaney, University of Glasgow
--
___________________________
Prof. Dr. Julia Maria Struß
Fachhochschule Potsdam
University of Applied Sciences
Fachbereich Informationswissenschaften
Kiepenheuerallee 5
14469 Potsdam
Telefon: +49 331 580 4532
Zoom: https://fh-potsdam.zoom-x.de/my/juliamstruss
10th Symposium on Corpus Approaches to Lexicogrammar (LxGr2025)
LxGr2025 will be held online on Friday 11 and Saturday 12 July 2025.
Symposium programme and registration (free): https://ehu.ac.uk/lxgr
If you have problems registering, or have any questions, please contact lxgr(a)edgehill.ac.uk.
*** First Call for Papers ***
The 25th International Conference on Autonomous Agents and Multiagent
Systems (AAMAS 2026)
May 25-29, 2026, 5* Coral Beach Hotel & Resort, Paphos, Cyprus
https://cyprusconferences.org/aamas2026/
We invite you to submit your best work in agents and multiagent systems to AAMAS 2026, the
25th International Conference on Autonomous Agents and Multiagent Systems, to be held in
Paphos, Cyprus in May 2026.
All submissions will be rigorously peer-reviewed and evaluated on the basis of the overall
quality of their technical contribution, taking into account criteria such as originality,
significance, soundness, reproducibility, clarity, relevance to the conference, quality of
presentation, as well as understanding and appropriate referencing of the state of the art. The
papers will be published under a CC BY license.
Important Dates
• Abstract submission: October 1, 2025
• Paper submission: October 8, 2025
• Rebuttal period: November 21-25, 2025
• Author notification: December 22, 2025
• Camera-ready paper: February 11, 2026
• Conference: May 25-29, 2026
All deadlines are at the end of the specified day, anywhere on Earth (UTC-12).
For submission instructions, please see here:
https://cyprusconferences.org/aamas2026/submission-instructions/
Areas of Interest
We welcome the submission of technical papers describing significant and original research on
all aspects of the theory and practice of autonomous agents and multiagent systems. If you are
new to this community, then we encourage you to consult the proceedings of previous editions
of the conference to fully appreciate the scope of AAMAS. At the time of submission, you will be
asked to associate your paper with one of the following areas of interest:
• Learning and Adaptation (LEARN)
• Generative and Agentic AI (GAAI)
• Game Theory and Economic Paradigms (GTEP)
• Coordination, Organizations, Institutions, Norms, and Ethics (COINE)
• Search, Optimization, Planning, and Scheduling (SOPS)
• Representation and Reasoning (RR)
• Engineering and Analysis of Multiagent Systems (EMAS)
• Modeling and Simulation of Societies (SIM)
• Human-Agent Interaction (HAI)
• Robotics and Control (ROBOT)
• Innovative Applications (IA)
More information on these areas and the topics covered can be found here:
https://cyprusconferences.org/aamas2026/call-for-papers-main-track/
Special Tracks
In addition to the main track, AAMAS 2026 will feature five special tracks (AAAI Track, JAAMAS
Track, Blue Sky Ideas Track, Demo Track, and Competitions Track), as well as the Doctoral
Consortium.
The AAAI Track welcomes AAAI-25 submissions rejected from the main AAAI track that are
relevant to the AAMAS research community and received no reject review recommendations (all
review scores are weak reject or above).
The JAAMAS Track offers authors of papers recently published in the Journal of Autonomous
Agents and Multiagent Systems (JAAMAS) that have not previously appeared as full papers in an
archival conference the opportunity to present their work at AAMAS 2026.
The focus of the Blue Sky Ideas Track is on visionary ideas, long-term challenges, new
research opportunities, and controversial debate.
The Demo Track allows participants from both academia and industry to showcase their latest
developments in agent-based and robotic systems.
The Competitions Track is an effective mechanism for motivating researchers to enhance
discussions, share knowledge, and boost the development and evaluation of theory and
practice of autonomous agents and multiagent systems.
Finally, AAMAS invites PhD students working in the research areas covered by AAMAS to take
part in the Doctoral Consortium (DC). The DC is an opportunity to interact closely with
established researchers in your field as well as other PhD students to receive feedback on your
work and to get advice on managing your career.
The calls for each track above and for the Doctoral Consortium are available on the AAMAS
2026 web site.
Organizing Committee
AAMAS 2026 General Chairs
• Viviana Mascardi, University of Genova, Italy
• John Thangarajah, RMIT University, Australia
AAMAS 2026 Program Chairs
• Chris Amato, Northeastern University, United States of America
• Louise Dennis, University of Manchester, United Kingdom
AAMAS 2026 Local Chairs
• George A. Papadopoulos, University of Cyprus, Cyprus (Chair)
• Panayiotis Kolios, University of Cyprus, Cyprus (Vice Chair)
If you have additional questions, please contact the Program Chairs using
aamas2026pcs(a)gmail.com.
Dear colleagues,
We are pleased to announce the SymGenAI4Sci Workshop on Symbolic and Generative AI for Science, taking place as part of SEMANtiCS 2025 in Vienna, Austria, from September 3–5, 2025 (hybrid format).
Workshop Theme
SymGenAI4Sci brings together researchers working at the intersection of generative AI and symbolic methods to advance scientific reasoning, experimentation, and knowledge structuring. The goal is to explore hybrid approaches that combine the flexibility of generative models with the precision and explainability of symbolic AI in scientific applications.
Topics of Interest include (but are not limited to):
- Generative AI tailored to scientific domains (e.g., text, tables, workflows)
- Integration of symbolic reasoning with deep learning
- Ontologies, schema induction, and structured knowledge generation
- Human-in-the-loop and agentic AI for scientific research
- Evaluation frameworks for scientific reliability and factuality
- Applications in scientific discovery, data curation, or experimentation
Important Dates
Submission deadline: July 20, 2025
Workshop date: September 3–5, 2025
Location: Vienna, Austria & Online
More information and submission guidelines: https://sga4s.semantic.foundation/
We warmly invite researchers from NLP, AI, knowledge representation, and the sciences to submit their work and join the conversation on developing more grounded, explainable, and scientifically useful AI systems.
Best regards,
The SymGenAI4Sci 2025 Organizing Committee
https://sga4s.semantic.foundation
7th Workshop on Natural Legal Language Processing (NLLP 2025)
8 November 2025, Suzhou, China (collocated with EMNLP 2025)
Website: http://nllpw.org/workshop
Twitter: @nllpworkshop
Bluesky: https://bsky.app/profile/nllpworkshop.bsky.social
Contact: nllp.chairs(a)gmail.com
= Important Dates =
Submission deadline ― 26 August 2025
Submission of EMNLP papers with reviews and ARR commitment ― 2 September 2025
Notification for direct submissions, ARR and EMNLP papers ― 30 September 2025
Camera ready due ― 7 October 2025
Workshop ― 8 November 2025
All deadlines are 11:59 pm UTC-12.
Submission website: https://openreview.net/group?id=EMNLP/2025/Workshop/NLLP
For the full text: https://nllpw.org/workshop/call/
= Goal =
Following the success of the first six editions of the NLLP workshop (EMNLP 2021 - 2024, KDD 2020, NAACL 2019), the workshop aims to bring together researchers and practitioners working on NLP, LLMs and other AI fields with legal practitioners and researchers. We welcome submissions describing original work on legal data, as well as data with legal relevance.
= Topics =
Applications of NLP methods to tasks in the legal domain including, but not limited to:
• Case outcome analysis and prediction
• Summarization and analysis of long-form and complex legal documents
• Information extraction
• Contract drafting
• Chatbots and assistants for legal or negotiation support
• Legal analysis and commentary
• Legal argumentation analysis
• Legal reasoning
• Information retrieval and question-answering (incl. retrieval-augmented generation)
• Detection and mitigation of legal misinformation
• Copyright and intellectual property law applications, incl. infringement detection, licensing compliance, generative content auditing
• Agentic applications for conducting tasks in the legal domain
Methods for applying Large Language Models (LLMs) to the legal domain including, but not limited to:
• Adaptation of LLMs to the legal domain
• Prompt engineering and prompt chaining
• Composite methods using symbolic or rule-based reasoning
• Groundedness and attributability of generations
• Privacy and bias risks in legal LLM applications
• Copyright compliance, dataset provenance and transparency and fair use analysis in LLM training and usage
Methodological innovations for legal tasks including, but not limited to:
• Classification
• Summarization and generation
• Information extraction incl. entity recognition, disambiguation, event extraction, query understanding, anonymization, data extraction, knowledge base population
• Question answering incl. retrieval-augmented generation
• Information retrieval incl. sparse, dense or hybrid approaches
• Multi-modal document parsing incl. using structured, semi-structured and metadata (e.g. tables, charts, images)
• Clustering, clause similarity and topic modeling
• Link and citation prediction
• Causal inference and counterfactual reasoning for legal decision-making
• Conversational agents incl. conversational question answering, contract analysis and review, negotiation support agents or multi-agent coordination
• Planning and reasoning
Tasks, Resources and Evaluation for NLP in the Legal domain:
• Description of new tasks for NLP in the legal domain e.g. legal argument reasoning, legal QA attribution
• Task overviews and survey papers that identify current research gaps
• Dataset development for LLM benchmarking for legal applications
• Publicly available datasets curated and annotated by legal experts
• Methods for automatic evaluation of LLM performance on legal domains
NLP for Online Platforms, Social Media and Regulations:
• Detection and moderation of illegal content (e.g. harassment, defamation)
• NLP for platform compliance under regulatory regimes (e.g. Digital Services Act, AI Act)
• Legal transparency tooling for platform decisions (e.g. Statement of Reasons analysis)
• Misinformation and disinformation detection with legal implications
• Online dispute resolution, appeals and access to justice via social platforms
• Legal evidence mining from user-generated content and public discourse
• Legal implications of chatbots and agents operating in or for social media platforms
• NLP-aided analysis of Terms of Service and platform policies
Systems, Demos and Industry Applications
• System descriptions of real-world legal NLP systems
• Industry applications in legal tech or compliance
• NLP systems for legal professionals such as E-Discovery, contract review and risk assessment
• Open or proprietary NLP tools for citizens, lawyers, courts, or regulators
Interdisciplinary position papers on topics including, but not limited to:
• Legal or socio-legal analyses relating to the role of NLP in the legal domain
• Ethical, legal and regulatory aspects of data collection and LLM use in the legal domain
• Critical reflections on the benefits and challenges of using NLP technologies in the legal domain
• The role of NLP in Access to Justice and Digital Legal Empowerment
• The role of NLP in platform governance and content moderation including legal, regulatory and ethical aspects of automated moderation, accountability under emerging platform regulations (e.g., DSA, DMA, AI Act) and impacts on freedom of expression and access to justice
• Legal and ethical challenges of NLP in the context of copyright and IP
= Submissions =
We accept papers reporting original (unpublished) research of two types:
• Long papers (max 8 pages of content)
• Short papers (max 4 pages of content)
Appendices, references, optional limitations section, optional ethics section and acknowledgements do not count against the maximum page limit and should be formatted according to the guidelines below.
To submit a paper, please access the submission link: https://openreview.net/group?id=EMNLP/2025/Workshop/NLLP
Conference proceedings will be published on the ACL Anthology.
= Ethics Section =
The NLLP workshop adheres to the same standards regarding ethics as the EMNLP 2025 conference. Authors will be allowed extra space after the 8th page (4th for short papers) for an optional broader impact statement or other discussion of ethics. Note that an ethical considerations section is not required, but papers working with sensitive data or on sensitive tasks that do not discuss these issues will not be accepted.
= Non-archival Option =
The authors have the option of submitting previously unpublished research as non-archival, meaning that only the abstract will be published in the conference proceedings. We expect these submissions to describe the same quality of work as archival submissions. These will be reviewed following the same procedure as archival submissions. This option accommodates publication of the work or a superset at a later date in a conference or journal which does not allow previously archived work, and encourages presentation of and feedback on mature, yet unpublished work. Non-archival submissions should adhere to the same formatting and length constraints as archival submissions.
= Dual Submission and Preprint Policy =
Papers that are under consideration at other workshops, conferences or journals during the review period must explicitly indicate so at submission time. Authors of papers accepted for presentation at the NLLP 2025 workshop must notify the organizers by the camera-ready deadline as to whether the paper will be published or withdrawn.
There is no anonymity period or limitation on posting or discussing non-anonymous preprints while the work is under peer review. However, if the preliminary version of a paper was posted on arXiv, the paper should *not* have a self-reference to it in the submission.
= ACL Rolling Review Submissions =
Our workshop also welcomes submissions from ACL Rolling Review (ARR). Authors of any papers that have been submitted to ARR and have their meta-review ready may submit their papers and reviews for consideration for the workshop until 2 September 2025. This includes submissions to ARR for the May deadline. The publication decision will be announced by 7 October 2025. The commitment should be made via the workshop submission website: https://openreview.net/group?id=EMNLP/2025/Workshop/NLLP_ARR_Commitment
= EMNLP 2025 Submissions =
Authors of any papers that have been reviewed for EMNLP 2025 and were rejected have the opportunity to send their paper and reviews to be considered for publication in the NLLP workshop proceedings as long as the topics are relevant to those described in this call for papers.
The deadline for submitting papers and reviews is 2 September 2025. The publication decision will be announced by 7 October 2025. Submissions should be made via the workshop submission website: https://openreview.net/group?id=EMNLP/2025/Workshop/NLLP_ARR_Commitment
= Double-Blind Reviewing =
The review process is double-blind, and submissions should follow the ARR guidelines on ensuring two-way anonymized review. Papers that violate these requirements will be desk rejected.
= Submission Style & Format Guidelines =
Paper submissions must use the official ACL style templates (LaTeX and Word). Please follow the paper formatting guidelines general to “*ACL” conferences. Authors may not modify these style files or use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review.
= Presentation =
The presentation format for each paper and the schedule will be announced between acceptance notification and the camera-ready deadline. At least one author of each accepted paper must register for the NLLP 2025 workshop by the registration deadline in order for the submission to be published in the proceedings.
= Organizing Committee =
Nikolaos Aletras - University of Sheffield
Leslie Barrett - Bloomberg
Ilias Chalkidis - University of Copenhagen
Catalina Goanta - Utrecht University
Daniel Preotiuc-Pietro - Bloomberg
Gerasimos (Jerry) Spanakis - Maastricht University
BCS Search Industry Awards 2025
We are delighted to announce this year's Search Industry Awards, celebrating the best search innovations of 2025. Presented by the Information Retrieval Specialist Group of the BCS <https://www.bcs.org/membership-and-registrations/member-communities/informa…>, these awards recognize people, projects, and organisations around the world that have excelled in the design of search and information retrieval products and services. If you know of any people, projects, or products that deserve recognition, let us know by submitting a nomination. Alternatively, if you're involved with something special yourself, you can submit an application <https://docs.google.com/forms/d/e/1FAIpQLSfxTx0oN3xCRcy1rgktug-k4e8kmVvvLQL…> today.
Categories
This year we are offering four awards:
Best Search Project recognises the most impactful implementation of search technology or methodology in solving a specific problem or need. Previous winners include:
Datafari Enterprise Search <https://www.datafari.com/en/index.html>, an open-source end-to-end solution covering the needs of enterprise search scenarios
Wikiframe Visual Graph <https://wikiframe.library.unlv.edu/>, a search capability for Special Collections data stored on Wikidata
CiteSeerX <https://citeseerx.ist.psu.edu/>, one of the largest open-source academic search engines with over 10 million documents
Search Professional of the Year is made to an individual who has made a significant contribution through their work and professionalism. Previous winners include:
Jayaprakash Sundararaj <https://www.linkedin.com/in/osjayaprakash/>, Lead Engineer at Google
Amey Porobo Dharwadker <https://ameydhar.com/>, Machine Learning Tech Lead Manager at Meta
Adam Tocock <https://www.whittington.nhs.uk/mini-apps/staff/profile/?id=2478>, Library Assistant at NHS
Most Promising Start-up (or New Enterprise) recognises the innovative and disruptive potential of a business model, technology, or solution. Previous winners include:
deepset.ai <http://deepset.ai/>, a leader in framework and platform technology that accelerates AI application development with large language models (LLMs); and the creator of the Haystack open-source framework
batteryincluded.ai <http://batteryincluded.ai/>, First BI Product Discovery Framework incl. 3 pillars for highest relevance within global product listings
Giotto AI <https://www.giotto.ai/>, an all-in-one platform to automatize, digitalize, and standardize the data collection, analysis and writing of a Clinical Evaluation Report
Best Presentation at Search Solutions. Previous winners include:
Taketomo Isazawa, Microsoft Research: “Beyond RAG: Integrating Knowledge with LLMs”
Charlie Hull, OSC: “Pragmatic AI-powered Search – Keeping it Simple, not Stupid”
Filip Radlinski, Google: “Challenges with Really Understanding Natural Language in Conversational Recommendation”
The last award is open only to presenters at Search Solutions, and will be judged on the day of the event. For all others, apply today <https://docs.google.com/forms/d/e/1FAIpQLSfxTx0oN3xCRcy1rgktug-k4e8kmVvvLQL…>!
Judging Panel
Winners will be selected by our panel of judges (details to be announced shortly).
Awards Ceremony
The awards ceremony will take place during Search Solutions 2025 <https://www.bcs.org/membership-and-registrations/member-communities/informa…>.
Apply
We’ve designed the application process <https://docs.google.com/forms/d/e/1FAIpQLSfxTx0oN3xCRcy1rgktug-k4e8kmVvvLQL…> to be simple to complete. If you are unsure which category to apply for, or have questions about the application process, contact us via the address below. For further details, see: <https://www.bcs.org/membership-and-registrations/member-communities/informa…>
Nominations will remain open until 31st October 2025.
Contact
If you have any questions on the above, please contact the Awards Chair at udo.kruschwitz(a)ur.de <mailto:udo.kruschwitz@ur.de> with a copy to the IRSG Events Organiser at tgr2uk+irsg(a)gmail.com <mailto:tgr2uk+irsg@gmail.com>
About IRSG
The IRSG is a Specialist Group of BCS <https://www.bcs.org/>. Its mission is to provide a focus for the European IR community, facilitate communication between researchers and practitioners and promote the adoption of IR research within industry. We host a major European conference (ECIR) and provide an associated programme of workshops, seminars and events. The IRSG is free to join via the BCS website, which provides access to further IR articles, events and resources.
BCS is the industry body for IT professionals. With members in over 100 countries around the world, BCS is the leading professional and learned society in the field of computers and information systems.
**Social Media Access Days**
Social media data between research and infrastructure – sustainable archiving, indexing and access
Conference topics
Online platforms, especially social media platforms, are both research objects and sources of data for a variety of research approaches in the humanities and social sciences, computer science, and the natural and life sciences. The historical evolution of social media makes them a part of our digital cultural heritage. However, the processes used by institutions to archive and document social media data are still only rudimentary, not least because of their economic, social and aesthetic characteristics and the unique attributes of media technology. Researchers, research institutions and cultural heritage institutions therefore face a wide range of problems in terms of their archiving, indexing and use. Researchers who wish to work with social media data also encounter numerous new challenges, especially when fundamental changes such as the elimination of application programming interfaces (APIs) impact access to specific data from online platforms.
The archiving, indexing and use of dynamic data from social media are therefore fraught with problems which researchers, research institutions, libraries and archives have to tackle in a consistent manner. Ideally, solutions to these problems should be developed cooperatively, since this requires extensive effort which would be beyond the scope of a single data community or discipline.
The aim of the conference is therefore to enable libraries, archives, infrastructural facilities, research institutions and researchers to network and exchange experiences with archiving and the sustainable use of data and digital objects from social media. We explicitly welcome case studies and presentations on solutions and their practical implementation as well as reports on research findings.
We are particularly interested in contributions on the following key topics:
- Sustainable infrastructure for collecting and providing access to social media content
- Interaction between researchers and archiving institutions
- Ethical issues and best practices
- Legal issues and solutions
- Challenges posed by restrictive data access from social media platforms
- Experiences with data access in the context of the Digital Services Act
- Status and preservation of social media from an archival and cultural-historical perspective, e.g. posts, interactions, platform elements
- Consolidation of collections, corpora, holdings
- Metadata, data documentation and indexing social media data
- Use of AI & LLMs for data documentation and indexing purposes
- Initiatives focusing on archiving and access
- Concepts for the provision and use of derivatives (aggregated or derivative formats) from social media
- Experiences with the reusability of available data
Date
The conference will take place from 17 to 19 March 2026 at the German National Library in Frankfurt am Main. The main conference language is German. However, contributions can also be submitted in English. We aim to schedule all English-language presentations on the same day, if possible.
Submissions
We look forward to receiving your submissions for presentations and your proposals for tutorials, workshops or interactive formats.
Presentations / posters: Please submit your proposals in the form of abstracts containing a maximum of 500 words (plus bibliographies and max. 1 illustration). Contributions can be based on research findings or personal experience and may be presented in German or English. The programme committee will decide which contributions to accept as oral presentations and which as posters.
Further formats: Proposals for tutorials, workshops, themed sessions and other interactive formats should not exceed two pages and should contain the following information: proposed format and realisation, language, target group (potential number of participants), motivation and goals. In addition, please tell us whether you require special technical equipment or facilities. We will then determine how these can be provided on site.
Please send your submissions as a PDF document to: twarchiv(a)dnb.de
Timeline
Deadline for submitting abstracts: 31 October 2025
Response by: 30 November 2025
Conference: 17-19 March 2026
Participation
The conference will take place in person at the German National Library in Frankfurt am Main. There are currently no plans to stream the event online. The participation fee is approx. 50 Euros.
Speakers who do not have their own travel funds may apply for up to 300 euros to cover travel and accommodation expenses (per presentation for max. one person).
Organisation
German National Library Frankfurt am Main
Dr Britta Woldering
twarchiv(a)dnb.de
Program Committee
Stefan Dietze (GESIS and HHU Düsseldorf)
Dimitar Dimitrov (GESIS)
Philippe Genêt (German National Library)
Tatjana Scheffler (Ruhr University Bochum)
Claus-Michael Schlesinger (UB der HU Berlin)
Katrin Weller (GESIS and HHU Düsseldorf)
Britta Woldering (German National Library)
Cooperation partners
BERD@NFDI
German National Library
GESIS – Leibniz Institute for the Social Sciences
NFDI4DataScience
Text+
Conference website: https://www.dnb.de/EN/smad
---
Tatjana Scheffler (she/her)
GB 5/157
Ruhr-Universität Bochum
Digital Forensic Linguistics
Fakultät für Philologie, Germanistisches Institut
Universitätsstraße 150
44780 Bochum
Germany
Mail: tatjana.scheffler(a)rub.de
Web: http://staff.germanistik.rub.de/digitale-forensische-linguistik/
Mastodon: https://fediscience.org/@tschfflr
Tel.: +49 234 32-21471
Dear colleagues,
We are pleased to announce the first call for papers of the
*1st Workshop on Multilingual Data Quality Signals at COLM 2025*
Important information:
🗓️ CfP Deadline Extended to: July 3, Workshop: October 10
📍 Montréal, Canada
🌐 https://wmdqs.org
Scope
Recent research has shown that large language models (LLMs) not only need large quantities of data, but also need data of sufficient quality. Ensuring data quality is even more important in a multilingual setting, where the amount of acceptable training data in many languages is limited. Indeed, for many languages even the fundamental step of language identification remains a challenge, leading to unreliable language labels and thus noisy datasets for underserved languages.
In response to these challenges, we will be holding the first Workshop on Multilingual Data Quality Signals (WMDQS) in tandem with COLM. We invite the submission of long and short research papers related to data quality in multilingual data.
Even though most previous work on data quality has been targeted at LLM development, we believe that research in this area can also benefit other research communities in areas such as web search, web archiving, corpus linguistics, digital humanities, political sciences and beyond. We therefore encourage submissions from a wide range of disciplines.
WMDQS will also include a shared task on language identification for web text. We invite participants to submit novel systems which address current problems with language identification for web text. We will provide a training set of annotated documents sourced from Common Crawl to aid development.
Topics
We welcome submissions of (1) original research papers, (2) review/opinion papers, (3) online systems on the topics listed below, and (4) extended abstracts. We especially welcome work-in-progress projects and all novel ideas covering research in multilinguality, underserved/low-resource languages, under-represented linguistic communities and all types of work covering data quality signals. Suggested areas include:
- Data pipelines for data annotation and data filtering
- Undesirable content detection in a multilingual setting
- Multilingual or language independent content ranking
- Human annotation platforms and systems
- Multilingual tokenization mechanisms
- Small language models and embeddings
- Linguistic studies in underserved languages
- Corpus creation and curation methods, especially for underserved languages
- Machine translation
- Digital humanities
- Historical and constructed languages
Shared task
The lack of training data, especially high-quality data, is the root cause of poor language model performance for many languages. One obstacle to improving the quantity and quality of available text data is language identification (LangID or LID). LangID remains far from solved for many languages. Several of the commonly used LangID models were introduced in 2017 (e.g. fastText and CLD3). The aim of this shared task is to encourage innovation in open-source language identification and improve accuracy on a broad range of languages.
All accepted authors will be invited to contribute a larger paper, which will be submitted to a high-impact NLP venue.
Important dates for the Workshop:
Workshop paper submission deadline (extended): July 3, 2025
Workshop paper acceptance notification: July 24, 2025
Workshop: October 10, 2025
Important dates for the Shared Task:
1st Deadline to contribute annotations: July 7, 2025
1st Annotations released (train split): July 14, 2025
Abstract Deadline: July 21, 2025
Decision Notification: July 24, 2025
Camera Ready Deadline: September 21, 2025
(All deadlines are 23:59 AoE.)
Organizers:
For any questions, please drop a mail to wmdqs-pcs(a)googlegroups.com
Program Chairs:
Pedro Ortiz Suarez (Common Crawl Foundation)
Sarah Luger (MLCommons)
Laurie Burchell (Common Crawl Foundation)
Kenton Murray (Johns Hopkins University)
Catherine Arnett (EleutherAI)
Organizing Committee:
Thom Vaughan (Common Crawl Foundation)
Sara Hincapié (Factored)
Rafael Mosquera (MLCommons)
We are pleased to announce MAHED 2025, the first multimodal shared task dedicated to Hope and Hate Detection in Arabic content. This novel multimodal challenge will be co-located with EMNLP 2025 at the ArabicNLP 2025 Conference.
MAHED 2025 addresses critical real-world challenges in Arabic natural language processing by focusing on the detection of hate speech, hope speech, and emotions in both Arabic text and memes. This shared task aims to advance research in ethical AI while addressing the linguistic diversity and dialectal variations inherent in Arabic content.
The shared task comprises three subtasks:
Task 1: Text-based Hope & Hate Speech Classification
Participants will develop models to classify Arabic text as containing hope speech, hate speech, or neutral content.
Task 2: Multitask Learning for Emotion, Offensive Content, and Hate Detection
This task involves simultaneous detection of emotions, offensive language, and hate speech in Arabic text.
Task 3: Multimodal Hateful Meme Detection
Participants will work with Arabic memes to detect hateful content using both textual and visual modalities.
Registration Links:
* Task 1: https://www.codabench.org/competitions/9136/
* Task 2: https://www.codabench.org/competitions/9166/
* Task 3: https://www.codabench.org/competitions/9192/
Important Dates:
* June 10, 2025: Training data and evaluation scripts released
* July 20, 2025: Final registration deadline and test set release
* July 25, 2025: Test submission deadline
* November 5-9, 2025: ArabicNLP 2025 Workshop at EMNLP 2025, Suzhou, China
Resources and Registration:
Website: https://marsadlab.github.io/mahed2025/
Dataset and Code: https://github.com/marsadlab/MAHED2025Dataset
*** Last Call for Papers ***
The 16th IEEE International Conference on Knowledge Graphs (ICKG 2025)
November 13-14, 2025, 5* St. Raphael Resort and Marina, Limassol, Cyprus
https://cyprusconferences.org/ickg2025/
(*** Proceedings to be published by IEEE ***)
(*** Submission Deadline: July 4, 2025 AoE (extended and firm!) ***)
The annual IEEE International Conference on Knowledge Graph (ICKG) provides a premier
international forum for presentation of original research results in knowledge discovery and
graph learning, discussion of opportunities and challenges, as well as exchange and
dissemination of innovative, practical development experiences. The conference covers all
aspects of knowledge discovery from data, with a strong focus on graph learning and
knowledge graphs, including algorithms, software, and platforms. ICKG 2025 intends to draw
researchers and application developers from a wide range of areas such as knowledge
engineering, representation learning, big data analytics, statistics, machine learning, pattern
recognition, data mining, knowledge visualization, high performance computing, and the World
Wide Web, promoting novel, high-quality research findings and innovative solutions that
address challenges in handling all aspects of learning from data with dependency relationships.
All accepted papers will be published in the conference proceedings by the IEEE Computer
Society. Awards, including Best Paper, Best Paper Runner-up, Best Student Paper, and Best Student
Paper Runner-up, will be conferred at the conference, with a check and a certificate for each
award. The conference also features a survey track to accept survey papers reviewing recent
studies in all aspects of knowledge discovery and graph learning. At least five high quality
papers will be invited for a special issue of the Knowledge and Information Systems Journal,
in an expanded and revised form. In addition, at least eight quality papers will be invited for a
special issue of Data Intelligence Journal in an expanded and revised form with at least 30%
difference.
TOPICS OF INTEREST
Topics of interest include, but are not limited to:
• Foundations, algorithms, models, and theory of knowledge discovery and graph learning
• Knowledge engineering with big data.
• Machine learning, data mining, and statistical methods for data science and engineering.
• Acquisition, representation and evolution of fragmented knowledge.
• Fragmented knowledge modeling and online learning.
• Knowledge graphs and knowledge maps.
• Graph learning security, privacy, fairness, and trust.
• Interpretation, rule, and relationship discovery in graph learning.
• Geospatial and temporal knowledge discovery and graph learning.
• Ontologies and reasoning.
• Topology and fusion on fragmented knowledge.
• Visualization, personalization, and recommendation of Knowledge Graph navigation and
interaction.
• Knowledge Graph systems and platforms, and their efficiency, scalability, and privacy.
• Applications and services of knowledge discovery and graph learning in all domains
including web, medicine, education, healthcare, and business.
• Big knowledge systems and applications.
• Crowdsourcing, deep learning and edge computing for graph mining.
• Large language models and applications
• Open source platforms and systems supporting knowledge and graph learning.
• Datasets and benchmarks for graphs
• Neurosymbolic & Hybrid AI systems
• Graph Retrieval Augmented Generation
SURVEY TRACK
Survey papers reviewing recent studies in key aspects of knowledge discovery and graph learning.
In addition to the above topics, authors can also select and target the following Special Track
topics.
Each special track is handled by respective special track chairs, and the papers are also
included in the conference proceedings.
• Special Track 01: KGC and Knowledge Graph Building
• Special Track 02: KR and KG Reasoning
• Special Track 03: KG and Large Language Model
• Special Track 04: GNN and Graph Learning
• Special Track 05: QA and Graph Database
• Special Track 06: KG and Multi-modal Learning
• Special Track 07: KG and Knowledge Fusion
• Special Track 08: Industry and Applications
SUBMISSION GUIDELINES
Paper submissions should be no longer than 8 pages, in the IEEE 2-column format, including
the bibliography and any possible appendices. Submissions longer than 8 pages will be
rejected without review. All submissions will be reviewed by the Program Committee based on
technical quality, originality, significance, and clarity. For survey track papers, please preface the
descriptive paper title with “Survey:”, followed by the actual paper title. For example, a paper
entitled “A Literature Review of Streaming Knowledge Graph” should be changed to “Survey: A
Literature Review of Streaming Knowledge Graph”. This allows the reviewers and chairs to clearly
bid on and handle the papers. Once the paper is accepted, the prefix “Survey:” can be
removed from the camera-ready copy.
For special track papers, please preface the descriptive paper title with “SS##:”, where “##” is
the two-digit special track ID. For example, a paper entitled “Incremental Knowledge Graph
Learning”, intended to target Special Track 01 (KGC and Knowledge Graph Building), should
be changed to “SS01: Incremental Knowledge Graph Learning”.
All manuscripts are submitted as full papers and are reviewed based on their scientific merit.
The reviewing process is single blind, meaning that each submission should list all authors and
affiliations. There is no separate abstract submission step. There are no separate industrial,
application, or poster tracks. Manuscripts must be submitted electronically in the online
submission system. No email submission is accepted. To help ensure correct formatting, please
use the style files for U.S. Letter as the template for your submission. These include LaTeX and
Word.
SUBMISSION LINK
https://wi-lab.com/cyberchair/2025/ickg25/
IMPORTANT DATES
• Paper submission (abstract and full paper): July 4, 2025 (AoE) (extended and firm!)
• Notification of acceptance/rejection: September 5, 2025
• Camera-ready, copyright forms and author registration: September 20, 2025
• Early (non-author) registration: October 10, 2025
• Conference dates: November 13-14, 2025
ORGANISATION
Conference and Local Organising Chair
• George A. Papadopoulos, University of Cyprus
Conference Co-Chair
• Dan Guo, Hefei University of Technology
Program Chairs
• Cesare Alippi, Università della Svizzera italiana
• Shirui Pan, Griffith University
Local Organising Vice Chair
• Irene Kinlanioti, National Technical University of Athens
Finance Chair
• Constantinos Pattichis, University of Cyprus
Steering Committee Chair
• Xindong Wu, Hefei University of Technology
*** NARNiHS 2026
*** North American Research Network in Historical Sociolinguistics
*** Eighth Annual Meeting
*** 100% IN PERSON
*** Co-Located with the Linguistic Society of America (LSA) Annual Meeting
*** New Orleans, Louisiana USA
*** 8-11 January 2026
This event offers an opportunity for historical sociolinguistics scholars from all over the world to gather and share leading research. We encourage our fellow historical sociolinguists and scholars in related fields from our global scholarly community to **join us in New Orleans** for our Eighth Annual Meeting.
Consult this Call for Abstracts on the web: https://narnihs.org/?page_id=3135 .
--------------- Call for Abstracts ---------------.
Abstract submission online:
https://easyabs.linguistlist.org/conference/NARNiHS_26/ .
Deadline: Friday, 15 August 2025, 11:59 PM US Eastern Time.
Late abstracts will not be considered.
The North American Research Network in Historical Sociolinguistics (NARNiHS) is accepting abstracts for its Eighth Annual Meeting in New Orleans, Thursday, January 8 -- Sunday, January 11, 2026. The 8th edition of this inclusive NARNiHS event seeks to provide a collaborative environment where presenters bring fully developed work for presentation and enrichment. We see the NARNiHS Annual Meeting as a place for showcasing excellent projects in historical sociolinguistics, seeking feedback from peers, and engaging in productive development of the field’s enduring questions.
NARNiHS welcomes papers in all areas of historical sociolinguistics, which is understood as the application and/or development of sociolinguistic theories, methods, and models for the study of historical language variation and change over time, or more broadly, the study of the interaction of language and society in historical periods and from historical perspectives. Thus, a wide range of linguistic areas, subdisciplines, methodologies, and adjacent disciplines easily find their place within historical sociolinguistics, and we encourage submission of abstracts that reflect this broad scope.
Abstracts will be accepted for both 20-minute papers and posters. Please note that, at the NARNiHS annual meeting, poster presentations are an integral part of the conference (not second-tier presentations). Abstracts will be assigned a paper or a poster presentation based on determinations in the review process about the most effective format for the submission. However, if you prefer that your submission be considered primarily for poster presentation, please specify this in your abstract.
Successful abstracts will demonstrate *thorough grounding* in historical sociolinguistics, *scientific rigor* in the formulation of research questions, and promise for rich discussion of ideas. Successful abstracts will be explicit about which *theoretical frameworks*, *methodological protocols*, and *analytical strategies* are being applied or critiqued. *Data sources and examples* should be sufficiently presented, so as to allow reviewers a full understanding of the scope and claims of the research. Please note that the *connection of your research to the field of historical sociolinguistics* should be explicitly outlined in your abstract. Failure to adhere to these criteria will likely result in rejection.
*** Abstract Format Guidelines ***.
- Abstracts must be submitted in PDF format.
- Abstracts must fit on one 8.5x11 inch page, with margins no smaller than 1 inch and a font style and size no smaller than Times New Roman 12 point. You are encouraged to use the entire page, providing a full and robust description of the research. All additional supporting content (visualizations, trees, tables, figures, captions, examples, and references) must fit on a single (1) additional page. No exceptions to these requirements are allowed; abstracts longer than one page or with more than one additional page of supporting content will be rejected without review.
- Specify if you prefer your submission be considered primarily for a poster presentation.
- Anonymize your abstract. We realize that sometimes complete anonymity is not attainable, but there is a difference between the nature of the research creating an inability to anonymize and careless non-anonymizing (in citations, references, file names, etc.). Be sure to anonymize your PDF file (you may do so in Adobe Acrobat Reader by clicking on "File", then "Properties", removing your name if it appears in the "Author" line of the "Description" tab, and re-saving the file before submission). Do not use your name when saving your PDF (e.g. Smith_Abstract.pdf); file names will not be automatically anonymized by the EasyAbs system. Rather, use non-identifying information in your file name (e.g. HistSoc4Lyfe.pdf). Your name should only appear in the online form accompanying your abstract submission. Papers that are not sufficiently anonymized wherever possible will be rejected without review.
*** General Requirements ***.
- Abstracts must be submitted electronically using the following link: https://easyabs.linguistlist.org/conference/NARNiHS_26/ .
- Authors may submit a maximum of two abstracts: One single-author abstract and one co-authored abstract.
- Authors may not submit identical abstracts for presentation at the NARNiHS annual meeting and the LSA annual meeting or another LSA sister society meeting (ADS, ANS, NAHoLS, SCiL, SPCL, or SSILA).
- After submission, no changes of author, title, or wording of the abstract may occur. If your abstract is accepted, adjustment of typographical errors is permitted before a final version of the abstract is printed in the conference booklet.
- Papers and posters must be delivered as projected in the abstract or represent bona fide developments of the same research.
- Authors are expected to attend the conference in person and present their own papers and posters. This will not be a hybrid event.
Contact us at NARNiHistSoc(a)gmail.com with any questions.
We invite you to submit your ongoing, published or pre-reviewed works to our workshop on Large Language Models for Cross-Temporal Research (XTempLLMs) at COLM 2025.
Our workshop website is available at https://xtempllms.github.io/2025/
*The deadline for submission has been extended to June 30, 2025 AOE*
Workshop Description:
Large language models (LLMs) have been used for a variety of time-sensitive applications such as temporal reasoning, forecasting and planning. In addition, there has been a growing number of interdisciplinary works that use LLMs for cross-temporal research in several domains, including social science, psychology, cognitive science, environmental science and clinical studies. However, LLMs are hindered in their understanding of time for several reasons, including temporal biases and knowledge conflicts in pretraining and RAG data, as well as a fundamental limitation in LLM tokenization that fragments a date into several meaningless subtokens. Such inadequate understanding of time can lead to inaccurate reasoning, forecasting and planning, and to time-sensitive findings that are potentially misleading.
Our workshop looks for (i) cross-temporal work in the NLP community and (ii) interdisciplinary work that relies on LLMs for cross-temporal studies.
Cross-temporal work in the NLP community:
* Novel benchmarks for evaluating the temporal abilities of LLMs across diverse date and time formats, culturally grounded time systems, and generalization to future contexts;
* Novel methods (e.g., neuro-symbolic approaches) for developing temporally robust, unbiased, and reliable LLMs;
* Data analysis such as the distribution of pretraining data over time and conflicting knowledge in pretraining and RAG data;
* Interpretability regarding how temporal information is processed from tokenization to embedding across different layers, and finally to model output;
* Temporal applications such as reasoning, forecasting and planning;
* Consideration of cross-lingual and cross-cultural perspectives for linguistic and cultural inclusion over time.
Interdisciplinary work that relies on LLMs for cross-temporal studies:
* Time-sensitive discoveries, such as social biases over time and personality testing over time;
* Assessment of time-sensitive discoveries to identify misleading findings if any;
* Interdisciplinary evaluation benchmarks for LLMs’ temporal abilities, e.g., psychological time perception and episodic memory evaluation.
Submission Modes:
* Standard submissions: We invite the submission of papers that will receive up to three double-blind reviews from the XTempLLMs committee, and a final decision of acceptance from the workshop chairs.
* Pre-reviewed submissions: We invite unpublished papers that have already been reviewed either through ACL ARR, or recent AACL/EACL/ACL/EMNLP/COLING venues. These papers will not receive new reviews but will be judged together with their reviews via a meta-review from the workshop chairs.
* Published papers: We invite papers that have been published recently elsewhere to present at XTempLLMs. Please send the details of your paper (Paper title, authors, publication venue, abstract, and a link to download the paper) directly to xtempllms(a)gmail.com. This allows such papers to gain more visibility from the workshop audience.
All deadlines are 11.59 pm UTC -12h (“Anywhere on Earth”):
* June 30, 2025: Submission deadline (standard and published papers)
* July 18, 2025: Submission deadline for papers with ARR reviews
* July 24, 2025: Notification of acceptance
* October 10, 2025: Workshop day
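For authors converting the "Anywhere on Earth" deadlines above into their own time zone, the following is a minimal sketch using only the Python standard library. It assumes the deadline time is 23:59 at the fixed UTC-12 offset, as stated above; the helper name `aoe_deadline_in_utc` is hypothetical and not part of any workshop tooling.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" (AoE) corresponds to the fixed offset UTC-12:00.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_in_utc(year: int, month: int, day: int) -> datetime:
    """Return 23:59 AoE on the given calendar date, expressed in UTC."""
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    deadline = aoe_deadline_in_utc(2025, 6, 30)
    print(deadline.isoformat())  # 2025-07-01T11:59:00+00:00
    # Convert to the local zone of the machine running the script.
    print(deadline.astimezone().isoformat())
```

In other words, a deadline of June 30 AoE closes at 11:59 UTC on July 1.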
Invited Speakers:
* Jose Camacho Collados, Cardiff University, United Kingdom
* Ali Emami, Brock University, Canada
* Alexis Huet, Huawei Technologies, France
* Bahare Fatemi, Google Research, Canada
* Vivek Gupta, Arizona State University, United States
Organizing Committee:
* Wei Zhao, University of Aberdeen, United Kingdom
* Maxime Peyrard, Université Grenoble Alpes & CNRS, France
* Katja Markert, Heidelberg University, Germany
[Apologies for cross-postings]
FIRST CALL FOR PAPERS
LREC 2026
Organised by the ELRA Language Resources Association
Palma, Mallorca, Spain
11-16 May 2026
The Fifteenth biennial Language Resources and Evaluation Conference
(LREC) will be held at the Palau de Congressos de Palma in Palma,
Mallorca, Spain, on 11-16 May 2026. LREC serves as the primary forum for
presentations describing the development, dissemination, and use of
language resources involving both traditional and recently developed
approaches.
The scientific program will include invited talks, oral presentations,
and poster and demo presentations, as well as a keynote address by the
winner of the Antonio Zampolli Prize. Submissions describing all aspects
of language resource development and use are invited, including, but not
limited to, the following:
Language Resource Development
Methods and tools for mono- and multi-lingual language resource
development and annotation
Knowledge discovery/representation (knowledge graphs, linked data,
terminologies, lexicons, ontologies, etc.)
Resource development for less-resourced/endangered languages
Guidelines, standards, best practices, and models for interoperability
Language Resource Use
Use of language resources in systems and applications for any area
of language and speech processing
Use of language resources in assistive technologies, support for
accessibility
Efficient/low-resource methods for language and speech processing
Evaluation
Methodologies and protocols for evaluation and benchmarking of
language technologies
Measures for validation of language resources and quality assurance
Usability of user interfaces and dialogue systems
Bias, safety, and user satisfaction metrics
Interpretability/explainability of language models and language and
speech processing tools
Language Resources and Large Language Models
Language resource development for LLMs (monolingual, multilingual,
multimodal)
(Semi-)automatic generation of training data
Training, fine-tuning, adaptation, alignment, and representation
learning
Guardrails, filters, and modules for generative AI models
Policy and Organizational Considerations
International and national activities, projects, initiatives, and
policies
Language coverage and diversity
Replicability and reproducibility
Organisational, economic, ethical, climate, and legal issues
Separate calls will be issued for Workshops, Tutorials and Industry Track.
Submission
Submissions should be 4 to 8 pages in length (excluding references) and
follow the LREC stylesheet, which will soon be available on the
conference website.
At the time of submission, authors are offered the opportunity to share
related language resources with the community. All repository entries
are linked to the LRE Map [https://lremap.elra.info/], which provides
metadata for the resource.
Accepted papers will appear in the conference proceedings, which include
both oral and poster papers in the same format. Determination of the
presentation format (oral vs. poster) is based solely on an assessment
of the optimal method of communication (more or less interactive), given
the paper content.
Important dates
(All deadlines are 11:59PM UTC-12:00, “anywhere on Earth”)
Oral and poster (or poster+demo) paper submission: 17 October 2025
Notification of acceptance: 13 February 2026
Camera Ready due: 6 March 2026
Workshop and tutorial proposals submission: 17 October 2025
LREC 2026 conference: 11-16 May 2026
More information on LREC 2026: https://lrec2026.info/
Contact: info(a)lrec2026.info
The First Workshop on Optimal Reliance and Accountability in Interactions
with Generative Language Models (*ORIGen*) will be held in conjunction with
the Second Conference on Language Modeling (COLM) at the Palais des Congrès
in Montreal, Quebec, Canada, on October 10, 2025!
*The deadline for submission has been extended to June 27, 2025, Anywhere
on Earth.*
With the rapid integration of generative AI, exemplified by large language
models (LLMs), into personal, educational, business, and even governmental
workflows, such systems are increasingly being treated as “collaborators”
with humans. In such scenarios, underreliance or avoidance of AI assistance
may obviate the potential speed, efficiency, or scalability advantages of a
human-LLM team, but simultaneously, there is a risk that subject matter
non-experts may overrely on LLMs and trust their outputs uncritically, with
consequences ranging from the inconvenient to the catastrophic. Therefore,
establishing optimal levels of reliance within an interactive framework is a
critical open challenge as language models and related AI technology rapidly
advance.
* What factors influence overreliance on LLMs?
* How can the consequences of overreliance be predicted and guarded against?
* What verifiable methods can be used to apportion accountability for the
outcomes of human-LLM interactions?
* What methods can be used to imbue such interactions with appropriate levels
of “friction” to ensure that humans think through the decisions they make
with LLMs in the loop?
The ORIGen workshop provides a new venue to address these questions and more
through a multidisciplinary lens. We seek to bring together broad
perspectives from AI, NLP, HCI, cognitive science, psychology, and education
to highlight the importance of mediating human-LLM interactions to mitigate
overreliance and promote accountability in collaborative human-AI
decision-making.
Submissions are due *June 27, 2025*. Please see our call for papers [1] for
more!
[1] https://origen-workshop.github.io/submissions/
Organizers:
- Nikhil Krishnaswamy, Colorado State University
- James Pustejovsky, Brandeis University
- Dilek Hakkani-Tür, University of Illinois Urbana Champaign
- Vasanth Sarathy, Tufts University
- Tejas Srinivasan, University of Southern California
- Mariah Bradford, Colorado State University
- Timothy Obiso, Brandeis University
- Mert Inan, Northeastern University
Dear colleagues,
EUSKORPORA, a newly created Linguistic Data Center for Basque digital technologies based in San Sebastián (Donostia), Spain, is seeking candidates for two key roles in its Technology area:
1) Senior AI and Language Technologies Specialist
2) Junior AI and Language Technologies Specialist
Both positions are part of the Center's mission to position the Basque language in the global digital space through open-source development and cutting-edge research.
=== SENIOR AI AND LANGUAGE TECHNOLOGIES SPECIALIST ===
EUSKORPORA, the Linguistic Data Center for Basque Digital Technologies, a new association based in Donostia/San Sebastián, is seeking an experienced senior expert in AI technologies applied to natural language processing to lead key tasks related to language technologies for the Basque language.
The selected person will be part of an interdisciplinary team and will participate in projects involving the collection, analysis, and annotation of linguistic data, as well as the development of open-source foundational language models (ASR, TTS, MT, NLP) oriented to Basque, in a research and development context closely connected to industry.
Responsibilities:
- Supervise and optimize processes for linguistic corpus collection, annotation, and management
- Lead the design and development of foundational language models applied to Basque (speech recognition, synthesis, translation, text processing, etc.)
- Contribute to the technological architecture of the Center
- Coordinate internal and external teams and mentor junior staff
- Identify innovation opportunities and contribute to proposals, reports, and dissemination
- Establish strategic relationships with ecosystem stakeholders
Requirements:
- Advanced degree (Master or PhD) in Computational Linguistics, NLP, AI, Computer Engineering, Data Science or related fields
- Minimum 5 years of experience in language or speech technologies
- Proven experience with ASR, TTS, MT, or NLP models
- Strong programming skills in Python and familiarity with frameworks such as Hugging Face, PyTorch, TensorFlow, spaCy, Kaldi, ESPnet, Fairseq
- Knowledge of MLOps, Git, and data science best practices
- Familiarity with open repositories and licensing
Languages:
- Basque: desirable, intermediate level (B2 or higher)
- Spanish: fluent
- English: high level (especially technical)
We offer:
- Participation in strategic national and international projects
- Competitive salary according to experience
- Interdisciplinary environment and opportunities for professional growth
=== JUNIOR AI AND LANGUAGE TECHNOLOGIES SPECIALIST ===
EUSKORPORA, the Linguistic Data Center for Basque Digital Technologies, a new association based in Donostia/San Sebastián, is seeking young professionals at the beginning of their careers to support key tasks related to the creation of linguistic resources and language technologies for the Basque language.
Selected individuals will join an interdisciplinary team and participate in projects involving the collection, annotation, and analysis of linguistic data, as well as the development of open-source foundational language models (ASR, TTS, MT, NLP) oriented to Basque, in a research and development context closely connected to industry.
Responsibilities:
- Support the collection, cleaning and annotation of linguistic corpora (text and audio)
- Assist in the training and evaluation of language and speech models
- Collaborate in the documentation and maintenance of language resources
- Contribute to the integration of open-source NLP tools and libraries
- Assist in reports and dissemination activities
- Work in coordination with technical, linguistic and project management profiles
Requirements:
- Degree or Master in Computational Linguistics, Computer Engineering, Data Science, or similar
- Basic knowledge of NLP, language models, or speech technologies
- Python programming (basic/intermediate level)
- Familiarity with linguistic annotation or text processing tools
- Experience with Git and frameworks like Hugging Face or spaCy is a plus
Languages:
- Basque: high level (B2 or higher)
- Spanish: fluent
- English: high level (B2 or higher)
We offer:
- Dynamic and innovative environment based in San Sebastián
- Continuous training in cutting-edge technologies
- Real opportunities for growth within the team
- Competitive salary according to training and experience
For further information or to apply, please contact:
info(a)euskorpora.eus
Best regards,
EUSKORPORA
https://www.euskorpora.eus/
+(34) 611 02 81 72
We are pleased to invite submissions for the first Interdisciplinary
Workshop on Observations of Misunderstood, Misguided and Malicious Use of
Language Models (OMMM 2025). The workshop will be held with the RANLP 2025
conference in Varna, Bulgaria, on 11-13 September 2025.
Overview
The use of Large Language Models (LLMs) pervades scientific practices in
multiple disciplines beyond the NLP/AI communities. Alongside benefits for
productivity and discovery, widespread use often entails misuse due to
misalignment of values, lack of knowledge, or, more rarely, malice. LLM
misuse has the potential to cause real harm in a variety of settings.
Through this workshop, we aim to gather researchers interested in
identifying and mitigating inappropriate and harmful uses of LLMs. These
include misunderstood usage (e.g., misrepresentation of LLMs in the
scientific literature); misguided usage (e.g., deployment of LLMs without
adequate training or privacy safeguards); and malicious usage (e.g.,
generation of misinformation and plagiarism). Sample topics are listed
below, but we welcome submissions on any domain related to the scope of the
workshop.
Important Dates
Submission deadline *[NEW]*: *15 July 2025*, at 23:59 Anywhere on Earth
Notification of acceptance: 01 August 2025
Camera-ready papers due: 30 August 2025
Workshop dates: September 11, 12, or 13, 2025
Submission Guidelines
Submissions will be accepted as short papers (4 pages) and as long papers
(8 pages), plus additional pages for references. All submissions undergo a
double-blind review, so they should not include any identifying
information. Submissions should conform to the RANLP guidelines; for
further information and templates, please see
https://ranlp.org/ranlp2025/index.php/submissions/
We welcome submissions from diverse disciplines, including NLP and AI,
psychology, HCI, and philosophy. We particularly encourage reports on
negative results that provide interesting perspectives on relevant topics.
In-person presenters will be prioritised when selecting submissions to be
presented at the workshop, but the workshop will take place in a hybrid
format. Accepted papers will be included in the workshop proceedings in the
ACL Anthology.
Papers should be submitted on the RANLP conference system at
https://softconf.com/ranlp25/OMMM2025/
Keynote Speaker
We are excited to have Dr. Stefania Druga as the keynote speaker for the
inaugural OMMM workshop. Dr. Druga is a Research Scientist at Google
DeepMind, where she designs novel multimodal AI applications.
Topics of Interest
We welcome paper submissions on all topics related to inappropriate and
harmful uses of LLMs, including but not limited to:
- Misunderstood use (and how to improve understanding):
  - Misrepresentation of LLMs (e.g., anthropomorphic language)
  - Attribution of consciousness
  - Interpretability
  - Overreliance on LLMs
- Misguided use (and how to find alternatives):
  - Underperformance and inappropriate applications
  - Structural limitations and ethical considerations
  - Deployment without proper training or safeguards
- Malicious use (and how to mitigate it):
  - Adversarial attacks, jailbreaking
  - Detection and watermarking of machine-generated content
  - Generation of misinformation or plagiarism
  - Bias mitigation and trust design
For more information, please refer to the workshop website:
https://ommm-workshop.github.io/2025/. For any questions, please contact
the organisers at ommm-workshop(a)googlegroups.com.
The organisers,
Piotr Przybyła, Universitat Pompeu Fabra
Matthew Shardlow, Manchester Metropolitan University
Clara Colombatto, University of Waterloo
Nanna Inie, IT University of Copenhagen
[Apologies for cross-posting]
Terminology Translation Task at WMT2025 - Call for Participation
We are excited to announce the third Shared Task on Terminology Translation (https://www2.statmt.org/wmt25/terminology.html), which will be run within the 10th Conference on Machine Translation (WMT2025) in Suzhou, China.
TL;DR:
- We test the sentence-level and document-level translation of the texts in finance and IT domains, given the explicit terminology.
- The language pairs are: English -> {Spanish, German, Russian, Chinese}, Chinese -> English.
- We evaluate the overall quality of translation, terminology success rate and consistency. Additionally, we compare the performance of systems when given no terminology, proper terminology, and random terms.
- The task starts on 20th June 2025 AOE, the submission deadline is 20th July 2025 AOE.
- Please pre-register via Google Forms here: https://forms.gle/ZSn2pNJkQJAzHFnA6 .
OVERVIEW
The advances in neural MT and LLM-assisted translation of the last decade show nearly human quality in general domain translation, at least for the high-resource languages. However, when it comes to specialized domains like science, finance, or legal texts, where the correct and consistent use of special terms is crucial, the task is far from being solved. The Terminology Shared Task aims to assess the extent to which machine translation models can utilize additional information regarding the translation of terminologies. Compared to the two previous editions, 2021 and 2023, the new test data have more varied test cases, are more consistent in domains for each translation direction, and are broader in language coverage.
TASK DESCRIPTION
Track №1: Sentence/Paragraph-Level Translation
You will be provided with a sequence of input texts, each a sentence or paragraph long, and small terminology dictionaries that correspond only to the terms present in the given text.
Language Pairs:
* en-de (English → German)
* en-ru (English → Russian)
* en-es (English → Spanish)
Domain: information technology
Track №2: Document-Level Translation
The setup is similar to Track №1, with two exceptions: each input text is now a whole document, and the dictionaries correspond to the whole set of input texts (i.e. they are corpus-level). This brings the task closer to the real-life setup (where the dictionaries exist independently of the texts), but it may complicate implementation (solutions that store the whole dictionary will need more memory). Additionally, in the whole-document setup, consistent usage of terms becomes more important.
Language Pairs:
en-zh-Hant (English → Traditional Chinese)
zh-Hant-en (Traditional Chinese → English)
Domain: finance
EVALUATION
Terminology Modes:
You are expected to compare your system’s performance under three modes:
1. No terminology: the system is only provided with input sentences/documents.
2. Proper terminology: the system is provided with input texts (same as 1.) and dictionaries of the format {source_term: target_term}.
3. Random terminology: the system is provided with input texts and translation dictionaries of the same format as in 2. The difference is that the dictionary items are not special terms but words randomly drawn from input texts. This mode is of special interest since we want to measure to what extent the proper term translations help to improve the system performance (2.), as opposed to an arbitrary broader input that does not contain the domain-specific terminology.
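The three modes can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the organizers' data preparation: the function names, the `general_lexicon` lookup used to give random words a translation, and the choice to draw as many random items as there are proper terms are all hypothetical.

```python
import random
import re


def random_terminology(source_text, general_lexicon, n_items, seed=0):
    """Draw up to n_items distinct words from the input text and pair each
    with a translation taken from general_lexicon (a hypothetical
    word-level lookup; the task description does not specify one)."""
    words_in_text = {w.lower() for w in re.findall(r"[^\W\d_]+", source_text)}
    candidates = sorted(words_in_text & set(general_lexicon))
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    chosen = rng.sample(candidates, min(n_items, len(candidates)))
    return {word: general_lexicon[word] for word in chosen}


def mode_inputs(source_text, proper_terms, general_lexicon, seed=0):
    """Return the (text, dictionary) pair a system would see in each mode."""
    return {
        "no_terminology": (source_text, {}),
        "proper_terminology": (source_text, dict(proper_terms)),
        "random_terminology": (
            source_text,
            random_terminology(
                source_text, general_lexicon, len(proper_terms), seed
            ),
        ),
    }


if __name__ == "__main__":
    text = "The firewall blocks every packet from the router."
    proper = {"firewall": "Firewall", "packet": "Paket"}
    lexicon = {"the": "der", "blocks": "blockiert", "every": "jedes", "from": "von"}
    for mode, (_, terms) in mode_inputs(text, proper, lexicon).items():
        print(mode, terms)
```

All three dictionaries share the {source_term: target_term} format, so the same system interface can be reused across modes and only the dictionary content changes.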
Metrics:
1. Overall Translation Quality: we will evaluate the general aspects of machine translation outputs such as fluency, adequacy and grammaticality. We will do that with the general MT automatic metrics such as BLEU or COMET. In addition to that, we will pay special attention to the grammaticality of the translated terms.
2. Terminology Success Rate: This metric assesses the ability of the system to accurately translate technical terms given the specialized vocabulary. This will be carried out by comparing the occurrences of the correct term translations (i.e. the ones present in the dictionary) to the output terms. The goal is to have a higher success rate that will show adherence to dictionary translations.
3. Terminology Consistency: for domains such as science or legal texts, the consistent use of an introduced term throughout the text is crucial. In other words, we want a system to not only pick up a correct term in a target language but to use it consistently once it is chosen. This will be evaluated by comparing all translations of a given source term in a text and measuring the percentage of deviations from the most consistent translation. This metric is more important for the Document-Level track, but it will be used for both tracks.
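As a rough illustration of metrics 2 and 3, the sketch below computes a success rate and a deviation rate in Python. It is not the official scorer: it assumes case-insensitive substring matching (the real evaluation may handle inflection differently), and it assumes the translations used for each occurrence of a source term have already been extracted, for example by word alignment. Both function names are hypothetical.

```python
from collections import Counter


def terminology_success_rate(pairs, term_dict):
    """pairs: list of (source_text, system_output) strings.
    term_dict: {source_term: target_term}.
    Returns the share of dictionary terms found in a source text whose
    dictionary translation also appears in the corresponding output."""
    expected = 0
    hits = 0
    for source, output in pairs:
        src, out = source.lower(), output.lower()
        for source_term, target_term in term_dict.items():
            if source_term.lower() in src:
                expected += 1
                if target_term.lower() in out:
                    hits += 1
    return hits / expected if expected else 0.0


def deviation_rate(observed_translations):
    """observed_translations: the translations a system produced for one
    source term across a text (assumed to be extracted beforehand).
    Returns the share that differs from the most frequent translation."""
    if not observed_translations:
        return 0.0
    counts = Counter(observed_translations)
    most_frequent_count = counts.most_common(1)[0][1]
    return 1 - most_frequent_count / len(observed_translations)


if __name__ == "__main__":
    pairs = [
        ("Restart the server.", "Starten Sie den Server neu."),
        ("The server uses a cache.", "Der Rechner nutzt einen Zwischenspeicher."),
    ]
    terms = {"server": "Server", "cache": "Cache"}
    print(terminology_success_rate(pairs, terms))
    print(deviation_rate(["Server", "Server", "Rechner"]))
```

In this toy example one of three expected term translations is found, and one of three occurrences deviates from the most frequent translation, so a lower deviation rate means more consistent terminology.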
IMPORTANT DATES
All dates are end of Anywhere on Earth (AoE).
Data snippets released: 7th May 2025
Dev data released: 22nd May 2025
Test data release, task starts: 20th June 2025 (postponed)
Submission deadline: 20th July 2025 (postponed)
Paper submission to WMT25: in-line with WMT25
Camera-ready submission to WMT25: in-line with WMT25
Conference in Suzhou, China: 05-09 November 2025
SUBMISSION GUIDELINES
0. Please notify us about your participation prior to submission. This is optional, but will be very helpful for us for better understanding of our workload after submission. Please do it through this Google Form: https://forms.gle/ZSn2pNJkQJAzHFnA6
1. Check your submission files with the validation script. It will be published when the test data are released.
2. Write a description of your system (optional).
3. Submit your system via Google Forms. The Google form with all necessary submission details will be published at the test set date.
All details on submission as well as FAQ can be found at the webpage of the shared task.
ORGANIZERS
* Kirill Semenov (University of Zurich), main contact: FirstNаmе [dоt] LаstNаmе {аt} uzh /dоt/ ch
* Nathaniel Berger (Heidelberg University)
* Pinzhen Chen (University of Edinburgh & Aveni.ai)
* Xu Huang (Nanjing University)
* Arturo Oncevay (JP Morgan)
* Dawei Zhu (Amazon)
* Vilém Zouhar (ETH Zurich)
WEBSITE: https://www2.statmt.org/wmt25/terminology.html
In case of query, please send an email to Kirill Semenov (see email above).
Call for papers: The First Workshop on Natural Language Processing and Language Models for Digital Humanities
(LM4DH_2025) @ RANLP_2025
Date: 11th to 13th September 2025 (TBC)
Venue : Varna, Bulgaria
Website: https://www.clarin.eu/event/2025/clarin-workshop-ranlp-2025
Submissions Portal: https://softconf.com/ranlp25/LM4DH2025/
Digital Humanities has emerged as an interdisciplinary field of research at the intersection of computer science and many other fields such as linguistics, social sciences, history, and psychology. With the development of Large Language Models (LLMs), Natural Language Processing (NLP) tasks such as entity recognition, sentiment analysis, and text summarisation have been significantly enhanced. These developments offer powerful tools for analysing and interpreting complex historical, cultural, and social datasets, including oral histories, archival documents, and literary texts, enabling researchers to identify patterns, extract meaningful relationships, and generate interpretations at unprecedented scale and precision.
This workshop aims to provide a common platform for researchers, practitioners, and students from diverse disciplines to collaboratively explore and apply AI-driven techniques in the Digital Humanities. Through interdisciplinary discussion, the event aims to generate creative approaches, exchange best practices, and create a community committed to furthering AI-based research on human culture and history. The focus of the workshop is on applying natural language processing techniques to digital humanities research. The topics can be anything of digital humanities interest with a natural language processing or LLM-based application. We expect contributions related (but not limited) to the following topics:
* Text analysis and processing related to the humanities using computational methods
* Usage of the interpretability of large language models' output for DH-related tasks
* Dataset creation and curation for NLP (e.g. digitisation, datafication, and data preservation)
* Automatic error detection, correction, and normalisation of textual data
* Generation and analysis of literary works such as poetry and novels
* Analysis and detection of text genres
* Emotion analysis for the humanities and literature
* Modelling of information and knowledge in the Humanities, Social Sciences, and Cultural Heritage
* Low-resource and historical language processing
* Search for scientific and/or scholarly literature
* Profiling and authorship attribution
Submission & Publication
All papers must represent original and unpublished work that is not currently under review. Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop.
Submissions must follow the RANLP 2025 submission guidelines (https://ranlp.org/ranlp2025/index.php/submissions/), using ACL-style templates (LaTeX or MS Word).
Papers must be submitted using SoftConf at https://softconf.com/ranlp25/LM4DH2025/
All papers will be double-blind peer reviewed. Authors of the accepted papers will present their work in either the oral or poster session. All accepted papers will appear in the workshop proceedings that will be published in ACL Anthology.
Important Dates
* Paper submission deadline: 20th July 2025
* Notification of acceptance: 2nd August 2025
* Camera-ready paper: 20th August 2025
* Workshop date: 11th September 2025
Organising Committee
* Isuri Anuradha, Lancaster University, UK
* Francesca Frontini, CNR-ILC, Italy & CLARIN ERIC
* Paul Rayson, Lancaster University, UK
* Ruslan Mitkov, Lancaster University, UK
* Deshan Sumanathilake, Swansea University, UK
This workshop has been organised with the generous support and coordination of CLARIN-EU.
Contact: dhranlp2(a)gmail.com
*Call for Participation in Tracks*
*FIRE 2025: 17th meeting of the Forum for Information Retrieval Evaluation*
Indian Institute of Technology (BHU) Varanasi
17th - 20th December
Website: http://fire.irsi.org.in/
FIRE 2025 offers the following exciting tracks this year:
* Cross-Lingual Mathematical Information Retrieval (CLMIR)
<https://clmir2025.github.io/>
* Code-Mixed Information Retrieval from Social Media Data (CMIR)
<https://cmir-iitbhu.github.io/cmir/index.html>
* Hate Speech and Offensive Content Identification in Memes in
Bengali, Hindi, Gujarati and Bodo (HASOC-meme)
<https://hasocfire.github.io/hasoc/2025/>
* Information Retrieval in Software Engineering (IRSE)
<https://sites.google.com/view/irse-2025/home>
* Misinformation Detection and Prompt Recovery (PROMID)
<https://promid.github.io/index.html>
* Multilingual Story Illustration: Bridging Cultures through AI
Artistry (MUSIA) <https://cse-iitbhu.github.io/MUSIA/index.html>
* Offensive Language Identification in Dravidian Languages
(DravidianCodeMix)
<https://dravidian-codemix.github.io/2025/dataset.html>
* Opinion Extraction and Question Answering from
CryptoCurrency-Related Tweets and Reddit posts (CryptOQA)
<https://sites.google.com/view/cryptoqa-2025/>
* Research Highlight Generation from Scientific Papers (SciHigh)
<https://sites.google.com/jadavpuruniversity.in/scihigh2025/home>
* Spoken-Query Cross-Lingual Information Retrieval for the Indic
Languages (SqCLIR) <https://sites.google.com/view/sqclir-2025>
* Varanasi Tourism in Question Answer System (VATIKA)
<https://sites.google.com/view/vatika-2025/>
* Word-Level Identification of Languages in Dravidian Languages (WILD)
<https://www.codabench.org/competitions/7902/>
Research groups are invited to participate in the experiments. Please
register directly with the organizers.
FIRE 2025 is the 17th edition of the annual meeting of the Forum for
Information Retrieval Evaluation (fire.irsi.org.in). Since its inception
in 2008, FIRE has had a strong focus on shared tasks similar to those
offered at evaluation forums like TREC, CLEF, and NTCIR. The shared
tasks focus on solving specific problems in the area of information access
and, more importantly, help in generating evaluation datasets for the
research community.
Visit http://fire.irsi.org.in
The 2nd Workshop on DHOW: Diffusion of Harmful Content on Online Web
The workshop will be conducted in a *hybrid* format to ensure maximum
participation, accommodating attendees both *online* and in person.
Submission deadline: *July 11 2025 AOE*
*Workshop site*: https://dhow-workshop.github.io/2025/
*Co-located with ACMMM 2025*
https://acmmm2025.org/
Dublin, Ireland, 27-31 October 2025
*Important Dates*
Submission deadline: extended to *July 11, 2025*
Notification of acceptance: August 01, 2025
Camera-ready papers due: August 11, 2025
Workshop date: October 27/28, 2025
*Workshop Description*
With the advancement of digital technologies and gadgets, online content
is easily accessible, and harmful content spreads just as easily. Different
kinds of harmful content appear on different platforms and in multiple
languages. The topic is broad and covers multiple research directions, but
from the user’s perspective all of these kinds of content are harmful.
Harmful content is often studied in isolation, for example as
misinformation or as hate speech, and much research is limited to a single
platform, a single language, and a single issue. As a result, spreaders of
harmful content can switch platforms and languages to reach their user
base. Harmful content is not limited to social media but also appears in
news media: spreaders share it in posts, news articles, comments, and
hyperlinks. There is therefore a need to study harmful content across
platforms, languages, modalities, and topics.
We will bring the research on harmful content under one umbrella so that
research on different topics (hate speech, misinformation,
disinformation, self-harm, offensive content, etc.) can bring some novel
methods and recommendations for users, leveraging text analysis with
image, audio, and video recognition to detect harmful content in diverse
formats. The workshop will cover the ongoing issue of war or elections
in 2025.
We believe this workshop will provide a unique opportunity for
researchers and practitioners to exchange ideas, share the latest
developments, and collaborate on addressing the challenges associated
with harmful content spread across the Web. We expect that the workshop
will generate insights and discussions that will help advance the field
of societal artificial intelligence (AI) for the development of a safer
internet. In addition to attracting high-quality research contributions,
one of the aims of the workshop is to mobilise researchers working in
related areas to form a community.
*Submissions Topics*
• Studying different types of harmful content
• Computational fact-checking & Misinformation Detection
• Role of Generative AI in Mitigating Harmful Content
• Harassment, Bullying, and Hate Speech Detection
• Explainable AI for Harmful Content Analysis
• Multimodal and Multilingual Harmful Content Detection such as fake
news, spam, and troll detection
• Deepfake and Synthetic Media
• Ethical & Societal Implications of AI in Content Moderation
• Both Qualitative and Quantitative study on harmful content
• Psychological effects of harmful content like mental health
• Approaches for data collection or data annotation using multimodal
large models on harmful content
• User study on the effects of harmful content on human beings
*Submissions*
- Submission Instructions: https://dhow-workshop.github.io/2025/#call
- Submission Link:
https://openreview.net/group?id=acmmm.org/ACMMM/2025/Workshop/DHOW
*Workshop organizers*
• Thomas Mandl (University of Hildesheim, Germany)
• Haiming Liu (University of Southampton, United Kingdom)
• Gautam Kishore Shahi (University of Duisburg-Essen, Germany)
• Amit Kumar Jaiswal (University of Surrey, United Kingdom)
• Durgesh Nandini (University of Bayreuth, Germany)
DHOW 2025
Ethical LLMs 2025: The first Workshop on Ethical Concerns in Training, Evaluating and Deploying Large Language Models<https://sites.google.com/view/ethical-llms-2025> @ RANLP2025<https://ranlp.org/ranlp2025/>
Call for papers:
Scope
Large Language Models (LLMs) represent a transformative leap in Artificial Intelligence (AI), delivering remarkable language-processing capabilities that are reshaping how we interact with technology in our daily lives. With their ability to perform tasks such as summarisation, translation, classification, and text generation, LLMs have demonstrated unparalleled versatility and power. Drawing from vast and diverse knowledge bases, these models hold the potential to revolutionise a wide range of fields, including education, media, law, psychology, and beyond. From assisting educators in creating personalised learning experiences to enabling legal professionals to draft documents or supporting mental health practitioners with preliminary assessments, the applications of LLMs are both expansive and profound.
However, alongside their impressive strengths, LLMs also face significant limitations that raise critical ethical questions. Unlike humans, these models lack essential qualities such as emotional intelligence, contextual empathy, and nuanced ethical reasoning. While they can generate coherent and contextually relevant responses, they do not possess the ability to fully understand the emotional or moral implications of their outputs. This gap becomes particularly concerning when LLMs are deployed in sensitive domains where human values, cultural nuances, and ethical considerations are paramount. For example, biases embedded in training data can lead to unfair or discriminatory outcomes, while the absence of ethical reasoning may result in outputs that inadvertently harm individuals or communities. These limitations highlight the urgent need for robust research in Natural Language Processing (NLP) to address the ethical dimensions of LLMs. Advancements in NLP research are crucial for developing methods to detect and mitigate biases, enhance transparency in model decision-making, and incorporate ethical frameworks that align with human values. By prioritising ethics in NLP research, we can better understand the societal implications of LLMs and ensure their development and deployment are guided by principles of fairness, accountability, and respect for human dignity. This workshop will dive into these pressing issues, fostering a collaborative effort to shape the future of LLMs as tools that not only excel in technical performance but also uphold the highest ethical standards.
Submission Guidelines
We follow the RANLP 2025 standards for submission format and guidelines. EthicalLLMs 2025 invites the submission of long papers, up to eight pages in length, and short papers, up to six pages in length. These page limits only apply to the main body of the paper. At the end of the paper (after the conclusions but before the references) papers need to include a mandatory section discussing the limitations of the work and, optionally, a section discussing ethical considerations. Papers can include unlimited pages of references and an unlimited appendix.
To prepare your submission, please make sure to use the RANLP 2025 style files available here:
* LaTeX<https://ranlp.org/ranlp2025/wp-content/uploads/2025/05/ranlp2025-LaTeX.zip>
* Word<https://ranlp.org/ranlp2025/wp-content/uploads/2025/05/ranlp2025-word.docx>
Papers should be submitted through Softconf/START using the following link: https://softconf.com/ranlp25/EthicalLLMs2025/
Topics of interest
The workshop invites submissions on a broad range of topics related to the ethical development and evaluation of LLMs, including but not limited to the following.
1. Bias Detection and Mitigation in LLMs
Research focused on identifying, measuring, and reducing social, cultural, and algorithmic biases in large language models.
2. Ethical Frameworks for LLM Deployment
Approaches to integrating ethical principles, such as fairness, accountability, and transparency, into the development and use of LLMs.
3. LLMs in Sensitive Domains: Risks and Safeguards
Case studies or methodologies for deploying LLMs in high-stakes fields such as healthcare, law, and education, with an emphasis on ethical implications.
4. Explainability and Transparency in LLM Decision-Making
Techniques and tools for improving the interpretability of LLM outputs and understanding model reasoning.
5. Cultural and Contextual Understanding in NLP Systems
Strategies for enhancing LLMs’ sensitivity to cultural, linguistic, and social nuances in global and multilingual contexts.
6. Human-in-the-Loop Approaches for Ethical Oversight
Collaborative models that involve human expertise in guiding, correcting, or auditing LLM behaviour to ensure responsible use.
7. Mental Health and Emotional AI: Limits of LLM Empathy
Discussions on the role of LLMs in mental health support, highlighting the boundary between assistive technology and the need for human empathy.
Organisers
Damith Premasiri – Lancaster University, UK
Tharindu Ranasinghe – Lancaster University, UK
Hansi Hettiarachchi – Lancaster University, UK
Contact
If you have any questions regarding the workshop, please contact Damith: d.dolamullage(a)lancaster.ac.uk
Dear all,
We are currently doing a project aiming to make querying in syntactically annotated corpora easier and more accessible.
For this purpose, we want to know what researchers are actually searching for.
If you have a minute of your time, please feel free to fill out this form.
https://forms.office.com/e/a8DgETSabB
Feel free to reach out to ekavol(a)chalmers.se or nikdew(a)chalmers.se if you have any further questions.
Best regards
Niklas Deworetzki & Katja Voloshina
PhD Students
Department of Computer Science and Engineering
Chalmers University of Technology | University of Gothenburg
SE-412 96 Göteborg, Sweden
www.gu.se<http://www.gu.se/>
www.chalmers.se<http://www.chalmers.se/>
CLEF 2025 – Registration Open
Conference and Labs of the Evaluation Forum
We are pleased to announce CLEF 2025, taking place 9–12 September 2025 in Madrid, Spain at UNED. This peer‑reviewed conference and associated labs foster research in multilingual, multimodal, and cross‑language information access https://clef2025.clef-initiative.eu/.
Register now – Early-bird registration is open! Standard registration opened earlier this year, and early-bird rates are currently available.
Why attend?
* Present and discuss original research at the main conference.
* Engage in innovative labs and challenges, including LifeCLEF, ImageCLEF, EXIST, eRisk, CheckThat!, and more https://clef2025.clef-initiative.eu/index.php?page=Pages/labs.html.
* Benefit from rich networking with academic and industry experts in IR, NLP, multimedia retrieval, and evaluation sciences.
For detailed conference and lab registration, registration deadlines, and pricing, please visit the official site: https://clef2025.clef-initiative.eu/index.php?page=Pages/registrationConfer…
Important Dates
* Early-bird registration ongoing
* Registration closes: 31 August 2025
* Conference & labs: 9–12 September 2025, Madrid, Spain
We look forward to welcoming participants from across the global community — see you this September in Madrid at CLEF 2025!
Jorge Carrillo-de-Albornoz
On behalf of the CLEF 2025 Organising Committee
Apologies for cross-posting.
---------------------------------------------------------------------------
*CALL FOR PAPERS: Language Resources and Evaluation Journal - Special Issue
on Machine Translation for Low-Resource Languages*
https://link.springer.com/collections/gbdgacbgbg
*Guest Editors:*
- Atul Kr. Ojha (Insight Research Ireland Centre for Data Analytics,
DSI, University of Galway, Ireland)
- Chao-Hong Liu (Industrial Technology Research Institute, Potamu
Research Ltd.)
- Ekaterina Vylomova (University of Melbourne, Australia)
- Flammie Pirinen (UiT The Arctic University of Norway, Tromsø)
- Jonathan Washington (Swarthmore College, USA)
- Nathaniel Oco (De La Salle University, Philippines)
- Xiaobing Zhao (Minzu University of China)
Machine translation (MT) technologies have improved significantly in
the last decade through neural MT (NMT) approaches. However, most of these
methods rely on the availability of large parallel data for training the MT
systems, resources which are not available for the majority of language
pairs. Hence, current technologies often fall short in their ability to be
applied to low-resource languages. Developing MT technologies using
relatively small corpora still presents a major challenge for the MT
community. In addition, many methods for developing MT systems still rely
on several natural language processing (NLP) tools to pre-process texts in
source languages and post-process MT outputs in target languages. The
performance of these tools often has a great impact on the quality of the
resulting translation. The availability of MT technologies and NLP tools
can facilitate equal access to information for the speakers of a language
and determine on which side of the digital divide they will end up. The
lack of these technologies for many of the world's languages provides
opportunities both for the field to grow and for making tools available for
speakers of low-resource languages.
In the past few years, several workshops and evaluations have been
organized to promote research on low-resource languages. NIST has been
conducting Low Resource Human Language Technology evaluations (LoReHLT)
annually from 2016 to 2019. In LoReHLT evaluations, there is no training
data in the evaluation language. Participants receive training data in
related languages but need to bootstrap systems in the surprise evaluation
language at the start of the evaluation. Methods for this include pivoting
approaches and taking advantage of linguistic universals. The evaluations
are supported by DARPA's Low Resource Languages for Emergent Incidents
(LORELEI) program, which seeks to advance technologies that are less
dependent on large data resources and that can be quickly pivoted to new
languages within a very short amount of time so that information from any
language can be extracted in a timely manner to provide situation awareness
to emergent incidents. There are also the Workshop on Technologies for MT
of Low-Resource Languages (LoResMT), the Special Interest Group on
Under-resourced Languages (SIGUL), the Workshop on Resources and Technologies
for Indigenous, Endangered and Lesser-resourced Languages in Eurasia
(EURALI), the Workshop on Deep Learning Approaches for Low-Resource Natural
Language Processing (DeepLo), AfricaNLP, TurkLang, the Conference on Machine
Translation (WMT), and the International Conference on Spoken Language
Translation (IWSLT), which provide venues for sharing research
and working on research and development in this field.
This topical collection solicits original research papers on MT
systems/methods and related NLP tools for low-resource languages in
general. LoReHLT, LORELEI, LoResMT, SIGUL, EURALI, DeepLo, WMT, and IWSLT
participants are very welcome to submit their work to the special issue.
Summary papers on MT research for specific low-resource languages, as well
as extended versions (>40% difference) of published papers from relevant
conferences/workshops, are also welcome.
Topics of the special issue include, but are not limited to:
* Research and review papers on MT systems/methods for low-resource
languages
* Research and review papers on pre-processing and/or post-processing NLP
tools for MT
* Word tokenizers/de-tokenizers for low-resource languages
* Word/morpheme segmenters for low-resource languages
* Use of morphological analyzers and/or morpheme segmenters in MT
* Multilingual/cross-lingual NLP tools for MT
* Review of available corpora of low-resource languages for MT
* Pivot MT for low-resource languages
* Zero-shot MT for low-resource languages
* Fast building of MT systems for low-resource languages
* Re-usability of existing MT systems and/or NLP tools for low-resource
languages
* Machine translation for language preservation
* Techniques that work across many languages and modalities
* Techniques that are less dependent on large data resources
* Use of language-universal resources
* Bootstrap-trained resources for the short development cycle
* Entity, relation- and event-extraction
* Sentiment detection in MT
* MT Summarisation
* Processing diverse languages, genres (news, social media, etc.) and
modalities (text, speech, video, etc.)
* Speech Translation for low-resource languages
* Multimodal MT for low-resource languages
* MT models using LLMs for low-resource languages
* Generative AI models for low-resource languages
* Evaluation metrics and datasets for low-resource languages
For further information on this initiative, please refer to
https://link.springer.com/collections/gbdgacbgbg
*IMPORTANT DATES*
*August 26, 2025: Paper submission deadline*
*December 05, 2025: Revised papers due*
*March 2026: Publication*
*SUBMISSION GUIDELINES*
Authors should follow the "Instructions for Authors"
(https://link.springer.com/journal/10579/submission-guidelines or Overleaf
<https://link.springer.com/journal/10579/updates/17234296>) on the LRE
journal website <https://link.springer.com/journal/10579>.
Thanks,
In this newsletter:
LDC data and commercial technology development
New publications:
Chinese Sentence Pattern Structure Treebank<https://catalog.ldc.upenn.edu/LDC2025T06>
IWSLT 2022-2023 Shared Task Training, Development and Test Set<https://catalog.ldc.upenn.edu/LDC2025S05>
KAIROS Schema Learning Complex Event Annotation<https://catalog.ldc.upenn.edu/LDC2025T07>
________________________________
LDC data and commercial technology development
For-profit organizations are reminded that an LDC membership is a pre-requisite for obtaining a commercial license to almost all LDC databases. Non-member organizations, including non-member for-profit organizations, cannot use LDC data to develop or test products for commercialization, nor can they use LDC data in any commercial product or for any commercial purpose. LDC data users should consult corpus-specific license agreements for limitations on the use of certain corpora. Visit the Licensing<https://www.ldc.upenn.edu/data-management/using/licensing> page for further information.
________________________________
New publications:
Chinese Sentence Pattern Structure Treebank<https://catalog.ldc.upenn.edu/LDC2025T06> was developed at Beijing Normal University<https://english.bnu.edu.cn/> and Peking University<https://english.pku.edu.cn/>. It contains 5,016 sentences and 119,627 tokens syntactically annotated following the concept of sentence constituent analysis which emphasizes sentence pattern structure. The source data consists of 27 chapters extracted from modern Mandarin and ancient Chinese works. There are three annotation layers: lexical sense and structural mode for dynamic words; syntactic structure for clauses; and inter-clause relation within complex sentence and sentence clusters. These structures can be visualized using the Jbw-viewer tool<https://github.com/bnucip/jbwviewer> which is included in the release.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
IWSLT 2022 - 2023 Shared Task Training, Development and Test Set<https://catalog.ldc.upenn.edu/LDC2025S05> was developed by LDC and contains 210 hours of Tunisian<https://catalog.ldc.upenn.edu/LDC2025S05> Arabic conversational telephone speech, transcripts, English translations, speaker metadata, and documentation. This material constitutes the training, development, and test data used in the International Conference on Spoken Language Translation (IWSLT) Dialectal Speech Translation task (2022)<https://iwslt.org/2022/dialect> and the Dialectal and Low-resource track (2023)<https://iwslt.org/2023/low-resource>.
The telephone speech was collected by LDC in 2016-2017 from native speakers of Tunisian Arabic in Tunis. Speakers were recruited to make telephone calls to people in their social networks from a variety of noise conditions and handsets. Transcripts are orthographic following Buckwalter<https://catalog.ldc.upenn.edu/LDC2004L02> transliteration and cover 175 hours of the collected speech. IPA transcripts were added to a subset of the data. All transcribed segments were translated into English.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
KAIROS Schema Learning Complex Event Annotation<https://catalog.ldc.upenn.edu/LDC2025T07> was developed by LDC to support the DARPA KAIROS program. It contains English and Spanish text, audio, video, and image data labeled for 93 real-world complex events with event, relation, and argument annotations linking to document provenance. Source data was collected from the web; 3431 root web pages were collected and processed, yielding 1919 text data files, 24019 image files, 1472 video files, and 16 audio files.
The DARPA KAIROS (Knowledge-directed Artificial Intelligence Reasoning Over Schemas) program aimed to build technology capable of understanding and reasoning about complex real-world events in order to provide actionable insights to end users. KAIROS systems utilized formal event representations in the form of schema libraries that specified the steps, preconditions, and constraints for an open set of complex events; schemas were then used in combination with event extraction to characterize and make predictions about real-world events in a large, multilingual, multimedia corpus.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
Dear CLIN enthusiasts
We are extending the submission deadline for CLIN abstracts by one week. The new, final deadline is June 20th. Below you can find the original call for abstracts with a modified date.
Website: https://clin35.ccl.kuleuven.be/
We invite submissions for CLIN35, the 35th edition of the Computational Linguistics in the Netherlands (CLIN) conference, which will take place in Leuven on September 12th, 2025.
Abstracts describing theoretical or applied research in any area of computational linguistics and natural language processing are welcome. We especially encourage submissions related to the Dutch language, but contributions on other languages and multilingual approaches are equally welcome. Abstracts must be written in English and should not exceed 500 words.
Submissions should include:
* Name and affiliation of each author
* Contact details
* Presentation title and short abstract (max. 500 words)
* Keywords
* Your presentation format preference (We will do our best to accommodate your preference but may need to make changes to provide a well-balanced program)
Abstracts must be submitted via the form on the website<https://clin35.ccl.kuleuven.be/call-for-abstracts> by Friday, 20th of June 2025. Notifications of acceptance will be sent out by Friday, 4th of July 2025. Accepted abstracts will be presented at the conference as oral or poster presentations. Authors with accepted abstracts will also have the opportunity to submit a full paper after the conference for publication in the CLIN Journal<https://www.clinjournal.org/clinj/>.
Please share this call with your interested colleagues and network! For any questions you can reach us at this email address (clin35(a)kuleuven.be<mailto:clin35@kuleuven.be>).
We look forward to your submissions and to welcoming you to CLIN35!
CLIN35 local organizers
******************************************************
********* EVALITA 2026: Call for tasks *********
******* NEW DEADLINES and TIMELINE ******
******************************************************
EVALITA 2026 is an initiative of AILC (Associazione Italiana di Linguistica
Computazionale, https://www.ai-lc.it/).
As in the previous editions (https://www.evalita.it/), EVALITA 2026 will be
organized along a few selected tasks, which provide participants with
opportunities to discuss and explore both emerging and traditional areas of
Natural Language Processing and Speech. Participation is encouraged for teams
working both in academic institutions and industrial organizations.
TASK PROPOSAL SUBMISSION
Task proposals should be no longer than 4 pages and should include:
- task title and acronym;
- names and affiliation of the organizers (minimum 2 organizers);
- brief task description, including motivations and state of the art;
- explanation of the international relevance of the task;
- description and examples of the data, including information about their
availability, development stage, and issues concerning privacy and data
sensitivity. The examples are mandatory because they are intended to give
potential participants an idea of what the task data will look like, how
it’ll be formatted, etc.;
- expected number of participants and attendees;
- names and contact information of the organizers.
We also accept the re-annotation/expansion of datasets from previous years
and previous challenges with new annotation levels, and texts from publicly
available corpora. However, test annotations must be new and unpublished,
as participants must not have access to the test data annotations until the
end of EVALITA campaign. For new tasks, organizers must specify in the
proposal why it would attract a reasonable number of participants, and why
it is needed. For re-runs, organizers must describe the element of novelty
from previous challenges.
In submitting your proposal, please bear in mind that we strongly encourage:
- tasks that pose non-trivial challenges and stimulate the creation of
innovative systems (i.e., that integrate linguistic insights or external
knowledge sources), rather than being easily addressed by off-the-shelf LLM
prompting techniques;
- tasks focused on multimodality, e.g., considering both textual and
visual or any other modality;
- tasks characterized by different levels of complexity, e.g., with a
straightforward main subtask and one or more sophisticated additional
subtasks;
- to consider providing competitive baselines (e.g., small-scale LLMs in
zero-shot setups), which participants are expected to improve upon, in
order to encourage the design of advanced solutions;
- application-oriented tasks, that is, tasks that have a clearly defined
end-user application showcasing;
- multilingual tasks, i.e. with data both in Italian and in other
languages;
- industrial tasks, i.e. tasks with real data provided by companies.
The organizers of the accepted tasks should take care of planning,
according to the scheduled deadlines (see below):
- the development and distribution of datasets needed for the contest,
i.e. data for training and development, and data for testing; the scorer to
be used to evaluate the submitted systems should be included in the release
of development data;
- the development of task guidelines, where all the instructions for the
participation are made clear, together with a detailed description of data
and evaluation metrics applied for the evaluation of the participants'
results;
- the collection of participants' results;
- the evaluation of participants' results according to standard metrics
and baseline(s);
- the solicitation of participation and submissions;
- the reviewing process of the papers describing the participants'
approach and results (according to the template to be made available by the
EVALITA 2026 chairs);
- the production of a paper describing the task (according to the template
to be made available by the EVALITA 2026 chairs).
*** Email your proposal in PDF format to evalitacampaign(a)gmail.com with
"EVALITA 2026 TASK Proposal" as the subject line by the submission
deadline: July 28th 2025. ***
Please feel free to contact the EVALITA 2026 chairs at
evalitacampaign(a)gmail.com in case of any questions or suggestions.
Deadlines of the task proposal:
- July 28th 2025 (previously July 21st 2025): submission of task proposals
- August 7th 2025 (previously July 31st 2025): notification of task proposal acceptance
Timelines of EVALITA 2026:
- 22nd September 2025: development data available to participants
- 3 - 17th November 2025: evaluation windows
- 28th November 2025: assessments returned to participants
- 15th December 2025: final reports (from participants) due to task
organizers
- 22nd December 2025: final reports (from task organizers) due to EVALITA
chairs
- 19th January 2026: review deadline
- 2nd February 2026: camera-ready version deadline
- 26 - 27th February 2026: final workshop in Bari
EVALITA 2026 CHAIRS
Francesco Cutugno (Università di Napoli)
Alessio Miaschi (Istituto di Linguistica Computazionale “A. Zampolli” - CNR)
Alessio Palmero Aprosio (Università di Trento)
Giulia Rambelli (Università di Bologna)
Lucia Siciliani (Università di Bari)
Marco Antonio Stranisci (Università di Torino)
FURTHER INFORMATION
Website: https://www.evalita.it/campaigns/evalita-2026/call-for-tasks/
Mail: evalitacampaign(a)gmail.com
Marco,
UNITO <https://www.unito.it/persone/mstranis> and aequa-tech
<https://aequa-tech.com/>
The UKP Lab at the Department of Computer Science, Technical University Darmstadt, Germany, is looking for
*** two fully funded PhD Students and/or Postdocs ***
for an exciting project in machine-generated text detection. This is a unique opportunity to join the UKP Lab at the intersection of AI Safety, Natural Language Processing and Machine Learning. If you're excited about shaping the future of Large Language Models, AI Agents, human-AI interaction, building novel prototypes, and publishing at top-tier venues of NLP, ML and AI, we’d love to hear from you.
🔗 More information:
https://www.informatik.tu-darmstadt.de/ukp/ukp_home/jobs_ukp/2025_phd_ukp.e…
📩 Apply here:
https://careers.ukp.informatik.tu-darmstadt.de/ukprecruitment
📅 Application deadline: June 29th, 2025
--------------------------------------------------------------------
Prof. Dr. Iryna Gurevych
UKP Lab
Technical University Darmstadt, Germany
http://www.ukp.tu-darmstadt.de/
Third call for papers: Sixth Workshop on Resources for African
Indigenous Languages (RAIL)
Co-located with DHASA 2025
https://sadilar.org/rail-2025/
RAIL Workshop date: 10 November 2025
DHASA Conference dates: 10-14 November 2025
Venue: CSIR International Convention Centre.
The sixth RAIL workshop website: https://sadilar.org/rail-2025/
DHASA website: https://digitalhumanities.org.za/
The sixth Resources for African Indigenous Languages (RAIL) workshop
will be co-located with the Digital Humanities Association of Southern
Africa (DHASA) 2025 conference at the CSIR International Convention
Centre in Pretoria, South Africa, on 10 November 2025. The RAIL
workshop is an interdisciplinary platform for researchers working on
African indigenous language resources such as natural language
processing (NLP) tools, Human Language Technologies (HLT), data
collections, and annotations. This workshop aims to foster a
scientific community of practice that focuses on computational
linguistic tools and data that are designed for or applied to the
indigenous languages of Africa.
Many African languages are under-resourced while only a few are
considered to be somewhat better resourced. These languages often share
interesting properties such as writing systems, making them different
from most high-resourced languages. From a computational perspective,
these languages lack sufficient corpora to support high-level development
of NLP and HLT tools, which in turn impedes the development of African
languages in these areas. During previous workshops, it was noted that
the problems and solutions presented were not only applicable to
African languages but were also relevant to many other low-resource
languages across the world. Because these languages share similar
challenges, this workshop provides researchers with opportunities to
work collaboratively on issues of language resource development and
learn from each other.
The RAIL workshop has several aims. First, the workshop brings together
researchers who work on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Language resources in the age of large language
models” as its theme, but submissions on any topic related to
properties of African indigenous languages (including related non-
African languages) may be accepted. Suggested topics include (but are
not limited to) the following:
* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous
languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African
indigenous languages
* Revealing unknown or unpublished existing resources for African
indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African
indigenous language resources
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content plus additional pages of references. The
final camera-ready versions of accepted long papers are allowed one
additional page of content (up to 9 pages) so that reviewers’ feedback
can be incorporated. Papers should be formatted according to the DHASA
style sheet which is provided on the Journal of the Digital Humanities
Association of Southern Africa website
(https://upjournals.up.ac.za/index.php/dhasa/about). Reviewing is
double-blind, so make sure to anonymise your submission (e.g., do not
provide author names, affiliations, project names, etc.). Limit the
number of self-citations (anonymised citations should not be used). The
RAIL workshop follows the DHASA submission requirements.
Please submit papers in PDF format (the submission link will be
available soon). Accepted papers will be published in proceedings
linked to the DHASA conference.
Important dates:
Submission deadline: 14 July 2025
Date of notification: 16 September 2025
Camera ready copy deadline: 24 October 2025
Workshop: 10 November 2025
DHASA conference: 10 November 2025-14 November 2025
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
Third call for papers DHASA Conference 2025
https://dh2025.digitalhumanities.org.za
Theme: The role of humanities in digital humanities and artificial
intelligence
The Digital Humanities Association of Southern Africa (DHASA) is
pleased to announce its fifth conference, focusing on the theme “The
role of humanities in digital humanities and artificial intelligence”.
In a region where the field of Digital Humanities is still relatively
underdeveloped, this conference aims to address this gap and foster
growth and collaboration in the field. The conference offers an
opportunity for researchers interested in showcasing their work in the
broad field of Digital Humanities to come together. By doing so, the
conference provides a comprehensive overview of the current state-of-
the-art in Digital Humanities, particularly within the Southern Africa
region. As such, we welcome submissions related to Digital Humanities
research conducted by individuals from Southern Africa or research
focused on the geographical area of Southern Africa in the broad sense.
Furthermore, the conference serves as a platform for information
sharing and networking among researchers passionate about Digital
Humanities. By bringing together experts working on Digital Humanities
in Southern Africa or with a focus on Southern Africa, we aim to
promote collaboration and facilitate further research in this dynamic
field. In addition to the main conference, affiliated workshops and
tutorials will be organised, providing researchers with valuable
insights into novel technologies and tools. These supplementary events
are designed for researchers interested in specific aspects of Digital
Humanities or seeking practical information to enter or advance their
knowledge in the field.
The DHASA conference welcomes interdisciplinary contributions from
researchers in various domains of Digital Humanities, including, but
not limited to, language, literature, visual art, performance and
theatre studies, media studies, music, history, sociology, psychology,
language technologies, library studies, philosophy, methodologies,
software and computation, AI, and more. Our goal is to cultivate an
inclusive scientific community of practice within Digital Humanities.
Suggested topics include the following:
* The role of AI in digital humanities, the role of Digital Humanities
in shaping AI, and the broader role of the humanities in both AI and DH
projects;
* Digital archives and the preservation of marginalised voices;
* Intersectionality and the digital humanities: exploring the
intersections of race, gender, sexuality, culture, and class in digital
research and activism;
* Activism and social change through digital media: how digital
humanities tools and methodologies can be used to promote inclusion;
* Engaging marginalised communities in the creation and use of digital
tools, resources, and AI;
* Exploring the role of digital humanities in decolonising knowledge
and promoting indigenous perspectives;
* The ethics of data collection and analysis in digital humanities and
AI research;
* The role of digital humanities and AI in promoting inclusive and
equitable pedagogy;
* Digital humanities and inclusion in the context of African and global
perspectives and international collaborations;
* Critical approaches to digital humanities and inclusion: examining
the limitations and possibilities of digital tools and methodologies in
promoting inclusion;
* Collaborative digital humanities projects with non-profit
organisations, community groups, and cultural institutions;
* Development of digital and AI tools for supporting digital
humanities;
* Novel utilisation of digital and AI tools for performing digital
humanities research;
* The role of digital humanities in the classroom: reimagining literacy
and AI fluency;
* Digital humanities data and project management;
* The role of librarians in digital humanities projects;
* Any other digital humanities-related topic that serves the Southern
African community.
Submission Guidelines
The DHASA conference 2025 asks for three types of submissions:
* Long papers: Authors may submit long papers with a maximum of 8
content pages and unlimited pages for references and appendices. The
final versions of accepted long papers will be granted an additional
page (leading to a total of up to 9 content pages) to incorporate
reviewers' comments. Long papers accepted for the conference will be
presented in 30-minute time slots (which includes 10 minutes for
questions).
* Short papers: Authors may submit short papers with a maximum of 5
content pages and unlimited pages for references and appendices. The
final versions of accepted short papers will be allowed an extra page
(leading to a total of up to 6 content pages) to accommodate reviewers'
comments. Short papers accepted for the conference will be presented in
15-minute time slots (which includes 5 minutes for questions).
* Executive summaries: Authors can submit an executive summary for work
in progress, limited to 1 page. Executive summaries accepted for the
conference will be presented as posters during a dedicated poster
presentation slot.
All accepted long and short paper submissions that are presented at the
conference will be published in the JDHASA journal, see
https://upjournals.up.ac.za/index.php/dhasa. In addition, the executive
summaries for the poster presentations will be published in a book of
executive summaries before the conference.
We particularly encourage student submissions where the first author is
a student.
All submissions should adhere to the ACL style guide:
https://acl-org.github.io/ACLPUB/formatting.html
Submissions should be submitted in PDF format. Submissions that do not
adhere to the prescribed style guide will be rejected.
Follow this link to go to the submission platform:
https://dh2025.digitalhumanities.org.za/submission/
Authors are encouraged to upload their datasets to the SADiLaR
repository: https://repo.sadilar.org/. In case of difficulties
uploading the datasets, please reach out to Benito Trollip
(benito.trollip(a)nwu.ac.za).
Important dates
Submission deadline: 14 July 2025
Date of notification: 16 September 2025
Camera-ready copy deadline: 24 October 2025
Conference: 10 November 2025 - 14 November 2025
Conference venue: CSIR ICC, Pretoria, South Africa
Co-located events
Several co-located events are currently being prepared, including
workshops and tutorials. These will be updated on the conference
website.
Organising Committee
Aby Louw, Council for Scientific and Industrial Research
Andiswa Bukula, South African Centre for Digital Language Resources
Avi Moodley, Council for Scientific and Industrial Research
Franco Mak, Council for Scientific and Industrial Research
Franziska Pannach, Rijksuniversiteit Groningen
Ilana Wilken, Council for Scientific and Industrial Research
Johannes Sibeko, Nelson Mandela University
Juan Steyn, South African Centre for Digital Language Resources
Laurette Marais, Council for Scientific and Industrial Research
Marissa Griesel, South African Centre for Digital Language Resources
Menno van Zaanen, South African Centre for Digital Language Resources
Privolin Naidoo, Council for Scientific and Industrial Research
Sthembiso Mkhwanazi, Council for Scientific and Industrial Research
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
WINLP 2025 WORKSHOP
The Widening NLP (WiNLP) workshop aims to foster an inclusive
environment that highlights the contributions of researchers from
underrepresented groups in NLP. Anyone who self-identifies as being from
an underrepresented background--based on gender, ethnicity, nationality,
sexual orientation, disability, or otherwise--is encouraged to submit.
In 2025, WiNLP will continue placing emphasis on access, disability, and
diversity across scientific backgrounds, disciplines, training, and
underrepresented languages.
Our annual Widening Natural Language Processing Workshop (WiNLP) will be
held in conjunction with EMNLP 2025 in Suzhou, China. Since EMNLP is
anticipating a hybrid format for their conference, we also anticipate
our workshop will be hybrid, with both online and in-person attendees.
The one-day workshop will occur during EMNLP's workshop period with an
exact date to be announced soon.
The full-day event includes invited talks, oral presentations, and
poster sessions. The workshop provides an excellent opportunity for
junior members in the community to showcase their work and connect with
senior mentors for feedback and career advice. It also offers
recruitment opportunities with leading industrial labs. Most
importantly, the workshop will provide an inclusive and accepting
space, and work to lower structural barriers to joining and
collaborating with the NLP community at large.
Submission guidelines are available at:
https://www.winlp.org/call-for-submissions-2025/
PRE-SUBMISSION MENTORSHIP PROGRAM
WiNLP offers an optional pre-submission mentorship program to help
authors improve the quality of their writing and presentation before
final submission. The program focuses on enhancing the clarity and
structure of the paper, not critiquing the research content.
* Submission: Authors must submit a draft of their paper via the
designated Google Form (https://forms.gle/J33K2ea6VruN82ke9) by June 20,
2025. The draft should adhere to the same formatting and length
guidelines as final submissions.
* Mentor Assignment: Organizers will check the draft for compliance
with formatting requirements before assigning a mentor. The mentor will
not be involved in reviewing the final submission.
* Feedback: Mentors will provide feedback by July 18, 2025, offering
suggestions to improve writing and presentation. Authors are encouraged
to incorporate this feedback before the final submission deadline.
* Non-Anonymous: The mentorship process is not anonymized.
* Final Submission: Authors who participate in the mentorship program
should submit their final paper as a new submission via OpenReview by
August 1, 2025 to be considered for the WiNLP workshop. Participation in
the mentorship program is not a prerequisite for submitting a paper to
WiNLP.
TRAVEL SUPPORT
WiNLP offers a limited number of travel grants to support one author per
accepted submission. Grants may cover expenses such as registration,
travel, lodging, or visa costs. Funded authors may choose to attend
virtually if preferred.
* Travel grant application deadline: September 26, 2025
* Notification: October 6, 2025
* Eligibility: One author per accepted submission is eligible. The
funded author must be identified in the travel grant application form.
Additional funding for virtual attendance by other authors may be
considered if surplus funds are available, but in-person attendance for
additional authors is not guaranteed. Travel expenses are handled via
reimbursement (primarily through USD check or PayPal). Authors unable to
front travel costs should contact the organizers early to discuss
alternatives.
Authors are encouraged to explore local funding options (e.g.,
institutional support) to maximize the reach of WiNLP's limited funds.
We recommend additional student authors keep an eye out for the EMNLP
call for student volunteers or call for D&I subsidies as opportunities
for further funding.
IMPORTANT DATES
All deadlines are 11:59 PM UTC-12:00 "Anywhere on Earth"
* Pre-submission mentoring deadline: June 20, 2025
* Pre-submission feedback returned: July 18, 2025
* Paper submission deadline: August 1, 2025
* Acceptance notifications: September 15, 2025
* Camera-ready deadline: October 1, 2025
* Travel grant applications due: September 26, 2025
* Travel grant notifications: October 6, 2025
CONTACT INFORMATION
Website: https://www.winlp.org/call-for-submissions-2025/
Twitter: @winlpworkshop [1]
Facebook: Widening NLP [2]
LinkedIn: Widening NLP [3]
E-mail: winlp-chairs(a)googlegroups.com
Links:
------
[1] https://twitter.com/WiNLPWorkshop
[2] https://www.facebook.com/WideningNLP
[3] https://www.linkedin.com/company/winlp
[CFP] - (R2LM) From Rules to Language Models: Comparative Performance Evaluation @ RANLP 2025 (Varna, Bulgaria) - 11-13 September 2025
https://r2lm2025.github.io/R2LM/
Workshop Description
Deep learning (DL) and large language models (LLMs) have driven major advances in natural language processing (NLP), enabling impressive performance across many tasks. However, they continue to face key challenges in handling complex linguistic phenomena such as multiword expressions, long-context reasoning, and robustness to adversarial inputs. In parallel, concerns remain about the scalability, interpretability, and domain adaptability of these models, particularly in applications requiring high precision, such as grammar checking, legal analysis, or medical NLP. These limitations have sparked renewed interest in rule-based and knowledge-based approaches, which often offer better explainability and remain competitive, especially in low-resource or high-stakes scenarios.
Our workshop aims to gather contributions that deal with the following topics:
• Role of rule-based and knowledge-based NLP methods in modern applications
• Comparative analysis of rule-based, machine-learning, deep-learning and large language models for different NLP tasks
• Emerging trends in NLP research beyond deep learning and Large Language Models
• Limitations and performance bottlenecks in scalability and accuracy of deep learning models
Submission Details
• Long papers: up to 8 pages (excluding references)
• Short papers: up to 4 pages (excluding references)
• Format: ACL-style (LaTeX or MS Word)
• Submission portal and template info available on the RANLP 2025 website
Important dates
Paper Submission Deadline: 6 July 2025
Notification of Acceptance: 31 July 2025
Workshop date: 11, 12 or 13 September 2025
Organising Committee:
Alicia Picazo-Izquierdo, University of Alicante, Spain
Ernesto Luis Estevanell-Valladares, University of Alicante, Spain
Rafael Muñoz Guillena, University of Alicante, Spain
Ruslan Mitkov, Lancaster University, UK
Raúl García Cerdá, University of Alicante, Spain
The Marseille Computer Science and Systems Laboratory (https://www.lis-lab.fr/) is seeking a candidate for a three-year thesis grant as part of the ANR Cre@lame project, in collaboration with the University of Turku in Finland.
The subject concerns the modeling of the literary writing and revision process carried out by authors. The starting point is an already written text, which is to be revised in the manner of an author. The task is framed as predicting edit operations: the model takes the original text as input and outputs the edit operations to apply. These can concern the lexicon, syntax or textual organization.
The thesis's problem is structured around three directions.
The first is the nature of the object produced by the prediction process, which could take the form of a sequence of edit operations or a more complex form, such as a graph. The prediction model itself will depend largely on the nature of the predicted object.
The second concerns data. Revision data, which associates revision operations with a text, is not very common in general, and data on literary revision is even rarer. We will rely on all available data and, possibly, produce additional data using language models, in order to train the revision models.
The final direction concerns evaluation. Given an original text and a revised version, how can we judge the quality of the latter? And how can we assess whether the changes made to the original text are consistent with the objectives of the revision process?
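As a rough illustration of what a "sequence of edit operations" could look like, the sketch below derives word-level operations from an (original, revised) pair using Python's standard difflib module. This is only an assumption about one possible representation, not the project's method: the function name edit_operations and the word-level granularity are hypothetical choices made for the example.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# One edit operation: (tag, words removed from the original, words added in the revision).
# The tag is one of "replace", "delete" or "insert", as produced by difflib.
EditOp = Tuple[str, List[str], List[str]]


def edit_operations(original: str, revised: str) -> List[EditOp]:
    """Return the word-level edit operations that turn `original` into `revised`.

    Hypothetical helper for illustration: whitespace tokenisation is a
    simplification, and unchanged spans are dropped from the output.
    """
    src = original.split()
    tgt = revised.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    ops: List[EditOp] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only the spans that actually changed
            ops.append((tag, src[i1:i2], tgt[j1:j2]))
    return ops


if __name__ == "__main__":
    for op in edit_operations("the cat sat", "the dog sat"):
        print(op)
```

Pairs of this kind (text, operations) are one conceivable form of the revision data mentioned above; richer structures such as graphs would need a different representation.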
We are looking for candidates with a strong background in machine learning, mainly in deep learning, as well as knowledge in Natural Language Processing.
Application deadline: June 22
Contacts: Patrice Bellot (patrice.bellot(a)univ-amu.fr), Christophe Leblay (chrleb(a)utu.fi) and Alexis Nasr (alexis.nasr(a)univ-amu.fr)
Dear Colleagues,
The evaluation period for the brand new Model Compression track
<https://www2.statmt.org/wmt25/model-compression.html> at WMT 2025
<https://www2.statmt.org/wmt25/index.html> is approaching!
LATEST ANNOUNCEMENTS:
- Test data release brought forward to June 19, 2025! Participants now
  have two full weeks to prepare their submissions.
- Submission upload space available upon request (see the task’s page for
  details)
OVERVIEW
This shared task aims to evaluate the potential of model compression
techniques in reducing the size of large, general-purpose language models,
with the goal of achieving an optimal balance between practical
deployability and high translation quality in specific machine translation
(MT) scenarios. The broader objectives of the task include:
- fostering research into efficient, accessible, and sustainable
  deployment of LLMs for MT;
- establishing a common evaluation framework to monitor progress in model
  compression across a wide range of languages; and
- enabling meaningful comparisons with state-of-the-art MT systems through
  standardized evaluation protocols that assess not only translation quality
  but also efficiency.
Although the focus is on model compression, the task is closely aligned
with the General MT shared task
<https://www2.statmt.org/wmt25/translation-task.html>, sharing language
directions, test data, and protocols for automatic MT quality evaluation.
Additionally, the task follows the same timeline as the flagship WMT task.
We warmly invite participation from academic teams and industry players
interested in applying existing compression methods to MT or exploring
innovative, cutting-edge approaches.
THE TASK IN A NUTSHELL
- Goal: Reduce the size of a general-purpose LLM while maintaining a
  balance between model compactness and MT performance.
- Languages: The first round will focus on the same language pairs as the
  General MT track.
- Conditions:
  - Constrained: Participants work within a predefined model and language
    setting for directly comparable results.
  - Unconstrained: Participants are free to compress any model across
    language directions of their choice.
- Evaluation Criteria:
  - Translation quality: Automatically measured using the LLM-as-a-judge
    framework from the General MT task
  - Model size: Defined by the memory usage
  - Inference speed: Measured by total processing time over the test set
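For participants who want a quick local estimate of the two efficiency criteria, a minimal sketch follows. It is not the official evaluation protocol: the helper names are hypothetical, the translate callable stands in for whatever system is being tested, and approximating model size by weight storage (parameters times bits per parameter) is an assumption, since the task defines size by memory usage without further detail here.

```python
import time
from typing import Callable, Iterable, List, Tuple


def timed_translation(
    translate: Callable[[str], str], sources: Iterable[str]
) -> Tuple[List[str], float]:
    """Translate every source segment and return (outputs, total seconds).

    `translate` is a placeholder for the system under test (assumption).
    """
    start = time.perf_counter()
    outputs = [translate(segment) for segment in sources]
    elapsed = time.perf_counter() - start
    return outputs, elapsed


def weight_size_mib(num_parameters: int, bits_per_parameter: int) -> float:
    """Approximate weight storage in MiB at a given precision.

    A proxy only: real memory usage also includes activations, caches
    and runtime overhead, none of which are counted here.
    """
    return num_parameters * bits_per_parameter / 8 / (1024 ** 2)


if __name__ == "__main__":
    outputs, seconds = timed_translation(str.upper, ["hello", "world"])
    print(outputs, f"{seconds:.6f}s")
    print(weight_size_mib(1024 * 1024, 8), "MiB")
```

Comparing the same model at 16, 8 and 4 bits per parameter with weight_size_mib gives a first idea of the size reduction that quantisation alone could bring.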
IMPORTANT DATES (UPDATED)
- Test data released: 19th June 2025 (brought forward from 26th June 2025)
- Translation submission deadline: 3rd July 2025
- System description abstract paper: 10th July 2025
- System description submission: 14th August 2025
WEBSITE: https://www2.statmt.org/wmt25/model-compression.html
ORGANIZERS:
- Marco Gaido, Fondazione Bruno Kessler
- Matteo Negri, Fondazione Bruno Kessler
- Roman Grundkiewicz, Microsoft Translator
- TG Gowda, Microsoft Translator
CONTACTS:
- Marco Gaido - mgaido(a)fbk.eu
- Matteo Negri - negri(a)fbk.eu
🎓 We are happy to announce the next webinar in the CIRCE online
seminar series organized by the CIRCE <https://www.circe-project.eu/>
project in collaboration with DFCLAM University of Siena
<https://www.dfclam.unisi.it/en>, H2IOSC <https://www.h2iosc.cnr.it/>
project and CNR-ILC <https://www.ilc.cnr.it/en/>.
*Dr. Giuliana Regnoli*
/University of Salerno, Italy & University of Regensburg, Germany/
*/Unveiling linguistic bias: Approaches to accent perception and
discrimination/*
📅 *May 26, 2025*
🕓 *4:40 PM – 5:30 PM (CEST)*
*Venue*: Online
*Attendees*: Secondary school teachers, researchers, language instructors
*Summary: *Accent discrimination remains one of the most pervasive forms
of linguistic bias, influencing social perceptions, identity
construction, and attitudes towards language variation. This talk
examines how accents shape linguistic hierarchies and social
interactions, drawing on three research projects that employ distinct
methodologies. First, we will explore how folk linguistic methods, such
as map-drawing tasks, reveal nuanced spatial dimensions of language
attitudes, challenging homogenising conceptualisations of World
Englishes. This will be illustrated through a study on how a
first-generation Indian diasporic community in Germany perceives and
evaluates accent variation in Indian English. We will then turn to
traditional language attitude research methods, focusing on
questionnaire data to investigate overt stigmatisations and highlighting
the importance of scale validation in direct attitude measurement. This
discussion will be grounded in a pilot study on Italian university
students’ direct attitudes towards English in Italy and their
perceptions of Italian English. Finally, we will examine language
attitudes in primary education in Cameroon, emphasising the importance
of understanding children’s language perceptions within broader
ideological frameworks. This analysis draws on data from parental and
children’s questionnaires, as well as semi-structured interviews with
children. By shedding light on early linguistic gatekeeping and its role
in decolonising language education, this study also explores when and
how these beliefs become embedded in society. Taken together, these
projects demonstrate how different methodological approaches can be
employed to investigate attitudes towards accents and linguistic
variation, ultimately providing insights into how we can better
understand and tackle accent discrimination.
*Bio*: Dr. Giuliana Regnoli is assistant professor of English
linguistics at the University of Salerno and a postdoctoral research
fellow at the University of Regensburg. Her research interests include
variationist sociolinguistics, sociophonetics, language attitudes,
perceptual dialectology, and World Englishes. She is currently working
on children's English in Cameroon and Italian university students'
attitudes toward English(es) world-wide.
Upcoming webinars:
- Clara Molina (Monday, June 30, 2025)
- Sender Dovchin (Monday, July 7, 2025)
- Christian Ilbury (Monday, September 22, 2025)
The seminar is free of charge, but participants must register. To access
this and upcoming events, you should create an account on the H2IOSC
Training Environment
<https://h2iosc-training-platform.ilc4clarin.ilc.cnr.it/registration>.
Once logged in with your credentials, choose the course “Language and
Accent Discrimination - Online Seminar Series” and activate it with the
code PbK837GtE. Make sure to have the Teams platform installed.
Recordings of the previous CIRCE seminars are also available on
the H2IOSC Training Environment. For any inquiry, write to
contact(a)circe-project.eu.
*BAREC Shared Task 2025
<https://barec.camel-lab.com/…>*
*Arabic Readability Assessment*
*The Third Arabic Natural Language Processing Conference (ArabicNLP 2025)
<https://arabicnlp2025.sigarab.…>*
@
*EMNLP 2025
<https://2025.emnlp.org/>*
We are excited to announce the BAREC Shared Task
<https://barec.camel-lab.com/…>
2025
on fine-grained readability classification across 19 levels using the
Balanced Arabic Readability Evaluation Corpus (BAREC), a dataset of over 1
million words. Participants will build models for both sentence- and
document-level classification.
*Task 1: Sentence-level Readability Assessment*
Given an Arabic sentence, predict its readability level on a scale from 1
(i.e., first grade) to 19 (i.e., university level), indicating the degree
of reading difficulty.
*Task 2: Document-level Readability Assessment*
Given a document consisting of multiple sentences, predict its readability
level on a scale from 1 to 19, where the hardest sentence in the document
(i.e., the one with the highest readability level) determines the overall
document readability level.
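The document-level rule stated above can be sketched in a few lines of Python. This is an illustrative reading of the task description, not the official scoring script; the function name document_level and the range check are assumptions added for the example.

```python
from typing import Sequence

# Readability scale given in the task description: 1 (first grade) to 19 (university level).
MIN_LEVEL, MAX_LEVEL = 1, 19


def document_level(sentence_levels: Sequence[int]) -> int:
    """Return the document readability level: the level of its hardest sentence.

    Hypothetical helper; raises ValueError for an empty document or for
    a sentence level outside the 1-19 scale.
    """
    if not sentence_levels:
        raise ValueError("a document needs at least one sentence level")
    for level in sentence_levels:
        if not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(f"level {level} is outside {MIN_LEVEL}-{MAX_LEVEL}")
    return max(sentence_levels)


if __name__ == "__main__":
    print(document_level([3, 12, 7]))
```

Under this reading, a system that already predicts sentence levels can obtain a document-level baseline simply by taking the maximum over its sentence predictions.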
For each task, there will be three tracks, allowing different data sources
for training: Strict, Constrained, and Open.
*Important Dates:*
All deadlines are 11:59pm UTC-12 (anywhere on Earth):
- *June 10, 2025:* Release of training, dev and open test data, and
evaluation scripts.
- *July 20, 2025:* Registration deadline and release of test data.
- *July 25, 2025:* End of evaluation cycle (test set submission closes).
- *July 30, 2025: *Final results released.
- *August 15, 2025:* System description paper submissions due.
- *August 25, 2025: *Notification of acceptance.
- *September 5, 2025:* Camera-ready versions due.
*Awards:*
- *Top-performing Systems:*
- We will recognize the top-performing system in each of the six
task-track combinations (2 tasks × 3 tracks), with a *$100* prize
per winning team.
- *Best System Description Papers:*
- We will award one or two prizes for Best System Description Papers.
These will recognize clarity, reproducibility, and insight, regardless of
leaderboard ranking:
- Best Paper: *$250*
- Runner-up or Honorable Mention: *$150*
*Organizers:*
- *Khalid N. Elmadani
<https://khalid-elmadani.gith…>*:
New York University Abu Dhabi
- *Bashar Alhafni
<https://www.basharalhafni.com/…>*:
New York University Abu Dhabi and Mohamed bin Zayed University of
Artificial Intelligence
- *Hanada Taha-Thomure
<https://hanadataha.com/>*:
Zayed University
- *Nizar Habash
<https://www.nizarhabash.com/>*:
New York University Abu Dhabi
*Shared Task Website: *https://barec.camel-lab.com/sharedtask2025
*Contact:*
For any questions related to this task, check out the *FAQs
<https://barec.camel-lab.com/…>*.
Feel free to post your questions on our *Slack workspace
<https://join.slack.com/t/barec…>*.
You are also welcome to contact the organizers directly at this email
address: barec25.organizers(a)camel-lab.com.
Dear Corpora members,
following our previous Calls for Papers, we would like to inform you of
a small update to the submission policies:
*The maximum length of supplementary material has been extended to 3 pages.*
For any additional updates regarding the conference, please visit the
website: https://clic2025.unica.it/
The full text of the updated CfP can be found below.
Best regards,
The CLiC-it chairs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CLiC-it 2025 – Eleventh Italian Conference on Computational Linguistics
24 – 26 September 2025, Cagliari, Italy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Over the years, CLiC-it has evolved into an important forum for the
Italian community of researchers in Computational Linguistics (CL) and
Natural Language Processing (NLP). CLiC-it aims to promote and
disseminate high-quality, original research covering different aspects
of automatic language processing, involving both written and spoken
language. Furthermore, it seeks to showcase cutting-edge theoretical
findings, experimental methodologies, technologies, and application
perspectives.
The spirit of the conference is inclusive. Recognizing the multifaceted
nature of language phenomena and the need for interdisciplinary
expertise, CLiC-it aims to bring together researchers from different
fields including Computational Linguistics and Natural Language
Processing, Linguistics, Cognitive Science, Machine Learning, Computer
Science, Knowledge Representation, Information Retrieval, and Digital
Humanities. CLiC-it welcomes contributions focusing on all languages,
with a particular emphasis on Italian.
CLiC-it 2025 will be held in Cagliari, from the 24th to the 26th of
September. CLiC-it is organised by the Italian Association of
Computational Linguistics (AILC — http://www.ai-lc.it/).
➢ CONFERENCE TOPICS
CLiC-it 2025 aims to have a broad technical program. Relevant topics for
the conference include, but are not limited to (in alphabetical order):
Computational Historical Linguistics
Computational Social Science and Cultural Analytics
Dialogue and Interactive Systems
Discourse and Pragmatics
Ethics and NLP
Generation
Handwritten Text Recognition
Information Extraction
Information Retrieval and Text Mining
Interpretability and Analysis of Models for NLP
Language Grounding to Vision, Robotics and Beyond
Large Language Models
Linguistic Diversity
Linguistic Theories, Cognitive Modeling, and Psycholinguistics
Machine Learning for NLP
Machine Translation
Multilingualism and Cross-Lingual NLP
NLP Applications
NLP for the Humanities
Phonology, Morphology, and Word Segmentation
Pragmatics and Creativity
Question Answering
Resources and Evaluation
Semantics: Lexical, Sentence-level Semantics, Textual Inference,
and Other Areas
Sentiment Analysis, Stylistic Analysis, and Argument Mining
Speech and Multimodality
Summarization
Syntax: Tagging, Chunking and Parsing
➢ RESEARCH COMMUNICATION
CLiC-it 2025 adopts a parallel submission policy for outstanding papers
accepted in 2024 and 2025 by major publication venues, namely the major
international CL conferences (workshops excluded) or international
journals. These contributions can be submitted to CLiC-it 2025 as short
research communications. Research communications will not be published
in the conference proceedings; they serve primarily to promote the
dissemination of high-quality research within the Italian CL community.
Submitted research communications must be in the scope of the CLiC-it
2025 conference.
The authors of papers that meet the above criteria are invited to submit
a written abstract (one page maximum) of the original paper, including
the paper’s title and authors as well as a pointer to the original
conference or journal where the paper was published. If needed, research
communications will undergo a selection process overseen by the
conference chairs. Since these papers have already been reviewed, the
selection criteria will primarily consider their original publication
venue. Priority will be granted to papers that align most closely with
the conference program, ensuring a balanced representation across
various conference topics. The research communication papers will be
presented at the conference either orally or as a poster according to
the number of submissions received.
➢ PAPER SUBMISSION
Submitted papers must describe substantial, original, completed, and
unpublished work. Wherever appropriate, concrete evaluation and analysis
should be included.
CLiC-it 2025 allows multiple submissions. If the paper is accepted at
another venue, the authors must communicate this information to the
CLiC-it 2025 Chairs as soon as possible.
Papers may consist of at least six (6) and no more than eight (8) pages
of content and up to three (3) pages of references.
*UPDATE* Supplementary material is also allowed, but it should not
exceed *three (3) pages* in length. Authors are reminded that all
relevant content should be included in the main text of the paper and
that reviewers are not required to evaluate material presented in the
Appendix. In case additional space is needed (e.g. to include prompts,
examples, etc.), external links can be used. Please also note that
sections on limitations and ethical considerations are not mandatory; if
included, they will count toward the page limit.
Upon acceptance, final versions of the papers will be given one
additional page of content so that reviewers’ comments can be taken into
account.
Papers will be evaluated according to the following criteria:
soundness of approach
relevance to computational linguistics
novelty and clarity of relation with related work
quality of presentation
quality of evaluation (if applicable)
verifiability and ability to replicate (if applicable)
Papers can be either in English or Italian, with the abstract in
English. Accepted papers will be published on-line and will be presented
at the conference either orally or as a poster.
All accepted papers must be presented at the conference to appear in the
proceedings.
Reviewing will NOT be blind, so there is no need to remove author
information from manuscripts.
The required template for CLiC-it submissions must be compatible with
CEUR (https://ceur-ws.org/). You can download the conference-adapted
version at the following links:
LaTeX template
https://clic2025.unica.it/wp-content/uploads/2025/02/CLiC-it-2025-template.…
Word template
https://clic2025.unica.it/wp-content/uploads/2025/02/CLiC_it_2025_template.…
Should you encounter any issues with the compilation (as the CEUR
template has historically presented some challenges and is not
modifiable without risking exclusion from the proceedings), we provide a
read-only Overleaf template:
https://www.overleaf.com/read/hzyckyjzwhwb#06b27c
This template can be accessed and cloned to help resolve any technical
difficulties.
Papers and research communications must be submitted through the START
platform using the following link: https://softconf.com/p/clic-it2025
For research communications, the appropriate track should be selected.
➢ AWARDS
To acknowledge the contribution of young researchers to the field, the
title of “best paper” will be awarded to outstanding papers, provided
that a Master’s or PhD student is the first author and presents the work
at the conference. Recipients of this award will be invited to submit an
extended version of their papers to the Italian Journal of Computational
Linguistics (IJCoL).
To recognise excellence in student research as well as promote awareness
of our field, AILC is also conferring the “Emanuele Pianta” prize for
the best Master’s thesis (Laurea Magistrale) in Computational Linguistics
submitted at an Italian University. The prize consists of 500 Euros plus
free membership to AILC for one year and free registration to the
upcoming CLiC-it.
➢ IMPORTANT DATES
16/06/2025 [EXTENDED, previously 09/06/2025] – Paper submission deadline:
regular papers and research communications
21/07/2025 – Notification to authors of reviewing/selection outcome
04/08/2025 – Camera ready version of accepted papers
24-26/09/2025 – CLiC-it 2025 Conference, Cagliari
➢ PEOPLE
Conference Chairs:
Cristina Bosco (University of Torino)
Elisabetta Jezek (University of Pavia)
Marco Polignano (University of Bari)
Manuela Sanguinetti (University of Cagliari)
Senior Program Committee:
Elisa Bassignana (IT University of Copenhagen)
Pierluigi Cassotti (University of Gothenburg)
Simone Conia (University of Rome “La Sapienza”)
Elisa Di Nuovo (Joint Research Centre European Commission – Ispra)
Claudiu Daniel Hromei (University of Rome “Tor Vergata”)
Antonio Origlia (University of Naples “Federico II”)
Ludovica Pannitto (University of Bologna)
Beatrice Savoldi (Fondazione Bruno Kessler)
Gabriele Sarti (University of Groningen)
Lucia Siciliani (University of Bari)
Irene Siragusa (University of Palermo)
Rossella Varvara (University of Turin – University of Pavia)
Alessandro Vietti (University of Bolzano)
Local Organizing Committee:
Maurizio Atzori (DMI, University of Cagliari)
Andrea Loddo (DMI, University of Cagliari)
Alessandro Pani (DMI, University of Cagliari)
Alessandra Perniciano (DMI, University of Cagliari)
Luca Zedda (DMI, University of Cagliari)
Web chairs:
Maurizio Atzori
Andrea Loddo
➢ FURTHER INFORMATION
Mail: clicit2025cagliari(a)gmail.com
Key Deadlines:
Abstract submission for posters: June 15
Registration: June 30
Registrations are still open for WALP 2025! Places are filling up quickly—register now to secure your on-site accommodation.
Dear Colleagues,
WALP 2025 will bring together leading academic and industrial experts from all around the world working at the forefront of atomic layer processing (ALP), aiming to stimulate discussions on recent advances and emerging directions in ALP, particularly those driven by industrial and societal needs. The workshop will cover experimental, computational, and AI/ML-driven research in the development of ALP techniques, materials and their applications in semiconductor CMOS, energy storage (batteries, supercapacitors), and energy conversion (fuel cells, photovoltaics) technologies. WALP 2025 will host industry talks on ALP method development and scale-up for commercial energy and nanoelectronics applications.
WALP 2025 features a distinguished lineup of speakers from academia and industry, including
* Mikko Ritala (University of Helsinki, Finland)
* Louis Piper (WMG, University of Warwick, UK)
* Fred Roozeboom (University of Twente, the Netherlands)
* Anjana Devi (TU Dresden, Germany)
* Jeff Elam (Argonne National Lab, US)
* Seán Barry (Carleton University, Canada)
* Riikka Puurunen (Aalto University, Finland)
* Ralf Tonner-Zech (University of Leipzig, Germany)
* Richard Potter (University of Liverpool, UK)
* Jennifer D'Souza (Leibniz Universität Hannover, Germany)
* Industry talks by Merck Electronics KGaA, BioLogic Ltd., Schrödinger, Inc., Oxford Instruments Plasma Technologies, Entalpic, Hitachi Energy, Forge Nano, and ATLANT 3D.
WALP 2025 offers an excellent opportunity to network, exchange ideas, and explore collaborative research prospects in an inspiring academic and industrial setting. Delegates have the option to submit abstracts for poster presentations, with select submissions considered for short contributed talks.
Registration is still open. The registration fee for academics is £250 (excluding accommodation) or £350 (including on-site accommodation for the nights of July 21st and 22nd). Registration fees cover all meals and refreshments during the conference and poster session. A limited number of bursaries are available for students.
Please visit the WALP 2025 webpage for detailed information, registration and abstract submission: https://warwick.ac.uk/fac/sci/chemistry/chemevents/walp2025/
We gratefully acknowledge the valuable financial support of WALP 2025 by Merck (EMD) Electronics, Schrödinger, BioLogic, and the Royal Society of Chemistry.
We look forward to welcoming you to WALP 2025.
Best wishes,
Bora Karasulu
Also on behalf of Dr. Adrie Mackus and Prof. Erwin Kessels
WALP 2025 is jointly organised by Warwick Chemistry and the Eindhoven University of Technology (TU/e, Netherlands).
SEMANTiCS 2025 EU
21st International Conference on Semantic Systems
Vienna, Austria
September 3 - 5, 2025
Follow us on *Twitter/X* <https://x.com/SemanticsConf>, *LinkedIn*
<https://www.linkedin.com/groups/7496190/?highlightedUpdateUrn=urn%3Ali%3Agr…>,
and *Bluesky* <https://bsky.app/profile/semantics-conf.bsky.social>.
Call for Posters & Demos
The Posters & Demos Track provides a platform for researchers to showcase
their latest findings, ongoing projects, and cutting-edge work in progress.
These include submissions on innovative applications, latest results,
unpublished ideas, prototypes of semantic technologies and their use in
various domains as well as applications, use cases, or pieces of code that
may attract developers and potential research or business partners. This
also concerns new datasets made publicly available.
The Posters & Demos Track offers an informal setting that promotes
engagement and dialogue between presenters and attendees. These discussions
can provide valuable feedback for presenters' future work while allowing
participants to gain insight into emerging research trends and network with
other researchers.
*The submission deadlines for the Posters & Demos Track have been extended
as follows:*
- *Paper Submission Deadline: July 4, 2025*
- *Notification of Acceptance: July 21, 2025*
- *Camera-Ready Paper Deadline: July 28, 2025*
*All deadlines are set for 11:59 pm, Anywhere On Earth time (UTC-12)*
*Submission via EasyChair:*
https://easychair.org/conferences/?conf=semantics2025
Proceedings of SEMANTiCS 2025 EU will be made available open access by
*CEUR-WS.org*.
Topics of Interest
We welcome contributions in the context of semantic-based research and
systems, which address – but are not limited to – the topics of the
Research Track <https://2025-eu.semantics.cc/page/cfp_rev_rep>.
Additionally, we encourage submissions of visionary ideas, position
statements, negative results, and unconventional ideas. Demos should
showcase innovative implementations and technologies from both academia
and industry. We also very much encourage submissions from industry, but
they should be focused on presenting a novel solution to a specific problem
and not be in the nature of an advertisement or commercial product
description.
Author Guidelines and Submission
Poster and demo submissions should consist of a paper that describes the
work, its contribution to the field or innovative aspects.
- Poster and demo submissions are at most 5 pages long, including
references.
- No double-blind submissions required.
- Submissions must be either in PDF or HTML.
- Submissions must be formatted in the style of CEUR-ART
(https://ceur-ws.org/HOWTOSUBMIT.html). An Overleaf page for LaTeX users
is available.
- For demos, we ask authors to include links enabling the reviewers to
test the application or review the component. The absence of a pointer
affects the overall rating of the contribution.
- Submissions must be original and must not have been submitted for
publication elsewhere.
- At least one author of each accepted paper must register for the
conference and present the paper.
Posters and Demos Track Chairs
Ivan Heibi
Diego Collarana
Kind Regards,
On behalf of the organising committee.
=========================
Dr. Kossi Amouzouvi
ScaDS.AI Dresden/Leipzig, TU Dresden
We are excited to announce the 2nd edition of the Open Language Data Initiative shared task at WMT25, co-located with EMNLP 2025.
**TASK DESCRIPTION**
The primary goal of this shared task is to expand OLDI’s open datasets to more languages. We are soliciting contributions to the following:
- The MT evaluation dataset FLORES+
- The MT Seed dataset
- Other high-quality, massively-parallel and open-source datasets
Contributions may consist of either the addition of entirely new languages, varieties or dialects to the above datasets, or substantial improvements to existing datasets. To describe and publicise their contributions, task participants will be asked to submit a 4-6 page paper to be presented at the WMT 2025 conference.
**IMPORTANT DATES**
All dates follow WMT/EMNLP.
- Paper and data submission deadline: 14 August
- Notification of acceptance: 13 September
**MORE INFORMATION**
- Shared task website: https://www2.statmt.org/wmt25/open-data.html
- OLDI website: https://oldi.org/
Dear colleagues,
We are pleased to announce the first call for papers of the
*1st Workshop on Multilingual Data Quality Signals at COLM 2025*
Important information:
🗓️ CfP Deadline: June 23, Workshop: October 10
📍 Montréal, Canada
🌐 https://wmdqs.org
Scope
Recent research has shown that large language models (LLMs) not only need large quantities of data, but also need data of sufficient quality. Ensuring data quality is even more important in a multilingual setting, where the amount of acceptable training data in many languages is limited. Indeed, for many languages even the fundamental step of language identification remains a challenge, leading to unreliable language labels and thus noisy datasets for underserved languages.
In response to these challenges, we will be holding the first Workshop on Multilingual Data Quality Signals (WMDQS) in tandem with COLM. We invite the submission of long and short research papers related to data quality in multilingual data.
Even though most previous work on data quality has been targeted at LLM development, we believe that research in this area can also benefit other research communities in areas such as web search, web archiving, corpus linguistics, digital humanities, political sciences and beyond. We therefore encourage submissions from a wide range of disciplines.
WMDQS will also include a shared task on language identification for web text. We invite participants to submit novel systems which address current problems with language identification for web text. We will provide a training set of annotated documents sourced from Common Crawl to aid development.
Topics
We welcome submissions of (1) original research papers, (2) review/opinion papers, (3) online systems on the topics listed below, and (4) extended abstracts. We especially welcome work-in-progress projects and all novel ideas covering research in multilinguality, underserved/low-resource languages, under-represented linguistic communities and all types of work covering data quality signals. Suggested areas include:
- Data pipelines for data annotation and data filtering
- Undesirable content detection in a multilingual setting
- Multilingual or language independent content ranking
- Human annotation platforms and systems
- Multilingual tokenization mechanisms
- Small language models and embeddings
- Linguistic studies in underserved languages
- Corpus creation and curation methods, especially for underserved languages
- Machine translation
- Digital humanities
- Historical and constructed languages
Shared task
The lack of training data—especially high-quality data—is the root cause of poor language model performance for many languages. One obstacle to improving the quantity and quality of available text data is language identification (LangID or LID). LangID remains far from solved for many languages. Several of the commonly used LangID models were introduced in 2017 (e.g. fastText and CLD3). The aim of this shared task is to encourage innovation in open-source language identification and improve accuracy on a broad range of languages.
All accepted authors will be invited to contribute a larger paper, which will be submitted to a high-impact NLP venue.
Important dates for the Workshop:
Workshop paper submission deadline: June 23, 2025
Workshop paper acceptance notification: July 24, 2025
Workshop: October 10, 2025
Important dates for the Shared Task:
1st Deadline to contribute annotations: July 7, 2025
1st Annotations released (train split): July 14, 2025
Abstract Deadline: July 21, 2025
Decision Notification: July 24, 2025
Camera Ready Deadline: September 21, 2025
(All deadlines are 23:59 AoE.)
Organizers:
For any questions, please email wmdqs-pcs(a)googlegroups.com
Program Chairs:
Pedro Ortiz Suarez (Common Crawl Foundation)
Sarah Luger (MLCommons)
Laurie Burchell (Common Crawl Foundation)
Kenton Murray (Johns Hopkins University)
Catherine Arnett (EleutherAI)
Organizing Committee:
Thom Vaughan (Common Crawl Foundation)
Sara Hincapié (Factored)
Rafael Mosquera (MLCommons)
KlarText Workshop on German Text Simplification & Readability Assessment
Co-located with KONVENS 2025 | Hildesheim, Germany | 10 September 2025
Website: https://klar-text.github.io/
============================================================
Please be reminded that the KlarText workshop paper submission deadline is in three weeks. The event aims to unite researchers, practitioners, and industry experts to discuss state-of-the-art methods in German text simplification and readability assessment. Our focus is to raise awareness about the diverse simplification goals and language forms in German, while attracting researchers who are addressing the challenges associated with this field.
Topics of interest include (but are not limited to):
- German Text Simplification
- Readability Assessment
- Resources & Approaches for Leichte Sprache
- The Role of Large Language Models (LLMs)
- Resources & Benchmarks
- Evaluation & Human-Centered Assessment
- Applications & Real-World Impact
- Cross-Linguistic & Multilingual Perspectives
Important Dates
- Submission deadline: June 30, 2025
- Notification of acceptance: August 1, 2025
- Camera-ready version due: August 15, 2025
- Workshop date: September 10, 2025
Submissions are managed via OpenReview (https://openreview.net/group?id=GSCL.org/KONVENS/2025/Workshop/KlarText).
Organizing Committee
- Salar Mohtaj, DFKI
- Stefan Hillmann, Technische Universität Berlin
- Sebastian Möller, Technische Universität Berlin
- Georg Groh, Technische Universität München
- Hadi Asghari, Technische Universität Berlin
- Miriam Anschütz, Technische Universität München
Contact
For questions or inquiries, please contact:
Salar Mohtaj – salar.mohtaj(a)dfki.de
*** First Call for Workshop & Tutorial Proposals
The 31st Annual ACM Conference on Intelligent User Interfaces (IUI 2026)
March 23-26, 2026, 5* Coral Beach Hotel & Resort, Paphos, Cyprus
https://iui.hosting.acm.org/2026/
We are pleased to invite proposals for workshops and tutorials to be held in conjunction with
the 31st International ACM Conference on Intelligent User Interfaces (ACM IUI 2026), Paphos,
Cyprus.
Workshops aim to provide a venue for presenting research on emerging or specialized topics
of interest and to offer an informal forum for discussing research questions and challenges.
Potential workshop topics should be related to the general theme of the conference
(“Where HCI meets AI”).
Tutorials aim to provide fundamental knowledge and experience on topics related to intelligent
user interfaces and the intersection between Human-Computer Interaction (HCI) and Artificial
Intelligence (AI).
We welcome proposals for a wide range of *full-day* or *half-day* workshops and tutorial
formats and activities, including but not limited to:
• Mini Conferences: Workshops that focus on a specific topic and may have their own paper
submission and review processes.
• Interactive Formats: Workshops that encourage active participation and hands-on
experiences through break-out sessions or group work to explore specific topics. They may
have their own paper submission and review process or target a report summarizing the
discussions and outcomes.
• Emerging Work Sessions: Workshops that foster discussion around emerging ideas.
Organizers may raise specific topics and invite position papers, late-breaking results, or
extended abstracts.
• Project-Centric Formats: Workshops tied closely to a specific existing large-scale funded
project (e.g., NSF, EU) with the goal of engaging a broader community.
• Interactive Competitions: Formats that invite individuals and teams to participate in
challenges or hackathons on selected topics relevant to IUI.
• Tutorials: Sessions that provide structured instruction on topics aligned with the conference
theme, such as HCI methods, AI techniques, methodological frameworks, or tools for building
intelligent user interfaces.
Review and Oversight by Workshop and Tutorial Chairs
Proposals will be reviewed and evaluated by the Workshop and Tutorial Chairs. It is possible
that workshops may be cancelled, shortened, merged, or restructured if there are insufficient
submissions.
Workshop and Tutorial summaries will be included in the ACM Digital Library for ACM IUI 2026.
We will also publish joint workshop proceedings for accepted workshop submissions (through
CEUR or a similar venue).
Responsibilities of Workshop and Tutorial Organizers
• Coordinate the Call for Papers, including solicitation, submission handling, and peer review
process.
• Create and maintain a dedicated website with Workshop or Tutorial information. The IUI
2026 website will link to this page.
• Prepare and communicate the Call for Participation, targeting both IUI and broader relevant
communities (e.g., via mailing lists, social media, newsgroups, or offline events).
• Facilitate the planned activities, including paper presentations, discussions, and/or
interactive elements.
• Submit a workshop or tutorial summary for inclusion in the ACM Digital Library.
• Collect camera-ready papers and author agreements from workshop participants for the joint
workshop proceedings (CEUR or similar).
Note that for the joint proceedings (CEUR or similar), submissions should be peer-reviewed
and will need to meet publishers’ guidelines. CEUR, for example, requires a 5-page minimum
per contribution. Note that not all workshop and tutorial formats listed above may meet these
requirements, and we may not be able to include them.
IUI 2026 is an in-person event, and we expect workshop organizers to attend, allowing the
workshop to be conducted on-site. One author per paper is expected to attend in person to
present the work.
Proposal Format
Workshop or tutorial proposals should be a maximum of four pages long (single-column
format). Prepare your submission using the latest templates: Word Submission Template
(https://authors.acm.org/binaries/content/assets/publications/taps/acm_submi…),
or the LaTeX Template
(https://authors.acm.org/proceedings/production-information/preparing-your-a…).
For LaTeX, please use “\documentclass[manuscript,review]{acmart}”.
The proposals should be organized as follows:
• Name and title: A one-word acronym and a full title. Please indicate “(Workshop)” or
“(Tutorial)” after the title, as appropriate.
• Abstract: A brief summary of the workshop or tutorial.
• Description of workshop or tutorial topic: Should discuss the relevance of the proposed
topic to IUI and its interest for the IUI 2026 audience. Include a concise discussion of why this
workshop or tutorial is particularly relevant for the intended audience and how it will
complement and enhance topics covered at the main conference.
• Previous history: List of previous workshops or tutorials on this topic, including the
conferences that hosted them and the number of participants. If available, report on past
editions of the workshop (including URLs), along with a brief statement of the workshop series
(e.g., covering topics, number of paper submissions, and participants), as well as post-
workshop publications over the years and acceptance statistics. If this is the first edition of the
workshop, describe how it differs from others on similar topics (e.g., by including conference
names and years).
• Organizer(s): Names, affiliations, emails, and web pages of the organizer(s). Provide a brief
description of the background of the organizer(s). Strong proposals normally include organizers
who bring differing perspectives on the topic and are actively connected to the communities of
potential participants. Please indicate the primary contact person and the organizers who will
attend the workshop. Also, please provide a list of other workshops or tutorials organized by
workshop organizers in the past.
• Workshop program committee: Names and affiliations of the members of the (tentative)
workshop program committee that will evaluate the workshop submissions.
• Participants: Include a statement of how many participants you expect and how you plan to
invite participants for the workshop or tutorial. If possible, include the names of at least 10
people who have expressed interest in participating in the workshop or tutorial.
• Workshop or Tutorial activities: A brief description of the format regarding the mix of
events or activities, such as paper presentations, invited talks, panels, demonstrations,
teaching activities, hands-on practical exercises, and general discussion. Please also list here
any materials you will make available to tutorial participants, such as slides, access to hardware
or software, and handouts.
• Planned outcomes of the workshop or tutorial: What are you hoping to achieve by the end
of the workshop or tutorial? Please list here any planned publications or other outcomes
expected.
• Length: Full-day or half-day.
Submission Platform
• All materials must be submitted electronically to PCS 2.0
http://new.precisionconference.com/~sigchi by the proposal submission deadline.
• In PCS 2.0, first click "Submissions" at the top of the page. Then, from the dropdown menus
for society, conference, and track, select "SIGCHI", "IUI 2026", and then "IUI 2026 Workshops"
or "IUI 2026 Tutorials", respectively, and press "Go".
We encourage both researchers and industry practitioners to submit workshop proposals. To
support diverse perspectives in the workshops, we strongly recommend including organizers
from varied institutions and backgrounds.
Furthermore, we welcome workshops with an innovative structure that can attract diverse types
of contributions and foster valuable interactions.
Prospective organizers are encouraged to contact the Workshop and Tutorial Chairs in advance
(workshops2026(a)iui.acm.org) to discuss ideas, receive feedback, or seek assistance in
preparing engaging proposals. Especially for workshop proposals featuring innovative
interactive formats, we are happy to help further develop and implement the ideas.
Important Dates (AoE)
• Workshop Proposals: August 22, 2025
• Decision notification: September 19, 2025
• Tutorial Proposals: October 17, 2025
• Tutorial Decision Notification: November 21, 2025
• Camera-ready Summaries: February 6, 2026
Workshop and Tutorial Chairs
Karthik Dinakar, Pienso, USA
Werner Geyer, IBM Research, USA
Patricia Kahr, University of Zurich, Switzerland
Antonela Tommasel, CONICET, Argentina
The Data Mining and Machine Learning research group at the University of Vienna is seeking graduates or advanced MSc students in Computer Science, Computational Linguistics, Statistics, or related fields who are interested in pursuing a PhD in Explainability for Machine Learning and Natural Language Processing. The successful candidate will join the group as a pre-doctoral researcher. The position is funded for three years and will be supervised by Prof. Benjamin Roth.
Application deadline: 24 June 2025
Research topics may include:
Personalized explanations of large language models
Explanations for complex AI agents
Training data-based explanations
Usability aspects of explanations
Evaluation methods for explainable AI
More information: https://jobs.univie.ac.at/job/University-assistant-predoctoral/1212525201/
--
Univ.-Prof. Dr. Benjamin Roth
Digitale Textwissenschaften
Universität Wien
Kolingasse 14
Raum 5.17
1090 Wien
email: benjamin.roth(a)univie.ac.at
tel: +43 14277 79513
virtual coffee (Tuesday 2pm CEST): https://www.benjaminroth.net/virtual_coffee
web: https://dm.cs.univie.ac.at/team/person/112089/