Ethical LLMs 2025: The first Workshop on Ethical Concerns in Training, Evaluating and Deploying Large Language Models<https://sites.google.com/view/ethical-llms-2025> @ RANLP2025<https://ranlp.org/ranlp2025/>
Final Call for papers:
Scope
Large Language Models (LLMs) represent a transformative leap in Artificial Intelligence (AI), delivering remarkable language-processing capabilities that are reshaping how we interact with technology in our daily lives. With their ability to perform tasks such as summarisation, translation, classification, and text generation, LLMs have demonstrated unparalleled versatility and power. Drawing from vast and diverse knowledge bases, these models hold the potential to revolutionise a wide range of fields, including education, media, law, psychology, and beyond. From assisting educators in creating personalised learning experiences to enabling legal professionals to draft documents or supporting mental health practitioners with preliminary assessments, the applications of LLMs are both expansive and profound.
However, alongside their impressive strengths, LLMs also face significant limitations that raise critical ethical questions. Unlike humans, these models lack essential qualities such as emotional intelligence, contextual empathy, and nuanced ethical reasoning. While they can generate coherent and contextually relevant responses, they do not possess the ability to fully understand the emotional or moral implications of their outputs. This gap becomes particularly concerning when LLMs are deployed in sensitive domains where human values, cultural nuances, and ethical considerations are paramount. For example, biases embedded in training data can lead to unfair or discriminatory outcomes, while the absence of ethical reasoning may result in outputs that inadvertently harm individuals or communities. These limitations highlight the urgent need for robust research in Natural Language Processing (NLP) to address the ethical dimensions of LLMs. Advancements in NLP research are crucial for developing methods to detect and mitigate biases, enhance transparency in model decision-making, and incorporate ethical frameworks that align with human values. By prioritising ethics in NLP research, we can better understand the societal implications of LLMs and ensure their development and deployment are guided by principles of fairness, accountability, and respect for human dignity. This workshop will dive into these pressing issues, fostering a collaborative effort to shape the future of LLMs as tools that not only excel in technical performance but also uphold the highest ethical standards.
Key Dates
Submissions Open - 1st June 2025
Paper Submission Deadline - 28th July 2025
Acceptance Notification - 10th August 2025
Camera-Ready Deadline - 20th August 2025
Submission Guidelines
We follow the RANLP 2025 standards for submission format and guidelines. EthicalLLMs 2025 invites the submission of long papers, up to eight pages in length, and short papers, up to six pages in length. These page limits only apply to the main body of the paper. At the end of the paper (after the conclusions but before the references) papers need to include a mandatory section discussing the limitations of the work and, optionally, a section discussing ethical considerations. Papers can include unlimited pages of references and an unlimited appendix.
To prepare your submission, please make sure to use the RANLP 2025 style files available here:
* Latex<https://ranlp.org/ranlp2025/wp-content/uploads/2025/05/ranlp2025-LaTeX.zip>
* Word<https://ranlp.org/ranlp2025/wp-content/uploads/2025/05/ranlp2025-word.docx>
Papers should be submitted through Softconf/START using the following link: https://softconf.com/ranlp25/EthicalLLMs2025/
Topics of interest
The workshop invites submissions on a broad range of topics related to the ethical development and evaluation of LLMs, including but not limited to the following.
1. Bias Detection and Mitigation in LLMs
Research focused on identifying, measuring, and reducing social, cultural, and algorithmic biases in large language models.
2. Ethical Frameworks for LLM Deployment
Approaches to integrating ethical principles—such as fairness, accountability, and transparency—into the development and use of LLMs.
3. LLMs in Sensitive Domains: Risks and Safeguards
Case studies or methodologies for deploying LLMs in high-stakes fields such as healthcare, law, and education, with an emphasis on ethical implications.
4. Explainability and Transparency in LLM Decision-Making
Techniques and tools for improving the interpretability of LLM outputs and understanding model reasoning.
5. Cultural and Contextual Understanding in NLP Systems
Strategies for enhancing LLMs’ sensitivity to cultural, linguistic, and social nuances in global and multilingual contexts.
6. Human-in-the-Loop Approaches for Ethical Oversight
Collaborative models that involve human expertise in guiding, correcting, or auditing LLM behaviour to ensure responsible use.
7. Mental Health and Emotional AI: Limits of LLM Empathy
Discussions on the role of LLMs in mental health support, highlighting the boundary between assistive technology and the need for human empathy.
Organisers
Damith Premasiri – Lancaster University, UK
Tharindu Ranasinghe – Lancaster University, UK
Hansi Hettiarachchi – Lancaster University, UK
Contact
If you have any questions regarding the workshop, please contact Damith: d.dolamullage(a)lancaster.ac.uk
Dear colleagues,
We are pleased to announce that the submission deadline for our upcoming workshop:
The First Workshop on Natural Language Processing and Language Models for Digital Humanities (LM4DH 2025)
(co-located with RANLP 2025) has been extended to 27 July 2025!
This interdisciplinary workshop invites contributions at the intersection of computational methods and the humanities, including work on:
* Text analysis and genre detection
* Interpretability of LLM outputs
* Historical and low-resource language processing
* Dataset creation and curation
* Emotion analysis, authorship attribution, and more
We welcome standard (up to 8 pages) and short papers (4–6 pages). Submissions must follow the RANLP 2025 ACL-style template.
* New Submission Deadline: 27 July 2025
* Notification of Acceptance: 2 August 2025
* Camera-Ready Deadline: 20 August 2025
* Workshop Date: 11-13 September 2025 (To be confirmed)
Don’t miss the chance to be part of this exciting event that brings together researchers across linguistics, cultural heritage, NLP, history, and more. Submit your paper and join the conversation on the future of AI in the Digital Humanities!
For more information, visit https://www.clarin.eu/content/call-papers-first-workshop-natural-language-p…
Best regards,
CLARIN ERIC
[Apologies for multiple postings]
We are happy to announce that two new speech databases are available in our catalogue.
Chinese Kids Speech database (Lower Grade) <https://catalog.elra.info/en-us/repository/browse/ELRA-S0496/>
ISLRN: 369-011-475-593-5 <http://www.islrn.org/resources/369-011-475-593-5>
The Chinese Kids Speech database (Lower Grade) contains recordings of 184 Chinese child speakers (98 male and 86 female), aged 6 to 10, recorded in quiet rooms using smartphones. 1,426 sentences were used. The audio data are stored in .wav files as 16 kHz, mono, 16-bit linear PCM.
Chinese Kids Speech database (Upper Grade) <https://catalog.elra.info/en-us/repository/browse/ELRA-S0497/>
ISLRN: 993-024-988-227-0 <http://www.islrn.org/resources/993-024-988-227-0>
The Chinese Kids Speech database (Upper Grade) contains recordings of 161 Chinese child speakers (71 male and 90 female), aged 10 to 12, recorded in quiet rooms using smartphones. 1,859 sentences were used. The audio data are stored in .wav files as 16 kHz, mono, 16-bit linear PCM.
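For users of these databases, the announced format (16 kHz, mono, 16-bit linear PCM in a .wav container) can be verified with Python's standard `wave` module. This is a minimal sketch; the file name is hypothetical, and the example simply writes one second of silence in that format and checks it back.

```python
import wave

def check_elra_format(path):
    """Return True if the file matches the announced format:
    16 kHz sample rate, mono, 16-bit linear PCM."""
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == 16000
                and wav.getnchannels() == 1
                and wav.getsampwidth() == 2)  # 2 bytes per sample = 16 bits

# Write a short silent file in that format, then verify it.
with wave.open("sample.wav", "wb") as out:
    out.setnchannels(1)       # mono
    out.setsampwidth(2)       # 16-bit samples
    out.setframerate(16000)   # 16 kHz
    out.writeframes(b"\x00\x00" * 16000)  # one second of silence

print(check_elra_format("sample.wav"))  # True
```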
For more information on the catalogue, or if you would like to enquire about having your resources distributed by ELRA, please contact us <mailto:contact@elda.org>.
_________________________________________
Visit the ELRA Catalogue of Language Resources <http://catalog.elra.info>
Archives of ELRA Language Resources Catalogue Updates <https://www.elra.info/catalogues/language-resources-announcements/>
Please note that you receive this email because you are or have been a customer or a provider of ELRA Language Resources.
The ELRA Privacy Policy is available here <https://www.elra.info/elra-privacy-policy/>.
If you do not want to receive such e-mails in the future, contact us <mailto:privacy@elda.org>.
Dear colleagues,
The July edition of the CLARIN Newsflash is out!
This summer, CLARIN is turning up the heat at major conferences and summer schools — from the Corpus Linguistics 2025 conference in Birmingham, to DH2025 in Lisbon, and the ESU summer school in Besançon, with ACL and Interspeech just around the corner. We’re proud to showcase our most popular and newly developed tools and resources, strengthening visibility, fostering meaningful connections, and collaborating with local CLARIN communities and researchers worldwide.
It’s all in this month’s newsflash — take a look!
https://www.clarin.eu/content/clarin-newsflash-july-2025
Wishing you a relaxing summer break — we’ll be back in September with a CLARIN2025 special edition!
CLARIN ERIC
First CFP: CHOMPS – Confabulation, Hallucinations, & Overgeneration in Multilingual & Precision-critical Settings
(with our apologies for cross-posting)
Venue: IJCNLP-AACL 2025 (https://2025.aaclnet.org/), Mumbai, India
Date: 23-24 December 2025 (TBC)
Workshop website: https://chomps2025.github.io/
* Description *
Despite rapid advances, LLMs continue to "make things up": a phenomenon that manifests as hallucination, confabulation, and overgeneration, i.e., the production of unsupported and unverifiable text that sounds deceptively plausible. These outputs pose real risks in settings where accuracy and accountability are non-negotiable, including healthcare, legal systems, and education. The aim of the CHOMPS workshop is to find ways to mitigate this tendency to hallucinate, one of the major hurdles that currently prevent the adoption of Large Language Models in real-world scenarios.
The workshop will explore hallucination mitigation in practical situations where such mitigation is crucial: in particular, precision-critical applications (such as those in the medical, legal, and biotech domains), as well as multilingual settings (given the lack of resources for reproducing in other linguistic contexts what can be done for English). We invite work on the following (non-exclusive) list of topics:
* Workshop topics *
- Metrics, benchmarks and tools for hallucination detection
- Factuality challenges in mission-critical & domain-specific settings (e.g., medical, legal, biotech) and their consequences
- Mitigation strategies during inference or model training
- Studies of hallucinatory and confabulatory behaviors of LLMs in cross-lingual and multilingual scenarios
- Confabulations in language & multimodal (vision, text, speech) models
- Perspectives and case studies from other disciplines
- …
* Invited speakers *
- Anna ROGERS, IT University of Copenhagen
- Danish PRUTHI, IISc Bangalore
- Abhilasha RAVICHANDER, University of Washington
* Submission details *
The workshop is designed with a widely inclusive submission policy so as to foster as vibrant a discussion as possible.
Archival or non-archival submissions may consist of up to 8 pages (long) or 4 pages (short) of content. Dissemination submissions may consist of up to 1 page of content. Upon acceptance, authors may add one additional page to accommodate changes suggested by the reviewers.
Please use the ACL style templates available here: https://github.com/acl-org/acl-style-files
Submissions must be made in PDF format, either (a) via direct submission (https://openreview.net/group?id=aclweb.org/AACL-IJCNLP/2025/Workshop/CHOMPS) or (b) via ARR commitment (https://openreview.net/group?id=aclweb.org/AACL-IJCNLP/2025/Workshop/CHOMPS…)
* Important dates *
Paper submission deadline: September 29, 2025
Direct ARR commitment: October 27, 2025
Author notification: November 3, 2025
Camera-Ready due: November 11, 2025
Workshop date: December 23-24, 2025 (TBC)
* Contact *
For questions, please send an email to chomps-aacl2025(a)googlegroups.com or contact one of the workshop chairs:
- Aman Sinha, Université de Lorraine, aman.sinha(a)univ-lorraine.fr
- Raúl Vázquez, University of Helsinki, raul.vazquez(a)helsinki.fi
- Timothee Mickus, University of Helsinki, timothee.mickus(a)helsinki.fi
Call for Papers: CASE 2025 @ RANLP (8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts)
Dear Colleagues,
We are pleased to announce the 8th edition of the Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text, held in conjunction with RANLP 2025 (https://ranlp.org/ranlp2025/)!
CASE is a leading venue for research, resources, and practical advances in automated event extraction and analysis, focusing on social and political event data. It has been organized consistently at top venues such as ACL, EMNLP, and EACL.
We invite submissions of research papers, resource papers, and position papers addressing (but not limited to) the following topics:
• Event extraction at the sentence, document, or cross-document level, including event coreference.
• Creation and annotation of datasets for event extraction.
• Modeling event-event relations such as subevents, causal, temporal, and spatial links.
• Evaluation of event datasets: reliability, validity, and coverage.
• Event schemas and ontologies: population, definition, and enrichment.
• Tools, pipelines, and infrastructure for event annotation and analysis.
• Linguistic aspects of event representation: lexical, syntactic, semantic, discursive, and pragmatic.
• Applications of event data in conflict prediction, early warning, and policy support.
• Detection of new event types, including protests, public health crises, and cyber activism.
• Bias, fairness, and misinformation in event extraction systems and datasets.
• Legal, ethical, and privacy considerations in dataset creation and dissemination.
• Cross-lingual, multilingual, and multimodal event extraction.
• Use of LLMs and generative AI for event extraction, analysis, and dataset generation.
• Release of new benchmarks, datasets, or annotation resources.
All accepted papers will be published in the ACL Anthology.
Website: https://emw.ku.edu.tr/case-2025/ (currently being updated; please contact ahurriyetoglu(a)ku.edu.tr with any questions)
Link for submission: https://softconf.com/ranlp25/CASE2025/user/
Important dates:
Submission Deadline: 25 July 2025
Notification: 17 August 2025
Camera-ready deadline: 30 August 2025
Workshop date: 11-13 September 2025
Shared task
Multimodal detection of hate speech, humor, and stance in LGBTQ+ socio-political discourse
To know more and participate, please visit: https://github.com/therealthapa/case2025-multimodal/blob/main/README.md
All shared task papers will also be published in the ACL anthology.
Organizers: Surendrabikram Thapa, Siddhant Bikram Shah, Shuvam Shiwakoti, Kritesh Rauniyar, Surabhi Adhikari, Kristy Johnson, Ali Hürriyetoğlu, Hristo Tanev, Usman Naseem
Organizing committee:
Ali Hürriyetoğlu
Hristo Tanev
Surendrabikram Thapa
Vanni Zavarella
Erdem Yörük
Hi, good morning
This is to share with you that our research group has made publicly available the beta version of Evaristo.ai <https://evaristo.ai/>, a chatbot for the Portuguese language based on open LLMs.
One such LLM is Gervásio 8B <https://huggingface.co/PORTULAN/gervasio-8b-portuguese-ptpt-decoder>, which we developed for Portuguese and are also releasing now; it is the LLM active by default when you arrive at the chatbot.
Though you may not be proficient in Portuguese, on-the-fly translators will likely help you follow its content and basic functioning. We invite you, your colleagues, and your students to visit it and try out this first test version. We welcome any help you can give us in testing it, and any feedback and suggestions you can share with us.
It has a unique set of features, among others: it is an open AI chatbot for the Portuguese language; it is multi-model and multi-heteronym, as well as agentic, multi-tool, and multi-modal; and it does not track its users or pass their content on to third parties, safeguarding user privacy and ownership of their content.
You'll find a presentation of its motivation in this press release <https://evaristo.ai/assets/pressRelease_EvaristoAI.pdf> (in English), which is complemented by a more complete description in the About section <https://evaristo.ai/about> (in Portuguese).
The current open LLMs available are typically between 10 and 100 times smaller than
the top-of-the-range closed LLMs used in commercial chatbots, so the costs associated
with training and operating them are much lower. The performance of open LLMs, however,
is much more satisfactory than this linear disproportion would suggest. They therefore have
an excellent ratio of performance quality versus cost, and are a viable option for
fully autonomous generative AI services focused on concrete use cases.
In this context, we see this chatbot as a milestone in the democratization of generative technology for the Portuguese language through open LLMs, encouraging more and more organizations to move forward with their own AI services, running on their own computers and focused on their concrete use cases.
Have a nice day,
António
Dear colleagues,
We are pleased to announce the last call for participation in the 1st Shared Task on Language Identification for Web Data at WMDQS/COLM 2025.
Important information:
🗓️ Registration Deadline: July 23 (AoE)
📍 Montréal, Canada
🌐 https://wmdqs.org/shared-task/
Registration:
To register, please submit a one-page document with a title, a list of authors, a list of provisional languages that you want to focus on, and a brief description of your approach. This document should be sent to wmdqs-pcs(a)googlegroups.com. You can change the list of languages or the system description during the shared task. This document's only purpose is to register your participation in the shared task. The shared task will run until the last week of September.
Motivation:
The lack of training data—especially high-quality data—is the root cause of poor language model performance for many languages. One obstacle to improving the quantity and quality of available text data is language identification (LangID or LID). LangID remains far from solved for many languages. Several of the commonly used LangID models were introduced in 2017 (e.g. fastText and CLD3). The aim of this shared task is to encourage innovation in open-source language identification and improve accuracy on a broad range of languages.
All participants will be invited to contribute a larger paper, which will be submitted to a high-impact NLP venue.
Description:
The main shared task is to submit LangID models that work well on a wide variety of languages on web data. We encourage participants to employ a range of approaches, including the development of new architectures and the curation of novel high-quality annotated datasets.
We recommend using the GlotLID corpus as a starting point for training data. Access to the data will be managed through the Hugging Face repository. Please note that this data should not be redistributed. We will use the same language label format as those used by GlotLID: an ISO 639-3 language code plus an ISO 15924 script code, separated by an underscore.
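As a concrete illustration of the label format described above, the following sketch validates and splits GlotLID-style labels. It checks only the shape of a label (three lowercase letters, an underscore, then a four-letter capitalized script code such as `Latn`), not whether the codes are actually registered in the ISO 639-3 or ISO 15924 registries.

```python
import re

# GlotLID-style labels: ISO 639-3 language code (three lowercase letters)
# + "_" + ISO 15924 script code (one uppercase letter followed by three
# lowercase letters), e.g. "eng_Latn" or "zho_Hans".
LABEL_RE = re.compile(r"[a-z]{3}_[A-Z][a-z]{3}")

def is_valid_label(label: str) -> bool:
    """Check the shape of a label (not registry membership)."""
    return LABEL_RE.fullmatch(label) is not None

def split_label(label: str) -> tuple:
    """Split a label into its (language, script) components."""
    if not is_valid_label(label):
        raise ValueError(f"not a valid language_Script label: {label!r}")
    lang, script = label.split("_")
    return lang, script

print(split_label("eng_Latn"))    # ('eng', 'Latn')
print(is_valid_label("en_latn"))  # False: not ISO 639-3 + ISO 15924 shape
```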
Although all systems will be evaluated on the full range of languages in our test set, we encourage submissions that focus on a particular language or set of languages, especially if those language(s) present particular challenges for language identification.
The shared task will take place in rounds. The first round will only include data from already existing datasets; subsequent rounds will include data annotated by the community as it is collected and processed. More languages will also be added in subsequent rounds.
Organizers:
For any questions, please send an email to wmdqs-pcs(a)googlegroups.com
Program Chairs:
Pedro Ortiz Suarez (Common Crawl Foundation)
Sarah Luger (MLCommons)
Laurie Burchell (Common Crawl Foundation)
Kenton Murray (Johns Hopkins University)
Catherine Arnett (EleutherAI)
Organizing Committee:
Thom Vaughan (Common Crawl Foundation)
Sara Hincapié (Factored)
Rafael Mosquera (MLCommons)
Dear colleagues,
My name is Alessandra Teresa Cignarella; I'm a postdoctoral researcher in the Language and Translation Technology Team (LT3) at Ghent University in Belgium. My research project is called RAINBOW, and I'm currently studying stereotypes about LGBTQIA+ people, particularly on social media, in online discourse, and in AI systems.
We have developed a brief questionnaire to gather diverse perspectives from those who experience or recognize these stereotypes. Your participation will support the creation of a multilingual dataset (Italian, Dutch, and Farsi) aimed at improving inclusivity and reducing the harm caused by AI technologies toward queer communities. Whether you identify as LGBTQIA+, are an ally, or are interested in this research area, your input is highly valued.
Please find the questionnaire here:
* ITALIAN: https://lnkd.in/dfPuyT6j
* DUTCH: https://lnkd.in/d-3Di7WY
* FARSI: https://lnkd.in/dfvWzWCu
Should you have any questions, please do not hesitate to contact me at: alessandrateresa.cignarella(a)ugent.be
I would greatly appreciate it if you could share this survey with your contacts who speak any of these three languages.
Thank you very much for your support!
Best regards,
Alessandra
Alessandra Teresa Cignarella (she/her)
MSCA postdoctoral fellow
LT3, Language and Translation Technology Team
Department of Translation, Interpreting and Communication
Ghent University