We are pleased to announce MAHED 2025, the first multimodal shared task dedicated to Hope and Hate Detection in Arabic content. This novel multimodal challenge will be co-located with EMNLP 2025 at the ArabicNLP 2025 Conference.
MAHED 2025 addresses critical real-world challenges in Arabic natural language processing by focusing on the detection of hate speech, hope speech, and emotions in both Arabic text and memes. This shared task aims to advance research in ethical AI while addressing the linguistic diversity and dialectal variations inherent in Arabic content.
The shared task comprises three subtasks:
Task 1: Text-based Hope & Hate Speech Classification
Participants will develop models to classify Arabic text as containing hope speech, hate speech, or neutral content.
Task 2: Multitask Learning for Emotion, Offensive Content, and Hate Detection
This task involves simultaneous detection of emotions, offensive language, and hate speech in Arabic text.
Task 3: Multimodal Hateful Meme Detection
Participants will work with Arabic memes to detect hateful content using both textual and visual modalities.
Registration Links:
* Task 1: https://www.codabench.org/competitions/9136/
* Task 2: https://www.codabench.org/competitions/9166/
* Task 3: https://www.codabench.org/competitions/9192/
Important Dates:
* June 10, 2025: Training data and evaluation scripts released
* July 20, 2025: Final registration deadline and test set release
* July 25, 2025: Test submission deadline
* November 5-9, 2025: ArabicNLP 2025 Workshop at EMNLP 2025, Suzhou, China
Resources and Registration:
Website: https://marsadlab.github.io/mahed2025/
Dataset and Code: https://github.com/marsadlab/MAHED2025Dataset
*** Last Call for Papers ***
The 16th IEEE International Conference on Knowledge Graphs (ICKG 2025)
November 13-14, 2025, 5* St. Raphael Resort and Marina, Limassol, Cyprus
https://cyprusconferences.org/ickg2025/
(*** Proceedings to be published by IEEE ***)
(*** Submission Deadline: July 4, 2025 AoE (extended and firm!) ***)
The annual IEEE International Conference on Knowledge Graph (ICKG) provides a premier
international forum for presentation of original research results in knowledge discovery and
graph learning, discussion of opportunities and challenges, as well as exchange and
dissemination of innovative, practical development experiences. The conference covers all
aspects of knowledge discovery from data, with a strong focus on graph learning and
knowledge graph, including algorithms, software, and platforms. ICKG 2025 intends to draw
researchers and application developers from a wide range of areas such as knowledge
engineering, representation learning, big data analytics, statistics, machine learning, pattern
recognition, data mining, knowledge visualization, high performance computing, and the World
Wide Web, by promoting novel, high-quality research findings and innovative solutions to
address challenges in handling all aspects of learning from data with dependency relationships.
All accepted papers will be published in the conference proceedings by the IEEE Computer
Society. Awards, including Best Paper, Best Paper Runner-up, Best Student Paper, and Best Student
Paper Runner-up, will be conferred at the conference, with a check and a certificate for each
award. The conference also features a survey track to accept survey papers reviewing recent
studies in all aspects of knowledge discovery and graph learning. At least five high quality
papers will be invited for a special issue of the Knowledge and Information Systems Journal,
in an expanded and revised form. In addition, at least eight quality papers will be invited for a
special issue of Data Intelligence Journal in an expanded and revised form with at least 30%
difference.
TOPICS OF INTEREST
Topics of interest include, but are not limited to:
• Foundations, algorithms, models, and theory of knowledge discovery and graph learning
• Knowledge engineering with big data.
• Machine learning, data mining, and statistical methods for data science and engineering.
• Acquisition, representation and evolution of fragmented knowledge.
• Fragmented knowledge modeling and online learning.
• Knowledge graphs and knowledge maps.
• Graph learning security, privacy, fairness, and trust.
• Interpretation, rule, and relationship discovery in graph learning.
• Geospatial and temporal knowledge discovery and graph learning.
• Ontologies and reasoning.
• Topology and fusion on fragmented knowledge.
• Visualization, personalization, and recommendation of Knowledge Graph navigation and
interaction.
• Knowledge Graph systems and platforms, and their efficiency, scalability, and privacy.
• Applications and services of knowledge discovery and graph learning in all domains
including web, medicine, education, healthcare, and business.
• Big knowledge systems and applications.
• Crowdsourcing, deep learning and edge computing for graph mining.
• Large language models and applications
• Open source platforms and systems supporting knowledge and graph learning.
• Datasets and benchmarks for graphs
• Neurosymbolic & Hybrid AI systems
• Graph Retrieval Augmented Generation
SURVEY TRACK
Survey papers reviewing recent studies in key aspects of knowledge discovery and graph learning.
In addition to the above topics, authors can also select and target the following Special Track
topics.
Each special track is handled by respective special track chairs, and the papers are also
included in the conference proceedings.
• Special Track 01: KGC and Knowledge Graph Building
• Special Track 02: KR and KG Reasoning.
• Special Track 03: KG and Large Language Model
• Special Track 04: GNN and Graph Learning
• Special Track 05: QA and Graph Database
• Special Track 06: KG and Multi-modal Learning.
• Special Track 07: KG and Knowledge Fusion.
• Special Track 08: Industry and Applications
SUBMISSION GUIDELINES
Paper submissions should be no longer than 8 pages, in the IEEE 2-column format, including
the bibliography and any possible appendices. Submissions longer than 8 pages will be
rejected without review. All submissions will be reviewed by the Program Committee based on
technical quality, originality, significance, and clarity. For survey track papers, please preface the
descriptive paper title with “Survey:”, followed by the actual paper title. For example, a paper
entitled “A Literature Review of Streaming Knowledge Graph”, should be changed to “Survey: A
Literature Review of Streaming Knowledge Graph”. This is for the reviewers and chairs to clearly
bid and handle the papers. Once the paper is accepted, the prefix, such as “Survey:”, can be
removed from the camera-ready copy.
For special track papers, please preface the descriptive paper title with “SS##:”, where “##” is
the two-digit special track ID. For example, a paper entitled “Incremental Knowledge Graph
Learning”, intended to target Special Track 01 (KGC and Knowledge Graph Building) should
be changed to “SS01: Incremental Knowledge Graph Learning”.
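The title-prefix convention above can be sketched as a small helper. This is only an illustration: the function name and its parameters are hypothetical, and refusing to combine the two prefixes is an assumption, since the guidelines do not say how a survey paper aimed at a special track should be titled.

```python
from typing import Optional

NUM_SPECIAL_TRACKS = 8  # Special Tracks 01-08 are listed in the call


def prefixed_title(title: str, survey: bool = False,
                   special_track: Optional[int] = None) -> str:
    """Return the submission title with the track prefix from the guidelines."""
    if survey and special_track is not None:
        # Assumption: the call does not describe combined prefixes, so refuse.
        raise ValueError("choose either the survey track or a special track")
    if survey:
        return f"Survey: {title}"
    if special_track is not None:
        if not 1 <= special_track <= NUM_SPECIAL_TRACKS:
            raise ValueError("special track ID must be between 1 and 8")
        # Two-digit, zero-padded track ID, e.g. 1 -> "SS01:"
        return f"SS{special_track:02d}: {title}"
    return title


print(prefixed_title("Incremental Knowledge Graph Learning", special_track=1))
print(prefixed_title("A Literature Review of Streaming Knowledge Graph", survey=True))
```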
All manuscripts are submitted as full papers and are reviewed based on their scientific merit.
The reviewing process is single blind, meaning that each submission should list all authors and
affiliations. There is no separate abstract submission step. There are no separate industrial,
application, or poster tracks. Manuscripts must be submitted electronically in the online
submission system. No email submission is accepted. To help ensure correct formatting, please
use the style files for U.S. Letter as template for your submission. These include LaTeX and
Word.
SUBMISSION LINK
https://wi-lab.com/cyberchair/2025/ickg25/
IMPORTANT DATES
• Paper submission (abstract and full paper): July 4, 2025 (AoE) (extended and firm!)
• Notification of acceptance/rejection: September 5, 2025
• Camera-ready, copyright forms and author registration: September 20, 2025
• Early (non-author) registration: October 10, 2025
• Conference dates: November 13-14, 2025
ORGANISATION
Conference and Local Organising Chair
• George A. Papadopoulos, University of Cyprus
Conference Co-Chair
• Dan Guo, Hefei University of Technology
Program Chairs
• Cesare Alippi, Università della Svizzera italiana
• Shirui Pan, Griffith University
Local Organising Vice Chair
• Irene Kinlanioti, National Technical University of Athens
Finance Chair
• Constantinos Pattichis, University of Cyprus
Steering Committee Chair
• Xindong Wu, Hefei University of Technology
*** NARNiHS 2026
*** North American Research Network in Historical Sociolinguistics
*** Eighth Annual Meeting
*** 100% IN PERSON
*** Co-Located with the Linguistic Society of America (LSA) Annual Meeting
*** New Orleans, Louisiana USA
*** 8-11 January 2026
This event offers an opportunity for historical sociolinguistics scholars from all over the world to gather and share leading research. We encourage our fellow historical sociolinguists and scholars in related fields from our global scholarly community to **join us in New Orleans** for our Eighth Annual Meeting.
Consult this Call for Abstracts on the web: https://narnihs.org/?page_id=3135 .
--------------- Call for Abstracts ---------------
Abstract submission online:
https://easyabs.linguistlist.org/conference/NARNiHS_26/ .
Deadline: Friday, 15 August 2025, 11:59 PM US Eastern Time.
Late abstracts will not be considered.
The North American Research Network in Historical Sociolinguistics (NARNiHS) is accepting abstracts for its Eighth Annual Meeting in New Orleans, Thursday, January 8 -- Sunday, January 11, 2026. The 8th edition of this inclusive NARNiHS event seeks to provide a collaborative environment where presenters bring fully developed work for presentation and enrichment. We see the NARNiHS Annual Meeting as a place for showcasing excellent projects in historical sociolinguistics, seeking feedback from peers, and engaging in productive development of the field’s enduring questions.
NARNiHS welcomes papers in all areas of historical sociolinguistics, which is understood as the application and/or development of sociolinguistic theories, methods, and models for the study of historical language variation and change over time, or more broadly, the study of the interaction of language and society in historical periods and from historical perspectives. Thus, a wide range of linguistic areas, subdisciplines, methodologies, and adjacent disciplines easily find their place within historical sociolinguistics, and we encourage submission of abstracts that reflect this broad scope.
Abstracts will be accepted for both 20-minute papers and posters. Please note that, at the NARNiHS annual meeting, poster presentations are an integral part of the conference (not second-tier presentations). Abstracts will be assigned a paper or a poster presentation based on determinations in the review process about the most effective format for the submission. However, if you prefer that your submission be considered primarily for poster presentation, please specify this in your abstract.
Successful abstracts will demonstrate *thorough grounding* in historical sociolinguistics, *scientific rigor* in the formulation of research questions, and promise for rich discussion of ideas. Successful abstracts will be explicit about which *theoretical frameworks*, *methodological protocols*, and *analytical strategies* are being applied or critiqued. *Data sources and examples* should be sufficiently presented, so as to allow reviewers a full understanding of the scope and claims of the research. Please note that the *connection of your research to the field of historical sociolinguistics* should be explicitly outlined in your abstract. Failure to adhere to these criteria will likely result in rejection.
*** Abstract Format Guidelines ***
- Abstracts must be submitted in PDF format.
- Abstracts must fit on one 8.5x11 inch page, with margins no smaller than 1 inch and a font style and size no smaller than Times New Roman 12 point. You are encouraged to use the entire page, providing a full and robust description of the research. All additional supporting content (visualizations, trees, tables, figures, captions, examples, and references) must fit on a single (1) additional page. No exceptions to these requirements are allowed; abstracts longer than one page or with more than one additional page of supporting content will be rejected without review.
- Specify if you prefer your submission be considered primarily for a poster presentation.
- Anonymize your abstract. We realize that sometimes complete anonymity is not attainable, but there is a difference between the nature of the research creating an inability to anonymize and careless non-anonymizing (in citations, references, file names, etc.). Be sure to anonymize your PDF file (you may do so in Adobe Acrobat Reader by clicking on "File", then "Properties", removing your name if it appears in the "Author" line of the "Description" tab, and re-saving the file before submission). Do not use your name when saving your PDF (e.g. Smith_Abstract.pdf); file names will not be automatically anonymized by the EasyAbs system. Rather, use non-identifying information in your file name (e.g. HistSoc4Lyfe.pdf). Your name should only appear in the online form accompanying your abstract submission. Papers that are not sufficiently anonymized wherever possible will be rejected without review.
*** General Requirements ***
- Abstracts must be submitted electronically using the following link: https://easyabs.linguistlist.org/conference/NARNiHS_26/ .
- Authors may submit a maximum of two abstracts: One single-author abstract and one co-authored abstract.
- Authors may not submit identical abstracts for presentation at the NARNiHS annual meeting and the LSA annual meeting or another LSA sister society meeting (ADS, ANS, NAHoLS, SCiL, SPCL, or SSILA).
- After submission, no changes of author, title, or wording of the abstract may occur. If your abstract is accepted, adjustment of typographical errors is permitted before a final version of the abstract is printed in the conference booklet.
- Papers and posters must be delivered as projected in the abstract or represent bona fide developments of the same research.
- Authors are expected to attend the conference in-person and present their own papers and posters. This will not be a hybrid event.
Contact us at NARNiHistSoc(a)gmail.com with any questions.
We invite you to submit your ongoing, published or pre-reviewed works to our workshop on Large Language Models for Cross-Temporal Research (XTempLLMs) at COLM 2025.
Our workshop website is available at https://xtempllms.github.io/2025/
*The deadline for submission has been extended to June 30, 2025 AOE*
Workshop Description:
Large language models (LLMs) have been used for a variety of time-sensitive applications such as temporal reasoning, forecasting and planning. In addition, a growing number of interdisciplinary works use LLMs for cross-temporal research in several domains, including social science, psychology, cognitive science, environmental science and clinical studies. However, LLMs' understanding of time is hindered for several reasons, including temporal biases and knowledge conflicts in pretraining and RAG data, as well as a fundamental limitation of LLM tokenization, which fragments a date into several meaningless subtokens. Such inadequate understanding of time can lead to inaccurate reasoning, forecasting and planning, and to time-sensitive findings that are potentially misleading.
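As a rough illustration of the tokenization issue mentioned above, the following toy sketch segments a date with a greedy longest-match tokenizer. The vocabulary and the function are hypothetical and do not reproduce any real LLM tokenizer; they only show how a date can end up as pieces that carry no temporal meaning on their own.

```python
def greedy_tokenize(text, vocab):
    """Longest-match-first segmentation over a fixed vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing matched: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical subword vocabulary (an assumption made for this illustration).
TOY_VOCAB = {"20", "25", "06", "30", "-"}

print(greedy_tokenize("2025-06-30", TOY_VOCAB))
# → ['20', '25', '-', '06', '-', '30']
```

In this toy setting the year 2025 is split into "20" and "25", neither of which identifies the year by itself.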
Our workshop looks for (i) cross-temporal work in the NLP community and (ii) interdisciplinary work that relies on LLMs for cross-temporal studies.
Cross-temporal work in the NLP community:
* Novel benchmarks for evaluating the temporal abilities of LLMs across diverse date and time formats, culturally grounded time systems, and generalization to future contexts;
* Novel methods (e.g., neuro-symbolic approaches) for developing temporally robust, unbiased, and reliable LLMs;
* Data analysis such as the distribution of pretraining data over time and conflicting knowledge in pretraining and RAG data;
* Interpretability regarding how temporal information is processed from tokenization to embedding across different layers, and finally to model output;
* Temporal applications such as reasoning, forecasting and planning;
* Consideration of cross-lingual and cross-cultural perspectives for linguistic and cultural inclusion over time.
Interdisciplinary work that relies on LLMs for cross-temporal studies:
* Time-sensitive discoveries, such as social biases over time and personality testing over time;
* Assessment of time-sensitive discoveries to identify misleading findings if any;
* Interdisciplinary evaluation benchmarks for LLMs’ temporal abilities, e.g., psychological time perception and episodic memory evaluation.
Submission Modes:
* Standard submissions: We invite the submission of papers that will receive up to three double-blind reviews from the XTempLLMs committee, and a final decision of acceptance from the workshop chairs.
* Pre-reviewed submissions: We invite unpublished papers that have already been reviewed either through ACL ARR, or recent AACL/EACL/ACL/EMNLP/COLING venues. These papers will not receive new reviews but will be judged together with their reviews via a meta-review from the workshop chairs.
* Published papers: We invite papers that have been published recently elsewhere to present at XTempLLMs. Please send the details of your paper (Paper title, authors, publication venue, abstract, and a link to download the paper) directly to xtempllms(a)gmail.com. This allows such papers to gain more visibility from the workshop audience.
All deadlines are 11:59 pm UTC-12 (“Anywhere on Earth”):
* June 30, 2025: Submission deadline (standard and published papers)
* July 18, 2025: Submission deadline for papers with ARR reviews
* July 24, 2025: Notification of acceptance
* October 10, 2025: Workshop day
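For authors who want to see what an "Anywhere on Earth" deadline means in UTC, here is a minimal sketch using only the Python standard library. The helper name is hypothetical; the only assumption taken from the call is that a deadline falls at 11:59 pm in the UTC-12 zone.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" corresponds to a fixed offset of UTC-12.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_in_utc(year, month, day):
    """Return 11:59 pm AoE on the given date, expressed in UTC."""
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)


print(aoe_deadline_in_utc(2025, 6, 30).isoformat())
# → 2025-07-01T11:59:00+00:00
```

In other words, a June 30 AoE deadline closes at 11:59 UTC on July 1.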
Invited Speakers:
* Jose Camacho Collados, Cardiff University, United Kingdom
* Ali Emami, Brock University, Canada
* Alexis Huet, Huawei Technologies, France
* Bahare Fatemi, Google Research, Canada
* Vivek Gupta, Arizona State University, United States
Organizing Committee:
* Wei Zhao, University of Aberdeen, United Kingdom
* Maxime Peyrard, Université Grenoble Alpes & CNRS, France
* Katja Markert, Heidelberg University, Germany
[Apologies for cross-postings]
FIRST CALL FOR PAPERS
LREC 2026
Organised by the ELRA Language Resources Association
Palma, Mallorca, Spain
11-16 May 2026
The Fifteenth biennial Language Resources and Evaluation Conference
(LREC) will be held at the Palau de Congressos de Palma in Palma,
Mallorca, Spain, on 11-16 May 2026. LREC serves as the primary forum for
presentations describing the development, dissemination, and use of
language resources involving both traditional and recently developed
approaches.
The scientific program will include invited talks, oral presentations,
and poster and demo presentations, as well as a keynote address by the
winner of the Antonio Zampolli Prize. Submissions describing all aspects
of language resource development and use are invited, including, but not
limited to, the following:
Language Resource Development
- Methods and tools for mono- and multi-lingual language resource
  development and annotation
- Knowledge discovery/representation (knowledge graphs, linked data,
  terminologies, lexicons, ontologies, etc.)
- Resource development for less-resourced/endangered languages
- Guidelines, standards, best practices, and models for interoperability
Language Resource Use
- Use of language resources in systems and applications for any area
  of language and speech processing
- Use of language resources in assistive technologies, support for
  accessibility
- Efficient/low-resource methods for language and speech processing
Evaluation
- Methodologies and protocols for evaluation and benchmarking of
  language technologies
- Measures for validation of language resources and quality assurance
- Usability of user interfaces and dialogue systems
- Bias, safety, and user satisfaction metrics
- Interpretability/explainability of language models and language and
  speech processing tools
Language Resources and Large Language Models
- Language resource development for LLMs (monolingual, multilingual,
  multimodal)
- (Semi-)automatic generation of training data
- Training, fine-tuning, adaptation, alignment, and representation
  learning
- Guardrails, filters, and modules for generative AI models
Policy and Organizational Considerations
- International and national activities, projects, initiatives, and
  policies
- Language coverage and diversity
- Replicability and reproducibility
- Organisational, economic, ethical, climate, and legal issues
Separate calls will be issued for Workshops, Tutorials and Industry Track.
Submission
Submissions should be 4 to 8 pages in length (excluding references) and
follow the LREC stylesheet, which will soon be available on the
conference website.
At the time of submission, authors are offered the opportunity to share
related language resources with the community. All repository entries
are linked to the LRE Map [https://lremap.elra.info/], which provides
metadata for the resource.
Accepted papers will appear in the conference proceedings, which include
both oral and poster papers in the same format. Determination of the
presentation format (oral vs. poster) is based solely on an assessment
of the optimal method of communication (more or less interactive), given
the paper content.
Important dates
(All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”))
Oral and poster (or poster+demo) paper submission: 17 October 2025
Notification of acceptance: 13 February 2026
Camera Ready due: 6 March 2026
Workshop and tutorial proposals submission: 17 October 2025
LREC 2026 conference: 11-16 May 2026
More information on LREC 2026: https://lrec2026.info/
Contact: info(a)lrec2026.info
The First Workshop on Optimal Reliance and Accountability in Interactions
with Generative Language Models (*ORIGen*) will be held in conjunction with
the Second Conference on Language Modeling (COLM) at the Palais des Congrès
in Montreal, Quebec, Canada, on October 10, 2025!
*The deadline for submission has been extended to June 27, 2025, Anywhere
on Earth.*
With the rapid integration of generative AI, exemplified by large language
models (LLMs), into personal, educational, business, and even governmental
workflows, such systems are increasingly being treated as “collaborators”
with humans. In such scenarios, underreliance or avoidance of AI assistance
may obviate the potential speed, efficiency, or scalability advantages of a
human-LLM team, but simultaneously, there is a risk that subject matter
non-experts may overrely on LLMs and trust their outputs uncritically, with
consequences ranging from the inconvenient to the catastrophic. Therefore,
establishing optimal levels of reliance within an interactive framework is a
critical open challenge as language models and related AI technology rapidly
advance.
* What factors influence overreliance on LLMs?
* How can the consequences of overreliance be predicted and guarded against?
* What verifiable methods can be used to apportion accountability for the
outcomes of human-LLM interactions?
* What methods can be used to imbue such interactions with appropriate levels
of “friction” to ensure that humans think through the decisions they make
with LLMs in the loop?
The ORIGen workshop provides a new venue to address these questions and more
through a multidisciplinary lens. We seek to bring together broad
perspectives from AI, NLP, HCI, cognitive science, psychology, and education
to highlight the importance of mediating human-LLM interactions to mitigate
overreliance and promote accountability in collaborative human-AI
decision-making.
Submissions are due *June 27, 2025*. Please see our call for papers [1] for
more!
[1] https://origen-workshop.github.io/submissions/
Organizers:
- Nikhil Krishnaswamy, Colorado State University
- James Pustejovsky, Brandeis University
- Dilek Hakkani-Tür, University of Illinois Urbana Champaign
- Vasanth Sarathy, Tufts University
- Tejas Srinivasan, University of Southern California
- Mariah Bradford, Colorado State University
- Timothy Obiso, Brandeis University
- Mert Inan, Northeastern University
Dear colleagues,
EUSKORPORA, a newly created Linguistic Data Center for Basque digital technologies based in San Sebastián (Donostia), Spain, is seeking candidates for two key roles in its Technology area:
1) Senior AI and Language Technologies Specialist
2) Junior AI and Language Technologies Specialist
Both positions are part of the Center's mission to position the Basque language in the global digital space through open-source development and cutting-edge research.
=== SENIOR AI AND LANGUAGE TECHNOLOGIES SPECIALIST ===
EUSKORPORA, the Linguistic Data Center for Basque Digital Technologies, a new association based in Donostia/San Sebastián, is seeking an experienced senior expert in AI technologies applied to natural language processing to lead key tasks related to language technologies for the Basque language.
The selected person will be part of an interdisciplinary team and will participate in projects involving the collection, analysis, and annotation of linguistic data, as well as the development of open-source foundational language models (ASR, TTS, MT, NLP) oriented to Basque, in a research and development context closely connected to industry.
Responsibilities:
- Supervise and optimize processes for linguistic corpus collection, annotation, and management
- Lead the design and development of foundational language models applied to Basque (speech recognition, synthesis, translation, text processing, etc.)
- Contribute to the technological architecture of the Center
- Coordinate internal and external teams and mentor junior staff
- Identify innovation opportunities and contribute to proposals, reports, and dissemination
- Establish strategic relationships with ecosystem stakeholders
Requirements:
- Advanced degree (Master's or PhD) in Computational Linguistics, NLP, AI, Computer Engineering, Data Science or related fields
- Minimum 5 years of experience in language or speech technologies
- Proven experience with ASR, TTS, MT, or NLP models
- Strong programming skills in Python and familiarity with frameworks such as Hugging Face, PyTorch, TensorFlow, spaCy, Kaldi, ESPnet, Fairseq
- Knowledge of MLOps, Git, and data science best practices
- Familiarity with open repositories and licensing
Languages:
- Basque: desirable, intermediate level (B2 or higher)
- Spanish: fluent
- English: high level (especially technical)
We offer:
- Participation in strategic national and international projects
- Competitive salary according to experience
- Interdisciplinary environment and opportunities for professional growth
=== JUNIOR AI AND LANGUAGE TECHNOLOGIES SPECIALIST ===
EUSKORPORA, the Linguistic Data Center for Basque Digital Technologies, a new association based in Donostia/San Sebastián, is seeking young professionals at the beginning of their careers to support key tasks related to the creation of linguistic resources and language technologies for the Basque language.
Selected individuals will join an interdisciplinary team and participate in projects involving the collection, annotation, and analysis of linguistic data, as well as the development of open-source foundational language models (ASR, TTS, MT, NLP) oriented to Basque, in a research and development context closely connected to industry.
Responsibilities:
- Support the collection, cleaning and annotation of linguistic corpora (text and audio)
- Assist in the training and evaluation of language and speech models
- Collaborate in the documentation and maintenance of language resources
- Contribute to the integration of open-source NLP tools and libraries
- Assist in reports and dissemination activities
- Work in coordination with technical, linguistic and project management profiles
Requirements:
- Degree or Master's in Computational Linguistics, Computer Engineering, Data Science, or similar
- Basic knowledge of NLP, language models, or speech technologies
- Python programming (basic/intermediate level)
- Familiarity with linguistic annotation or text processing tools
- Experience with Git and frameworks like Hugging Face or spaCy is a plus
Languages:
- Basque: high level (B2 or higher)
- Spanish: fluent
- English: high level (B2 or higher)
We offer:
- Dynamic and innovative environment based in San Sebastián
- Continuous training in cutting-edge technologies
- Real opportunities for growth within the team
- Competitive salary according to training and experience
For further information or to apply, please contact:
info(a)euskorpora.eus
Best regards,
EUSKORPORA
https://www.euskorpora.eus/
+(34) 611 02 81 72
We are pleased to invite submissions for the first Interdisciplinary
Workshop on Observations of Misunderstood, Misguided and Malicious Use of
Language Models (OMMM 2025). The workshop will be held with the RANLP 2025
conference in Varna, Bulgaria, on 11-13 September 2025.
Overview
The use of Large Language Models (LLMs) pervades scientific practices in
multiple disciplines beyond the NLP/AI communities. Alongside benefits for
productivity and discovery, widespread use often entails misuse due to
misalignment of values, lack of knowledge, or, more rarely, malice. LLM
misuse has the potential to cause real harm in a variety of settings.
Through this workshop, we aim to gather researchers interested in
identifying and mitigating inappropriate and harmful uses of LLMs. These
include misunderstood usage (e.g., misrepresentation of LLMs in the
scientific literature); misguided usage (e.g., deployment of LLMs without
adequate training or privacy safeguards); and malicious usage (e.g.,
generation of misinformation and plagiarism). Sample topics are listed
below, but we welcome submissions on any domain related to the scope of the
workshop.
Important Dates
Submission deadline *[NEW]*: *15 July 2025*, at 23:59 Anywhere on Earth
Notification of acceptance: 01 August 2025
Camera-ready papers due: 30 August 2025
Workshop dates: September 11, 12, or 13, 2025
Submission Guidelines
Submissions will be accepted as short papers (4 pages) and as long papers
(8 pages), plus additional pages for references. All submissions undergo a
double-blind review, so they should not include any identifying
information. Submissions should conform to the RANLP guidelines; for
further information and templates, please see
https://ranlp.org/ranlp2025/index.php/submissions/
We welcome submissions from diverse disciplines, including NLP and AI,
psychology, HCI, and philosophy. We particularly encourage reports on
negative results that provide interesting perspectives on relevant topics.
In-person presenters will be prioritised when selecting submissions to be
presented at the workshop, but the workshop will take place in a hybrid
format. Accepted papers will be included in the workshop proceedings in the
ACL Anthology.
Papers should be submitted on the RANLP conference system at
https://softconf.com/ranlp25/OMMM2025/
Keynote Speaker
We are excited to have Dr. Stefania Druga as the keynote speaker for the
inaugural OMMM workshop. Dr. Druga is a Research Scientist at Google
DeepMind, where she designs novel multimodal AI applications.
Topics of Interest
We welcome paper submissions on all topics related to inappropriate and
harmful uses of LLMs, including but not limited to:
- Misunderstood use (and how to improve understanding):
  - Misrepresentation of LLMs (e.g., anthropomorphic language)
  - Attribution of consciousness
  - Interpretability
  - Overreliance on LLMs
- Misguided use (and how to find alternatives):
  - Underperformance and inappropriate applications
  - Structural limitations and ethical considerations
  - Deployment without proper training or safeguards
- Malicious use (and how to mitigate it):
  - Adversarial attacks, jailbreaking
  - Detection and watermarking of machine-generated content
  - Generation of misinformation or plagiarism
  - Bias mitigation and trust design
For more information, please refer to the workshop website:
https://ommm-workshop.github.io/2025/. For any questions, please contact
the organisers at ommm-workshop(a)googlegroups.com.
The organisers,
Piotr Przybyła, Universitat Pompeu Fabra
Matthew Shardlow, Manchester Metropolitan University
Clara Colombatto, University of Waterloo
Nanna Inie, IT University of Copenhagen
[Apologies for cross-posting]
Terminology Translation Task at WMT2025 - Call for Participation
We are excited to announce the third Shared Task on Terminology Translation<https://www2.statmt.org/wmt25/terminology.html>, which will be run within the 10th Conference on Machine Translation (WMT2025) in Suzhou, China.
TL;DR:
- We test the sentence-level and document-level translation of the texts in finance and IT domains, given the explicit terminology.
- The language pairs are: English -> {Spanish, German, Russian, Chinese}, Chinese -> English.
- We evaluate the overall quality of translation, terminology success rate and consistency. Additionally, we compare the performance of systems when given no terminology, proper terminology, and random terms.
- The task starts on 20th June 2025 AOE; the submission deadline is 20th July 2025 AOE.
- Please pre-register via Google Forms here: https://forms.gle/ZSn2pNJkQJAzHFnA6 .
OVERVIEW
The advances in neural MT and LLM-assisted translation over the last decade have brought nearly human quality to general-domain translation, at least for high-resource languages. However, when it comes to specialized domains like science, finance, or legal texts, where the correct and consistent use of special terms is crucial, the task is far from being solved. The Terminology Shared Task aims to assess the extent to which machine translation models can utilize additional information regarding the translation of terminologies. Compared to the two previous editions, 2021 and 2023, the new test data have more varied test cases, are more consistent in domains for each translation direction, and are broader in language coverage.
TASK DESCRIPTION
Track №1: Sentence/Paragraph-Level Translation
You will be provided with a sequence of input sentences, and small terminology dictionaries that correspond only to the terms present in the given sentence.
Language Pairs:
* en-de (English → German)
* en-ru (English → Russian)
* en-es (English → Spanish)
Domain: information technology
Track №2: Document-Level Translation
The setup is similar to Track №1, with two exceptions: the length of the input texts now equals the document, and the dictionaries correspond to the whole set of input texts (i.e. they are corpus-level). This makes the task close to the real-life setup (where the dictionaries exist independently from the texts), while it may complicate the implementation (since for the solutions that require storing the whole dictionary it will take more memory). Additionally, for the whole document setup, the problem of the consistent usage of terms is becoming more important.
Language Pairs:
* en-zh-Hant (English → Traditional Chinese)
* zh-Hant-en (Traditional Chinese → English)
Domain: finance
EVALUATION
Terminology Modes:
You are expected to compare your system’s performance under three modes:
1. No terminology: the system is only provided with input sentences/documents.
2. Proper terminology: the system is provided with input texts (same as 1.) and dictionaries of the format {source_term: target_term}.
3. Random terminology: the system is provided with input texts and translation dictionaries of the same format as in 2. The difference is that the dictionary items are not special terms but words randomly drawn from input texts. This mode is of special interest since we want to measure to what extent the proper term translations help to improve the system performance (2.), as opposed to an arbitrary broader input that does not contain the domain-specific terminology.
Metrics:
1. Overall Translation Quality: we will evaluate the general aspects of machine translation outputs such as fluency, adequacy and grammaticality. We will do that with the general MT automatic metrics such as BLEU or COMET. In addition to that, we will pay special attention to the grammaticality of the translated terms.
2. Terminology Success Rate: This metric assesses the ability of the system to accurately translate technical terms given the specialized vocabulary. This will be carried out by comparing the occurrences of the correct term translations (i.e. the ones present in the dictionary) to the output terms. The goal is to have a higher success rate that will show adherence to dictionary translations.
3. Terminology Consistency: for domains such as science or legal texts, the consistent use of an introduced term throughout the text is crucial. In other words, we want a system to not only pick up a correct term in a target language but to use it consistently once it is chosen. This will be evaluated by comparing all translations of a given source term in a text and measuring the percentage of deviations from the most consistent translation. This metric is more important for the Document-Level track, but it will be used for both tracks.
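As a rough illustration of metrics 2 and 3 above, the sketch below computes a terminology success rate and a consistency deviation rate. It is not the official scoring script; the function names, the case-insensitive substring matching, and the assumption that each source term's renderings have already been extracted from the output are all simplifications made for this example.

```python
from collections import Counter


def term_success_rate(outputs, term_dicts):
    """Share of dictionary entries whose target term occurs in the output.

    outputs:    list of translated texts (one per input segment)
    term_dicts: list of {source_term: target_term} dicts, aligned with outputs
    Matching is a naive case-insensitive substring test (an assumption);
    it ignores inflected forms of the target term.
    """
    hits = 0
    total = 0
    for output, terms in zip(outputs, term_dicts):
        lowered = output.lower()
        for target_term in terms.values():
            total += 1
            if target_term.lower() in lowered:
                hits += 1
    return hits / total if total else 0.0


def deviation_rate(renderings):
    """Share of occurrences of one source term that deviate from its
    most frequent rendering within a text.

    renderings: target-side translations chosen for the same source term,
    one entry per occurrence (assumed to be extracted beforehand).
    """
    if not renderings:
        return 0.0
    counts = Counter(r.lower() for r in renderings)
    most_frequent = counts.most_common(1)[0][1]
    return 1 - most_frequent / len(renderings)


# Hypothetical en-de example data, invented for illustration only.
outputs = [
    "Der Server startet den Prozess neu.",
    "Die Firewall blockiert den Port.",
]
term_dicts = [
    {"server": "Server", "process": "Prozess"},
    {"firewall": "Firewall", "port": "Anschluss"},
]

success = term_success_rate(outputs, term_dicts)
deviation = deviation_rate(["Prozess", "Prozess", "Vorgang", "Prozess"])
print(success, deviation)
```

In this toy data three of the four dictionary targets appear in the outputs, and one of four occurrences of the same source term departs from its most frequent rendering. A real evaluation would need lemmatisation or alignment to handle inflected terms.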
IMPORTANT DATES
All dates are end of Anywhere on Earth (AoE).
Data snippets released: 7th May 2025
Dev data released: 22nd May 2025
Test data release, task starts: 20th June 2025 (postponed)
Submission deadline: 20th July 2025 (postponed)
Paper submission to WMT25: in line with WMT25
Camera-ready submission to WMT25: in line with WMT25
Conference in Suzhou, China: 05-09 November 2025
SUBMISSION GUIDELINES
0. Please notify us about your participation prior to submission. This is optional, but it will help us estimate our workload after submission. Please do so through this Google Form: https://forms.gle/ZSn2pNJkQJAzHFnA6
1. Check your submission files with the validation script. It will be published when the test data are released.
2. Write a description of your system (optional).
3. Submit your system via Google Forms. The Google Form with all necessary submission details will be published when the test set is released.
All details on submission as well as FAQ can be found at the webpage of the shared task.
ORGANIZERS
* Kirill Semenov (University of Zurich), main contact: FirstNаmе [dоt] LаstNаmе {аt} uzh /dоt/ ch
* Nathaniel Berger (Heidelberg University)
* Pinzhen Chen (University of Edinburgh & Aveni.ai)
* Xu Huang (Nanjing University)
* Arturo Oncevay (JP Morgan)
* Dawei Zhu (Amazon)
* Vilém Zouhar (ETH Zurich)
WEBSITE: https://www2.statmt.org/wmt25/terminology.html
For any queries, please send an email to Kirill Semenov (see email above).
Call for papers: The First Workshop on Natural Language Processing and Language Models for Digital Humanities
(LM4DH_2025) @ RANLP_2025
Date: 11th to 13th September 2025 (TBC)
Venue: Varna, Bulgaria
Website: https://www.clarin.eu/event/2025/clarin-workshop-ranlp-2025
Submissions Portal: https://softconf.com/ranlp25/LM4DH2025/
Digital Humanities has emerged as an interdisciplinary field of research at the intersection of computer science and many other fields such as linguistics, social sciences, history, and psychology. With the development of Large Language Models (LLMs), performance on Natural Language Processing (NLP) tasks such as entity recognition, sentiment analysis, and text summarisation has improved significantly. These developments offer powerful tools for analysing and interpreting complex historical, cultural, and social data, including oral histories, archival documents, and literary texts, enabling researchers to identify patterns, extract meaningful relationships, and generate interpretations at unprecedented scale and precision.
This workshop aims to provide a common platform for researchers, practitioners, and students from diverse disciplines to collaboratively explore and apply AI-driven techniques in the Digital Humanities. Through interdisciplinary discussion, the event aims to generate creative approaches, exchange best practices, and create a community committed to furthering AI-based research on human culture and history. The focus of the workshop is on applying natural language processing techniques to digital humanities research. The topics can be anything of digital humanities interest with a natural language processing or LLM-based application. We expect contributions related (but not limited) to the following topics:
* Text analysis and processing related to the humanities using computational methods
* Usage of the interpretability of large language models' output for DH-related tasks
* Dataset creation and curation for NLP (e.g. digitisation, datafication, and data preservation)
* Automatic error detection, correction, and normalisation of textual data
* Generation and analysis of literary works such as poetry and novels
* Analysis and detection of text genres
* Emotion analysis for the humanities and literature
* Modelling of information and knowledge in the Humanities, Social Sciences, and Cultural Heritage
* Low-resource and historical language processing
* Search for scientific and/or scholarly literature
* Profiling and authorship attribution
Submission & Publication
All papers must represent original and unpublished work that is not currently under review. Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop.
Submissions must follow the RANLP 2025 submission guidelines<https://ranlp.org/ranlp2025/index.php/submissions/>, using ACL-style templates (LaTeX or MS Word).
Papers must be submitted using SoftConf at https://softconf.com/ranlp25/LM4DH2025/
All papers will be double-blind peer reviewed. Authors of the accepted papers will present their work in either the oral or poster session. All accepted papers will appear in the workshop proceedings that will be published in ACL Anthology.
Important Dates
* Paper submission deadline: 20th July 2025
* Notification of acceptance: 2nd August 2025
* Camera-ready paper: 20th August 2025
* Workshop date: 11th September 2025
Organising Committee
* Isuri Anuradha, Lancaster University, UK
* Francesca Frontini, CNR-ILC, Italy & CLARIN ERIC
* Paul Rayson, Lancaster University, UK
* Ruslan Mitkov, Lancaster University, UK
* Deshan Sumanathilake, Swansea University, UK
This workshop has been organised with the generous support and coordination of CLARIN-EU.
Email: dhranlp2(a)gmail.com
*Call for Participation in Tracks*
*FIRE 2025: 17th meeting of the Forum for Information Retrieval Evaluation*
Indian Institute of Technology (BHU) Varanasi
17th - 20th December 2025
Website: fire.irsi.org.in <http://fire.irsi.org.in/>
*Call for Participation in Tracks*
FIRE 2025 offers the following exciting tracks this year:
* Cross-Lingual Mathematical Information Retrieval (CLMIR)
<https://clmir2025.github.io/>
* Code-Mixed Information Retrieval from Social Media Data (CMIR)
<https://cmir-iitbhu.github.io/cmir/index.html>
* Hate Speech and Offensive Content Identification in Memes in
Bengali, Hindi, Gujarati and Bodo (HASOC-meme)
<https://hasocfire.github.io/hasoc/2025/>
* Information Retrieval in Software Engineering (IRSE)
<https://sites.google.com/view/irse-2025/home>
* Misinformation Detection and Prompt Recovery (PROMID)
<https://promid.github.io/index.html>
* Multilingual Story Illustration: Bridging Cultures through AI
Artistry (MUSIA) <https://cse-iitbhu.github.io/MUSIA/index.html>
* Offensive Language Identification in Dravidian Languages
(DravidianCodeMix)
<https://dravidian-codemix.github.io/2025/dataset.html>
* Opinion Extraction and Question Answering from
CryptoCurrency-Related Tweets and Reddit posts (CryptOQA)
<https://sites.google.com/view/cryptoqa-2025/>
* Research Highlight Generation from Scientific Papers (SciHigh)
<https://sites.google.com/jadavpuruniversity.in/scihigh2025/home>
* Spoken-Query Cross-Lingual Information Retrieval for the Indic
Languages (SqCLIR) <https://sites.google.com/view/sqclir-2025>
* Varanasi Tourism in Question Answer System (VATIKA)
<https://sites.google.com/view/vatika-2025/>
* Word-Level Identification of Languages in Dravidian Languages (WILD)
<https://www.codabench.org/competitions/7902/>
Research groups are invited to participate in the experiments. Please
register directly with the organizers.
FIRE 2025 is the 17th edition of the annual meeting of the Forum for
Information Retrieval Evaluation (fire.irsi.org.in). Since its inception
in 2008, FIRE has had a strong focus on shared tasks similar to those
offered at evaluation forums like TREC, CLEF, and NTCIR. The shared
tasks focus on solving specific problems in the area of information
access and, more importantly, help in generating evaluation datasets for
the research community.
Visit fire.irsi.org.in <http://fire.irsi.org.in>
The 2nd Workshop on DHOW: Diffusion of Harmful Content on Online Web
The workshop will be conducted in a *hybrid* format to ensure maximum
participation, accommodating attendees both *online* and in person.
Submission deadline: *July 11 2025 AOE*
*Workshop site*: https://dhow-workshop.github.io/2025/
*Co-located with ACMMM 2025*
https://acmmm2025.org/
Dublin, Ireland, 27-31 October 2025
*Important Dates*
Submission deadline: extended to *July 11, 2025*
Notification of acceptance: August 01, 2025
Camera-ready papers due: August 11, 2025
Workshop date: October 27/28, 2025
*Workshop Description*
With the advancement of digital technologies and gadgets, online content
is easily accessible. At the same time, harmful content also spreads.
Different kinds of harmful content appear on different platforms and in
multiple languages. The topic is broad and covers multiple research
directions, but from the user's perspective, all of these kinds of harm
affect them at once. Harmful content is often studied one type at a
time, such as misinformation or hate speech, and much research covers a
single platform, a single language, and a single issue. This lets
spreaders of harmful content switch platforms and languages to reach
their user base. Harmful content is not limited to social media; it
also appears in news media. Spreaders share it in posts, news articles,
comments, and hyperlinks. There is therefore a need to study harmful
content across platforms, languages, modalities, and topics.
We will bring the research on harmful content under one umbrella so that
research on different topics (hate speech, misinformation,
disinformation, self-harm, offensive content, etc.) can bring some novel
methods and recommendations for users, leveraging text analysis with
image, audio, and video recognition to detect harmful content in diverse
formats. The workshop will cover the ongoing issue of war or elections
in 2025.
We believe this workshop will provide a unique opportunity for
researchers and practitioners to exchange ideas, share the latest
developments, and collaborate on addressing the challenges associated
with harmful content spreading across the Web. We expect that the
workshop will generate insights and discussions that will help advance
the field of societal artificial intelligence (AI) for the development
of a safer internet. In addition to attracting high-quality research
contributions, one of the aims of the workshop is to mobilise
researchers working in related areas to form a community.
*Submissions Topics*
• Studying different types of harmful content
• Computational fact-checking & Misinformation Detection
• Role of Generative AI in Mitigating Harmful Content
• Harassment, Bullying, and Hate Speech Detection
• Explainable AI for Harmful Content Analysis
• Multimodal and Multilingual Harmful Content Detection, such as fake
news, spam, and troll detection
• Deepfake and Synthetic Media
• Ethical & Societal Implications of AI in Content Moderation
• Both Qualitative and Quantitative studies on harmful content
• Psychological effects of harmful content, such as on mental health
• Approaches for data collection or data annotation using multimodal
large models on harmful content
• User studies on the effects of harmful content on human beings
*Submissions*
- Submission Instructions: https://dhow-workshop.github.io/2025/#call
- Submission Link:
https://openreview.net/group?id=acmmm.org/ACMMM/2025/Workshop/DHOW
*Workshop organizers*
• Thomas Mandl (University of Hildesheim, Germany)
• Haiming Liu (University of Southampton, United Kingdom)
• Gautam Kishore Shahi (University of Duisburg-Essen, Germany)
• Amit Kumar Jaiswal (University of Surrey, United Kingdom)
• Durgesh Nandini (University of Bayreuth, Germany)
DHOW 2025
Ethical LLMs 2025: The first Workshop on Ethical Concerns in Training, Evaluating and Deploying Large Language Models<https://sites.google.com/view/ethical-llms-2025> @ RANLP2025<https://ranlp.org/ranlp2025/>
Call for papers:
Scope
Large Language Models (LLMs) represent a transformative leap in Artificial Intelligence (AI), delivering remarkable language-processing capabilities that are reshaping how we interact with technology in our daily lives. With their ability to perform tasks such as summarisation, translation, classification, and text generation, LLMs have demonstrated unparalleled versatility and power. Drawing from vast and diverse knowledge bases, these models hold the potential to revolutionise a wide range of fields, including education, media, law, psychology, and beyond. From assisting educators in creating personalised learning experiences to enabling legal professionals to draft documents or supporting mental health practitioners with preliminary assessments, the applications of LLMs are both expansive and profound.
However, alongside their impressive strengths, LLMs also face significant limitations that raise critical ethical questions. Unlike humans, these models lack essential qualities such as emotional intelligence, contextual empathy, and nuanced ethical reasoning. While they can generate coherent and contextually relevant responses, they do not possess the ability to fully understand the emotional or moral implications of their outputs. This gap becomes particularly concerning when LLMs are deployed in sensitive domains where human values, cultural nuances, and ethical considerations are paramount. For example, biases embedded in training data can lead to unfair or discriminatory outcomes, while the absence of ethical reasoning may result in outputs that inadvertently harm individuals or communities. These limitations highlight the urgent need for robust research in Natural Language Processing (NLP) to address the ethical dimensions of LLMs. Advancements in NLP research are crucial for developing methods to detect and mitigate biases, enhance transparency in model decision-making, and incorporate ethical frameworks that align with human values. By prioritising ethics in NLP research, we can better understand the societal implications of LLMs and ensure their development and deployment are guided by principles of fairness, accountability, and respect for human dignity. This workshop will dive into these pressing issues, fostering a collaborative effort to shape the future of LLMs as tools that not only excel in technical performance but also uphold the highest ethical standards.
Submission Guidelines
We follow the RANLP 2025 standards for submission format and guidelines. EthicalLLMs 2025 invites the submission of long papers, up to eight pages in length, and short papers, up to six pages in length. These page limits only apply to the main body of the paper. At the end of the paper (after the conclusions but before the references) papers need to include a mandatory section discussing the limitations of the work and, optionally, a section discussing ethical considerations. Papers can include unlimited pages of references and an unlimited appendix.
To prepare your submission, please make sure to use the RANLP 2025 style files available here:
* Latex<https://ranlp.org/ranlp2025/wp-content/uploads/2025/05/ranlp2025-LaTeX.zip>
* Word<https://ranlp.org/ranlp2025/wp-content/uploads/2025/05/ranlp2025-word.docx>
Papers should be submitted through Softconf/START using the following link: https://softconf.com/ranlp25/EthicalLLMs2025/
Topics of interest
The workshop invites submissions on a broad range of topics related to the ethical development and evaluation of LLMs, including but not limited to the following.
1. Bias Detection and Mitigation in LLMs
Research focused on identifying, measuring, and reducing social, cultural, and algorithmic biases in large language models.
2. Ethical Frameworks for LLM Deployment
Approaches to integrating ethical principles, such as fairness, accountability, and transparency, into the development and use of LLMs.
3. LLMs in Sensitive Domains: Risks and Safeguards
Case studies or methodologies for deploying LLMs in high-stakes fields such as healthcare, law, and education, with an emphasis on ethical implications.
4. Explainability and Transparency in LLM Decision-Making
Techniques and tools for improving the interpretability of LLM outputs and understanding model reasoning.
5. Cultural and Contextual Understanding in NLP Systems
Strategies for enhancing LLMs’ sensitivity to cultural, linguistic, and social nuances in global and multilingual contexts.
6. Human-in-the-Loop Approaches for Ethical Oversight
Collaborative models that involve human expertise in guiding, correcting, or auditing LLM behaviour to ensure responsible use.
7. Mental Health and Emotional AI: Limits of LLM Empathy
Discussions on the role of LLMs in mental health support, highlighting the boundary between assistive technology and the need for human empathy.
Organisers
Damith Premasiri – Lancaster University, UK
Tharindu Ranasinghe – Lancaster University, UK
Hansi Hettiarachchi – Lancaster University, UK
Contact
If you have any questions regarding the workshop, please contact Damith: d.dolamullage(a)lancaster.ac.uk
Dear all,
We are currently doing a project aiming to make querying in syntactically annotated corpora easier and more accessible.
For this purpose, we want to know what researchers are actually searching for.
If you have a minute of your time, please feel free to fill out this form.
https://forms.office.com/e/a8DgETSabB
Feel free to reach out to ekavol(a)chalmers.se or nikdew(a)chalmers.se if you have any further questions.
Best regards
Niklas Deworetzki & Katja Voloshina
PhD Students
Department of Computer Science and Engineering
Chalmers University of Technology | University of Gothenburg
SE-412 96 Göteborg, Sweden
www.gu.se<http://www.gu.se/>
www.chalmers.se<http://www.chalmers.se/>
CLEF 2025 – Registration Open
Conference and Labs of the Evaluation Forum
We are pleased to announce CLEF 2025, taking place 9–12 September 2025 in Madrid, Spain at UNED. This peer‑reviewed conference and associated labs foster research in multilingual, multimodal, and cross‑language information access https://clef2025.clef-initiative.eu/.
Register now – Early‑bird registration is open! Standard registration opened earlier this year, and early-bird rates are currently available.
Why attend?
* Present and discuss original research at the main conference.
* Engage in innovative labs and challenges, including LifeCLEF, ImageCLEF, EXIST, eRisk, CheckThat!, and more https://clef2025.clef-initiative.eu/index.php?page=Pages/labs.html.
* Benefit from rich networking with academic and industry experts in IR, NLP, multimedia retrieval, and evaluation sciences.
For detailed conference and lab registration, registration deadlines, and pricing, please visit the official site: https://clef2025.clef-initiative.eu/index.php?page=Pages/registrationConfer…
Important Dates
* Early‑bird registration ongoing
* Registration closes: 31 August 2025
* Conference & labs: 9–12 September 2025, Madrid, Spain
We look forward to welcoming participants from across the global community — see you this September in Madrid at CLEF 2025!
Jorge Carrillo-de-Albornoz
On behalf of the CLEF 2025 Organising Committee
Apologies for cross-posting.
---------------------------------------------------------------------------
*CALL FOR PAPERS: Language Resources and Evaluation Journal- Special Issue
on Machine Translation for Low-Resource Languages*
https://link.springer.com/collections/gbdgacbgbg
*Guest Editors:*
- Atul Kr. Ojha (Insight Research Ireland Centre for Data Analytics,
DSI, University of Galway, Ireland)
- Chao-Hong Liu (Industrial Technology Research Institute, Potamu
Research Ltd.)
- Ekaterina Vylomova (University of Melbourne, Australia)
- Flammie Pirinen (UiT The Arctic University of Norway, Tromsø)
- Jonathan Washington (Swarthmore College, USA)
- Nathaniel Oco (De La Salle University, Philippines)
- Xiaobing Zhao (Minzu University of China)
Machine translation (MT) technologies have been improved significantly in
the last decade using neural MT (NMT) approaches. However, most of these
methods rely on the availability of large parallel data for training the MT
systems, resources which are not available for the majority of language
pairs. Hence, current technologies often fall short in their ability to be
applied to low-resource languages. Developing MT technologies using
relatively small corpora still presents a major challenge for the MT
community. In addition, many methods for developing MT systems still rely
on several natural language processing (NLP) tools to pre-process texts in
source languages and post-process MT outputs in target languages. The
performance of these tools often has a great impact on the quality of the
resulting translation. The availability of MT technologies and NLP tools
can facilitate equal access to information for the speakers of a language
and determine on which side of the digital divide they will end up. The
lack of these technologies for many of the world's languages provides
opportunities both for the field to grow and for making tools available for
speakers of low-resource languages.
In the past few years, several workshops and evaluations have been
organized to promote research on low-resource languages. NIST conducted
Low Resource Human Language Technology evaluations (LoReHLT)
annually from 2016 to 2019. In LoReHLT evaluations, there is no training
data in the evaluation language. Participants receive training data in
related languages but need to bootstrap systems in the surprise evaluation
language at the start of the evaluation. Methods for this include pivoting
approaches and taking advantage of linguistic universals. The evaluations
are supported by DARPA's Low Resource Languages for Emergent Incidents
(LORELEI) program, which seeks to advance technologies that are less
dependent on large data resources and that can be quickly pivoted to new
languages within a very short amount of time so that information from any
language can be extracted in a timely manner to provide situation awareness
to emergent incidents. There are also the Workshop on Technologies for MT
of Low-Resource Languages (LoResMT), the Special Interest Group on
Under-resourced Languages (SIGUL), the Workshop on Resources and
Technologies for Indigenous, Endangered and Lesser-resourced Languages in
Eurasia (EURALI), the Workshop on Deep Learning Approaches for Low-Resource
Natural Language Processing (DeepLo), AfricaNLP, TurkLang, the Conference
on Machine Translation (WMT), and the International Conference on Spoken
Language Translation (IWSLT), which provide venues for sharing research
and development work in this field.
This topical collection solicits original research papers on MT
systems/methods and related NLP tools for low-resource languages in
general. LoReHLT, LORELEI, LoResMT, SIGUL, EURALI, DeepLo, WMT, and IWSLT
participants are very welcome to submit their work to the special issue.
Summary papers on MT research for specific low-resource languages, as well
as extended versions (>40% difference) of published papers from relevant
conferences/workshops, are also welcome.
Topics of the special issue include, but are not limited to:
* Research and review papers on MT systems/methods for low-resource
languages
* Research and review papers on pre-processing and/or post-processing NLP
tools for MT
* Word tokenizers/de-tokenizers for low-resource languages
* Word/morpheme segmenters for low-resource languages
* Use of morphological analyzers and/or morpheme segmenters in MT
* Multilingual/cross-lingual NLP tools for MT
* Review of available corpora of low-resource languages for MT
* Pivot MT for low-resource languages
* Zero-shot MT for low-resource languages
* Fast building of MT systems for low-resource languages
* Re-usability of existing MT systems and/or NLP tools for low-resource
languages
* Machine translation for language preservation
* Techniques that work across many languages and modalities
* Techniques that are less dependent on large data resources
* Use of language-universal resources
* Bootstrap-trained resources for the short development cycle
* Entity, relation- and event-extraction
* Sentiment detection in MT
* MT Summarisation
* Processing diverse languages, genres (news, social media, etc.) and
modalities (text, speech, video, etc.)
* Speech Translation for low-resource languages
* Multimodal MT for low-resource languages
* MT models using LLMs for low-resource languages
* Generative AI models for low-resource languages
* Evaluation metrics and datasets for low-resource languages
For further information on this initiative, please refer to
https://link.springer.com/collections/gbdgacbgbg
*IMPORTANT DATES*
*August 26, 2025: Paper submission deadline*
*December 05, 2025: Revised papers due*
*March 2026: Publication*
* SUBMISSION GUIDELINES*
Authors should follow the "Instructions for Authors"
(https://link.springer.com/journal/10579/submission-guidelines or Overleaf
<https://link.springer.com/journal/10579/updates/17234296>) on the LRE
journal website <https://link.springer.com/journal/10579>.
Thanks,
In this newsletter:
LDC data and commercial technology development
New publications:
Chinese Sentence Pattern Structure Treebank<https://catalog.ldc.upenn.edu/LDC2025T06>
IWSLT 2022-2023 Shared Task Training, Development and Test Set<https://catalog.ldc.upenn.edu/LDC2025S05>
KAIROS Schema Learning Complex Event Annotation<https://catalog.ldc.upenn.edu/LDC2025T07>
________________________________
LDC data and commercial technology development
For-profit organizations are reminded that an LDC membership is a pre-requisite for obtaining a commercial license to almost all LDC databases. Non-member organizations, including non-member for-profit organizations, cannot use LDC data to develop or test products for commercialization, nor can they use LDC data in any commercial product or for any commercial purpose. LDC data users should consult corpus-specific license agreements for limitations on the use of certain corpora. Visit the Licensing<https://www.ldc.upenn.edu/data-management/using/licensing> page for further information.
________________________________
New publications:
Chinese Sentence Pattern Structure Treebank<https://catalog.ldc.upenn.edu/LDC2025T06> was developed at Beijing Normal University<https://english.bnu.edu.cn/> and Peking University<https://english.pku.edu.cn/>. It contains 5,016 sentences and 119,627 tokens syntactically annotated following the concept of sentence constituent analysis which emphasizes sentence pattern structure. The source data consists of 27 chapters extracted from modern Mandarin and ancient Chinese works. There are three annotation layers: lexical sense and structural mode for dynamic words; syntactic structure for clauses; and inter-clause relation within complex sentence and sentence clusters. These structures can be visualized using the Jbw-viewer tool<https://github.com/bnucip/jbwviewer> which is included in the release.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
*
IWSLT 2022 - 2023 Shared Task Training, Development and Test Set<https://catalog.ldc.upenn.edu/LDC2025S05> was developed by LDC and contains 210 hours of Tunisian<https://catalog.ldc.upenn.edu/LDC2025S05> Arabic conversational telephone speech, transcripts, English translations, speaker metadata, and documentation. This material constitutes the training, development, and test data used in the International Conference on Spoken Language Translation (IWSLT) Dialectal Speech Translation task (2022)<https://iwslt.org/2022/dialect> and the Dialectal and Low-resource track (2023)<https://iwslt.org/2023/low-resource>.
The telephone speech was collected by LDC in 2016-2017 from native speakers of Tunisian Arabic in Tunis. Speakers were recruited to make telephone calls to people in their social networks from a variety of noise conditions and handsets. Transcripts are orthographic following Buckwalter<https://catalog.ldc.upenn.edu/LDC2004L02> transliteration and cover 175 hours of the collected speech. IPA transcripts were added to a subset of the data. All transcribed segments were translated into English.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
*
KAIROS Schema Learning Complex Event Annotation<https://catalog.ldc.upenn.edu/LDC2025T07> was developed by LDC to support the DARPA KAIROS program. It contains English and Spanish text, audio, video, and image data labeled for 93 real-world complex events with event, relation, and argument annotations linking to document provenance. Source data was collected from the web; 3431 root web pages were collected and processed, yielding 1919 text data files, 24019 image files, 1472 video files, and 16 audio files.
The DARPA KAIROS (Knowledge-directed Artificial Intelligence Reasoning Over Schemas) program aimed to build technology capable of understanding and reasoning about complex real-world events in order to provide actionable insights to end users. KAIROS systems utilized formal event representations in the form of schema libraries that specified the steps, preconditions, and constraints for an open set of complex events; schemas were then used in combination with event extraction to characterize and make predictions about real-world events in a large, multilingual, multimedia corpus.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
Dear CLIN enthusiasts,
We are extending the submission deadline for CLIN abstracts by one week. The new, final deadline is June 20th. Below you can find the original call for abstracts with a modified date.
Website: https://clin35.ccl.kuleuven.be/
We invite submissions for CLIN35, the 35th edition of the Computational Linguistics in the Netherlands (CLIN) conference, which will take place in Leuven on September 12th, 2025.
Abstracts describing theoretical or applied research in any area of computational linguistics and natural language processing are welcome. We especially encourage submissions related to the Dutch language, but contributions on other languages and multilingual approaches are equally welcome. Abstracts must be written in English and should not exceed 500 words.
Submissions should include:
* Name and affiliation of each author
* Contact details
* Presentation title and short abstract (max. 500 words)
* Keywords
* Your presentation format preference (We will do our best to accommodate your preference but may need to make changes to provide a well-balanced program)
Abstracts must be submitted via the form on the website<https://clin35.ccl.kuleuven.be/call-for-abstracts> by Friday, 20th of June 2025. Notifications of acceptance will be sent out by Friday, 4th of July 2025. Accepted abstracts will be presented at the conference as oral or poster presentations. Authors with accepted abstracts will also have the opportunity to submit a full paper after the conference for publication in the CLIN Journal<https://www.clinjournal.org/clinj/>.
Please share this call with your interested colleagues and network! For any questions you can reach us at this email address (clin35(a)kuleuven.be<mailto:clin35@kuleuven.be>).
We look forward to your submissions and to welcoming you to CLIN35!
CLIN35 local organizers
******************************************************
********* EVALITA 2026: Call for tasks *********
******* NEW DEADLINES and TIMELINE ******
******************************************************
EVALITA 2026 is an initiative of AILC (Associazione Italiana di Linguistica
Computazionale, https://www.ai-lc.it/).
As in previous editions (https://www.evalita.it/), EVALITA 2026 will be
organized along a few selected tasks, which provide participants with
opportunities to discuss and explore both emerging and traditional areas of
Natural Language Processing and Speech. Participation is encouraged for teams
working in both academic institutions and industrial organizations.
TASK PROPOSAL SUBMISSION
Task proposals should be no longer than 4 pages and should include:
- task title and acronym;
- names and affiliation of the organizers (minimum 2 organizers);
- brief task description, including motivations and state of the art;
- explanation of the international relevance of the task;
- description and examples of the data, including information about their
availability, development stage, and issues concerning privacy and data
sensitivity. The examples are mandatory because they are intended to give
potential participants an idea of what the task data will look like, how
it’ll be formatted, etc.;
- expected number of participants and attendees;
- names and contact information of the organizers.
We also accept the re-annotation/expansion of datasets from previous years
and previous challenges with new annotation levels, and texts from publicly
available corpora. However, test annotations must be new and unpublished,
as participants must not have access to the test data annotations until the
end of the EVALITA campaign. For new tasks, organizers must specify in the
proposal why it would attract a reasonable number of participants, and why
it is needed. For re-runs, organizers must describe the element of novelty
from previous challenges.
In submitting your proposal, please bear in mind that we strongly encourage:
- tasks that pose non-trivial challenges and stimulate the creation of
innovative systems (i.e., that integrate linguistic insights or external
knowledge sources), rather than being easily addressed by off-the-shelf LLM
prompting techniques;
- tasks focused on multimodality, e.g., considering both textual and
visual or any other modality;
- tasks characterized by different levels of complexity, e.g., with a
straightforward main subtask and one or more sophisticated additional
subtasks;
- competitive baselines (e.g., small-scale LLMs in zero-shot setups), which
participants are expected to improve upon, in order to encourage the design
of advanced solutions;
- application-oriented tasks, that is, tasks that showcase a clearly defined
end-user application;
- multilingual tasks, i.e., with data both in Italian and in other
languages;
- industrial tasks, i.e., tasks with real data provided by companies.
The organizers of the accepted tasks should take care of planning,
according to the scheduled deadlines (see below):
- the development and distribution of the datasets needed for the contest,
i.e., data for training and development, and data for testing; the scorer to
be used to evaluate the submitted systems should be included in the release
of development data;
- the development of task guidelines, where all the instructions for
participation are made clear, together with a detailed description of the
data and the evaluation metrics applied to the participants' results;
- the collection of participants' results;
- the evaluation of participants' results according to standard metrics
and baseline(s);
- the solicitation of participation and submissions;
- the reviewing process of the papers describing the participants'
approach and results (according to the template to be made available by the
EVALITA 2026 chairs);
- the production of a paper describing the task (according to the template
to be made available by the EVALITA 2026 chairs).
*** Email your proposal in PDF format to evalitacampaign(a)gmail.com with
"EVALITA 2026 TASK Proposal" as the subject line by the submission
deadline: July 28th 2025. ***
Please feel free to contact the EVALITA 2026 chairs at
evalitacampaign(a)gmail.com in case of any questions or suggestions.
Deadlines of the task proposal:
- July 28th 2025 (previously July 21st 2025): submission of task proposals
- August 7th 2025 (previously July 31st 2025): notification of task proposal
acceptance
Timelines of EVALITA 2026:
- 22nd September 2025: development data available to participants
- 3 - 17th November 2025: evaluation windows
- 28th November 2025: assessments returned to participants
- 15th December 2025: final reports (from participants) due to task
organizers
- 22nd December 2025: final reports (from task organizers) due to EVALITA
chairs
- 19th January 2026: review deadline
- 2nd February 2026: camera-ready version deadline
- 26 - 27th February 2026: final workshop in Bari
EVALITA 2026 CHAIRS
Francesco Cutugno (Università di Napoli)
Alessio Miaschi (Istituto di Linguistica Computazionale “A. Zampolli” - CNR)
Alessio Palmero Aprosio (Università di Trento)
Giulia Rambelli (Università di Bologna)
Lucia Siciliani (Università di Bari)
Marco Antonio Stranisci (Università di Torino)
FURTHER INFORMATION
Website: https://www.evalita.it/campaigns/evalita-2026/call-for-tasks/
Mail: evalitacampaign(a)gmail.com
Marco,
UNITO <https://www.unito.it/persone/mstranis> and aequa-tech
<https://aequa-tech.com/>
The UKP Lab at the Department of Computer Science, Technical University Darmstadt, Germany, is looking for
*** two fully funded PhD Students and/or Postdocs ***
for an exciting project in machine-generated text detection. This is a unique opportunity to join the UKP Lab at the intersection of AI Safety, Natural Language Processing and Machine Learning. If you're excited about shaping the future of Large Language Models, AI Agents, and human-AI interaction, building novel prototypes, and publishing at top-tier NLP, ML and AI venues, we’d love to hear from you.
🔗 More information:
https://www.informatik.tu-darmstadt.de/ukp/ukp_home/jobs_ukp/2025_phd_ukp.e…
📩 Apply here:
https://careers.ukp.informatik.tu-darmstadt.de/ukprecruitment
📅 Application deadline: June 29th, 2025
--------------------------------------------------------------------
Prof. Dr. Iryna Gurevych
UKP Lab
Technical University Darmstadt, Germany
http://www.ukp.tu-darmstadt.de/
Third call for papers: Sixth Workshop on Resources for African
Indigenous Languages (RAIL)
Co-located with DHASA 2025
https://sadilar.org/rail-2025/
RAIL Workshop date: 10 November 2025
DHASA Conference dates: 10-14 November 2025
Venue: CSIR International Convention Centre.
The sixth RAIL workshop website: https://sadilar.org/rail-2025/
DHASA website: https://digitalhumanities.org.za/
The sixth Resources for African Indigenous Languages (RAIL) workshop
will be co-located with the Digital Humanities Association of Southern
Africa (DHASA) 2025 conference at the CSIR International Convention
Centre in Pretoria, South Africa, on 10 November 2025. The RAIL
workshop is an interdisciplinary platform for researchers working on
African indigenous language resources such as natural language
processing (NLP) tools, Human Language Technologies (HLT), data
collections, and annotations. This workshop aims to foster a
scientific community of practice that focuses on computational
linguistic tools and data that are designed for or applied to the
indigenous languages of Africa.
Many African languages are under-resourced while only a few are
considered to be somewhat better resourced. These languages often share
interesting properties such as writing systems, making them different
from most high-resourced languages. From a computational perspective,
these languages lack enough corpora to undertake high-level development
of NLP and HLT tools, which in turn impedes the development of African
languages in these areas. During previous workshops, it was noted that
the problems and solutions presented were not only applicable to
African languages but were also relevant to many other low-resource
languages across the world. Because these languages share similar
challenges, this workshop provides researchers with opportunities to
work collaboratively on issues of language resource development and
learn from each other.
The RAIL workshop has several aims. First, the workshop brings together
researchers who work on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Language resources in the age of large language
models” as its theme, but submissions on any topic related to
properties of African indigenous languages (including related non-
African languages) may be accepted. Suggested topics include (but are
not limited to) the following:
* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous
languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African
indigenous languages
* Revealing unknown or unpublished existing resources for African
indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African
indigenous language resources
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content plus additional pages of references. The
final camera-ready versions of accepted long papers are allowed one
additional page of content (up to 9 pages) so that reviewers’ feedback
can be incorporated. Papers should be formatted according to the DHASA
style sheet which is provided on the Journal of the Digital Humanities
Association of Southern Africa website
(https://upjournals.up.ac.za/index.php/dhasa/about). Reviewing is
double-blind, so make sure to anonymise your submission (e.g., do not
provide author names, affiliations, project names, etc.). Limit the
number of self-citations (anonymised citations should not be used). The
RAIL workshop follows the DHASA submission requirements.
Please submit papers in PDF format (the submission link will be
available soon). Accepted papers will be published in proceedings
linked to the DHASA conference.
Important dates:
Submission deadline: 14 July 2025
Date of notification: 16 September 2025
Camera ready copy deadline: 24 October 2025
Workshop: 10 November 2025
DHASA conference: 10 November 2025-14 November 2025
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
Third call for papers DHASA Conference 2025
https://dh2025.digitalhumanities.org.za
Theme: The role of humanities in digital humanities and artificial
intelligence
The Digital Humanities Association of Southern Africa (DHASA) is
pleased to announce its fifth conference, focusing on the theme The
role of humanities in digital humanities and artificial intelligence.
In a region where the field of Digital Humanities is still relatively
underdeveloped, this conference aims to address this gap and foster
growth and collaboration in the field. The conference offers an
opportunity for researchers interested in showcasing their work in the
broad field of Digital Humanities to come together. By doing so, the
conference provides a comprehensive overview of the current state-of-
the-art in Digital Humanities, particularly within the Southern Africa
region. As such, we welcome submissions related to Digital Humanities
research conducted by individuals from Southern Africa or research
focused on the geographical area of Southern Africa in the broad sense.
Furthermore, the conference serves as a platform for information
sharing and networking among researchers passionate about Digital
Humanities. By bringing together experts working on Digital Humanities
in Southern Africa or with a focus on Southern Africa, we aim to
promote collaboration and facilitate further research in this dynamic
field. In addition to the main conference, affiliated workshops and
tutorials will be organised, providing researchers with valuable
insights into novel technologies and tools. These supplementary events
are designed for researchers interested in specific aspects of Digital
Humanities or seeking practical information to enter or advance their
knowledge in the field.
The DHASA conference welcomes interdisciplinary contributions from
researchers in various domains of Digital Humanities, including, but
not limited to, language, literature, visual art, performance and
theatre studies, media studies, music, history, sociology, psychology,
language technologies, library studies, philosophy, methodologies,
software and computation, AI, and more. Our goal is to cultivate an
inclusive scientific community of practice within Digital Humanities.
Suggested topics include the following:
* The role of AI in digital humanities, the role of Digital Humanities
in shaping AI, and the broader role of the humanities in both AI and DH
projects;
* Digital archives and the preservation of marginalised voices;
* Intersectionality and the digital humanities: exploring the
intersections of race, gender, sexuality, culture, and class in digital
research and activism;
* Activism and social change through digital media: how digital
humanities tools and methodologies can be used to promote inclusion;
* Engaging marginalised communities in the creation and use of digital
tools, resources, and AI;
* Exploring the role of digital humanities in decolonising knowledge
and promoting indigenous perspectives;
* The ethics of data collection and analysis in digital humanities and
AI research;
* The role of digital humanities and AI in promoting inclusive and
equitable pedagogy;
* Digital humanities and inclusion in the context of African and global
perspectives and international collaborations;
* Critical approaches to digital humanities and inclusion: examining
the limitations and possibilities of digital tools and methodologies in
promoting inclusion;
* Collaborative digital humanities projects with non-profit
organisations, community groups, and cultural institutions;
* Development of digital and AI tools for supporting digital
humanities;
* Novel utilisation of digital and AI tools for performing digital
humanities research;
* The role of digital humanities in the classroom: reimagining literacy
and AI fluency;
* Digital humanities data and project management;
* The role of librarians in the digital humanities project;
* Any other digital humanities-related topic that serves the Southern
African community.
Submission Guidelines
The DHASA conference 2025 asks for three types of submissions:
* Long papers: Authors may submit long papers with a maximum of 8
content pages and unlimited pages for references and appendices. The
final versions of accepted long papers will be granted an additional
page (leading to a total of up to 9 content pages) to incorporate
reviewers' comments. Long papers accepted for the conference will be
presented in 30-minute time slots (which includes 10 minutes for
questions).
* Short papers: Authors may submit short papers with a maximum of 5
content pages and unlimited pages for references and appendices. The
final versions of accepted short papers will be allowed an extra page
(leading to a total of up to 6 content pages) to accommodate reviewers'
comments. Short papers accepted for the conference will be presented in
15-minute time slots (which includes 5 minutes for questions).
* Executive summaries: Authors can submit an executive summary for work
in progress, limited to 1 page. Executive summaries accepted for the
conference will be presented as posters during a dedicated poster
presentation slot.
All accepted long and short paper submissions that are presented at the
conference will be published in the JDHASA journal, see
https://upjournals.up.ac.za/index.php/dhasa. In addition, the executive
summaries for the poster presentations will be published in a book of
executive summaries before the conference.
We particularly encourage student submissions where the first author is
a student.
All submissions should adhere to the ACL style guide:
https://acl-org.github.io/ACLPUB/formatting.html
Submissions should be submitted in PDF format. Submissions that do not
adhere to the prescribed style guide will be rejected.
Follow this link to go to the submission platform:
https://dh2025.digitalhumanities.org.za/submission/
Authors are encouraged to upload their datasets to the SADiLaR
repository: https://repo.sadilar.org/. In case of difficulties
uploading the datasets, please reach out to Benito Trollip
(benito.trollip(a)nwu.ac.za).
Important dates
Submission deadline: 14 July 2025
Date of notification: 16 September 2025
Camera-ready copy deadline: 24 October 2025
Conference: 10 November 2025 - 14 November 2025
Conference venue: CSIR ICC, Pretoria, South Africa
Co-located events
Several co-located events are currently being prepared, including
workshops and tutorials. These will be updated on the conference
website.
Organising Committee
Aby Louw, Council for Scientific and Industrial Research
Andiswa Bukula, South African Centre for Digital Language Resources
Avi Moodley, Council for Scientific and Industrial Research
Franco Mak, Council for Scientific and Industrial Research
Franziska Pannach, Rijksuniversiteit Groningen
Ilana Wilken, Council for Scientific and Industrial Research
Johannes Sibeko, Nelson Mandela University
Juan Steyn, South African Centre for Digital Language Resources
Laurette Marais, Council for Scientific and Industrial Research
Marissa Griesel, South African Centre for Digital Language Resources
Menno van Zaanen, South African Centre for Digital Language Resources
Privolin Naidoo, Council for Scientific and Industrial Research
Sthembiso Mkhwanazi, Council for Scientific and Industrial Research
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
WINLP 2025 WORKSHOP
The Widening NLP (WiNLP) workshop aims to foster an inclusive
environment that highlights the contributions of researchers from
underrepresented groups in NLP. Anyone who self-identifies as being from
an underrepresented background--based on gender, ethnicity, nationality,
sexual orientation, disability, or otherwise--is encouraged to submit.
In 2025, WiNLP will continue placing emphasis on access, disability, and
diversity across scientific backgrounds, disciplines, training, and
underrepresented languages.
Our annual Widening Natural Language Processing Workshop (WiNLP) will be
held in conjunction with EMNLP 2025 in Suzhou, China. Since EMNLP is
anticipating a hybrid format for their conference, we also anticipate
our workshop will be hybrid, with both online and in-person attendees.
The one-day workshop will occur during EMNLP's workshop period with an
exact date to be announced soon.
The full-day event includes invited talks, oral presentations, and
poster sessions. The workshop provides an excellent opportunity for
junior members in the community to showcase their work and connect with
senior mentors for feedback and career advice. It also offers
recruitment opportunities with leading industrial labs. Most
importantly, the workshop will provide an inclusive and accepting
space, and work to lower structural barriers to joining and
collaborating with the NLP community at large.
Submission guidelines are available at:
https://www.winlp.org/call-for-submissions-2025/
PRE-SUBMISSION MENTORSHIP PROGRAM
WiNLP offers an optional pre-submission mentorship program to help
authors improve the quality of their writing and presentation before
final submission. The program focuses on enhancing the clarity and
structure of the paper, not critiquing the research content.
* Submission: Authors must submit a draft of their paper via the
designated Google Form (https://forms.gle/J33K2ea6VruN82ke9) by June 20,
2025. The draft should adhere to the same formatting and length
guidelines as final submissions.
* Mentor Assignment: Organizers will check the draft for compliance
with formatting requirements before assigning a mentor. The mentor will
not be involved in reviewing the final submission.
* Feedback: Mentors will provide feedback by July 18, 2025, offering
suggestions to improve writing and presentation. Authors are encouraged
to incorporate this feedback before the final submission deadline.
* Non-Anonymous: The mentorship process is not anonymized.
* Final Submission: Authors who participate in the mentorship program
should submit their final paper as a new submission via OpenReview by
August 1st, 2025 to be considered for the WiNLP workshop. Participation in
the mentorship program is not a prerequisite for submitting a paper to
WiNLP.
TRAVEL SUPPORT
WiNLP offers a limited number of travel grants to support one author per
accepted submission. Grants may cover expenses such as registration,
travel, lodging, or visa costs. Funded authors may choose to attend
virtually if preferred.
* Travel grant application deadline: September 26, 2025
* Notification: October 6, 2025
* Eligibility: One author per accepted submission is eligible. The
funded author must be identified in the travel grant application form.
Additional funding for virtual attendance by other authors may be
considered if surplus funds are available, but in-person attendance for
additional authors is not guaranteed. Travel expenses are handled via
reimbursement (primarily through USD check or PayPal). Authors unable to
front travel costs should contact the organizers early to discuss
alternatives.
Authors are encouraged to explore local funding options (e.g.,
institutional support) to maximize the reach of WiNLP's limited funds.
We recommend additional student authors keep an eye out for the EMNLP
call for student volunteers or call for D&I subsidies as opportunities
for further funding.
IMPORTANT DATES
All deadlines are 11:59 PM UTC-12:00 "Anywhere on Earth"
* Pre-submission mentoring deadline: June 20, 2025
* Pre-submission feedback returned: July 18, 2025
* Paper submission deadline: August 1, 2025
* Acceptance notifications: September 15, 2025
* Camera-ready deadline: October 1, 2025
* Travel grant applications due: September 26, 2025
* Travel grant notifications: October 6, 2025
CONTACT INFORMATION
Website: https://www.winlp.org/call-for-submissions-2025/
Twitter: @winlpworkshop [1]
Facebook: Widening NLP [2]
LinkedIn: Widening NLP [3]
E-mail: winlp-chairs(a)googlegroups.com
Links:
------
[1] https://twitter.com/WiNLPWorkshop
[2] https://www.facebook.com/WideningNLP
[3] https://www.linkedin.com/company/winlp
[CFP] - (R2LM) From Rules to Language Models: Comparative Performance Evaluation @ RANLP 2025 (Varna, Bulgaria) - 11-13 September 2025
https://r2lm2025.github.io/R2LM/
Workshop Description
Deep learning (DL) and large language models (LLMs) have driven major advances in natural language processing (NLP), enabling impressive performance across many tasks. However, they continue to face key challenges in handling complex linguistic phenomena such as multiword expressions, long-context reasoning, and robustness to adversarial inputs. In parallel, concerns remain about the scalability, interpretability, and domain adaptability of these models, particularly in applications requiring high precision, such as grammar checking, legal analysis, or medical NLP. These limitations have sparked renewed interest in rule-based and knowledge-based approaches, which often offer better explainability and remain competitive, especially in low-resource or high-stakes scenarios.
Our workshop aims to gather contributions that deal with the following topics:
• Role of rule-based and knowledge-based NLP methods in modern applications
• Comparative analysis of rule-based, machine-learning, deep-learning, and large-language-model approaches for different NLP tasks
• Emerging trends in NLP research beyond deep learning and large language models
• Limitations and performance bottlenecks in scalability and accuracy of deep learning models
Submission Details
• Long papers: up to 8 pages (excluding references)
• Short papers: up to 4 pages (excluding references)
• Format: ACL-style (LaTeX or MS Word)
• Submission portal and template info available on the RANLP 2025 website
Important Dates
Paper Submission Deadline: 6 July 2025
Notification of Acceptance: 31 July 2025
Workshop date: 11, 12 or 13 September 2025
Organising Committee:
Alicia Picazo-Izquierdo, University of Alicante, Spain
Ernesto Luis Estevanell-Valladares, University of Alicante, Spain
Rafael Muñoz Guillena, University of Alicante, Spain
Ruslan Mitkov, Lancaster University, UK
Raúl García Cerdá, University of Alicante, Spain