[CFP] - (R2LM) From Rules to Language Models: Comparative Performance Evaluation @ RANLP 2025 (Varna, Bulgaria) - 11-13 September 2025
https://r2lm2025.github.io/R2LM/
Workshop Description
Deep learning (DL) and large language models (LLMs) have driven major advances in natural language processing (NLP), enabling impressive performance across many tasks. However, they continue to face key challenges in handling complex linguistic phenomena such as multiword expressions, long-context reasoning, and robustness to adversarial inputs. In parallel, concerns remain about the scalability, interpretability, and domain adaptability of these models, particularly in applications requiring high precision, such as grammar checking, legal analysis, or medical NLP. These limitations have sparked renewed interest in rule-based and knowledge-based approaches, which often offer better explainability and remain competitive, especially in low-resource or high-stakes scenarios.
Our workshop aims to gather contributions that deal with the following topics:
• Role of rule-based and knowledge-based NLP methods in modern applications
• Comparative analysis of rule-based, machine-learning, deep-learning and large language models for different NLP tasks
• Emerging trends in NLP research beyond deep learning and large language models
• Limitations and performance bottlenecks in scalability and accuracy of deep learning models
Submission Details
• Long papers: up to 8 pages (excluding references)
• Short papers: up to 4 pages (excluding references)
• Format: ACL-style (LaTeX or MS Word)
• Submission portal and template info available on the RANLP 2025 website
Important dates
Paper Submission Deadline: 6 July 2025
Notification of Acceptance: 31 July 2025
Workshop date: 11, 12 or 13 September 2025
Organising Committee:
Alicia Picazo-Izquierdo, University of Alicante, Spain
Ernesto Luis Estevanell-Valladares, University of Alicante, Spain
Rafael Muñoz Guillena, University of Alicante, Spain
Ruslan Mitkov, Lancaster University, UK
Raúl García Cerdá, University of Alicante, Spain
The Marseille Computer Science and Systems Laboratory (https://www.lis-lab.fr/) is seeking a candidate for a three-year thesis grant as part of the ANR Cre@lame project, in collaboration with the University of Turku in Finland.
The subject concerns the modeling of the literary writing and revision process carried out by authors. The starting point is an already written text, which is to be revised in the manner of an author. The problem is seen as a problem of predicting edit operations, taking the original text as input and producing edit operations. These can concern the lexicon, syntax or textual organization.
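As a minimal illustration of what "edit operations" could look like as a prediction target, the sketch below derives word-level operations from an original sentence and a revised one using Python's standard `difflib`. The function name `edit_operations` and the example sentences are hypothetical and not taken from the project; the thesis may well settle on a different representation.

```python
from difflib import SequenceMatcher


def edit_operations(original: str, revised: str) -> list[tuple[str, list[str], list[str]]]:
    """Return word-level edit operations turning `original` into `revised`.

    Each operation is (tag, original_words, revised_words), where tag is
    'replace', 'delete' or 'insert'. Unchanged spans are skipped.
    """
    src = original.split()
    tgt = revised.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            ops.append((tag, src[i1:i2], tgt[j1:j2]))
    return ops


if __name__ == "__main__":
    # Hypothetical example pair, for illustration only.
    for op in edit_operations("the old house stood alone",
                              "the ancient house stood quite alone"):
        print(op)
```

A model trained for the task would take only the original text as input and predict such operations; here they are computed from a known revision, which is one plausible way to build training targets from paired data.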
The thesis is structured around three directions.
The first is the nature of the object produced by the prediction process, which could take the form of a sequence of edit operations or a more complex form, such as a graph. The prediction model itself will depend largely on the nature of the predicted object.
The second concerns data. Revision data, which associate revision operations with a text, are uncommon in general, and data on literary revision are rarer still. We will rely on all available data and, possibly, produce more using language models, in order to train the revision models.
The final direction concerns evaluation. Given an original text and a revised version, how can we judge the quality of the latter? And how can we assess that the changes made to the original text are consistent with the objectives of the revision process?
We are looking for candidates with a strong background in machine learning, mainly in deep learning, as well as knowledge in Natural Language Processing.
Application deadline: June 22
Contacts: Patrice Bellot (patrice.bellot(a)univ-amu.fr), Christophe Leblay (chrleb(a)utu.fi) and Alexis Nasr (alexis.nasr(a)univ-amu.fr)
Dear Colleagues,
The evaluation period for the brand new Model Compression track
<https://www2.statmt.org/wmt25/model-compression.html> at WMT 2025
<https://www2.statmt.org/wmt25/index.html> is approaching!
LATEST ANNOUNCEMENTS:
- Test data release brought forward to June 19, 2025! Participants now
  have two full weeks to prepare their submissions.
- Submission upload space available upon request (see the task’s page for
  details)
OVERVIEW
This shared task aims to evaluate the potential of model compression
techniques in reducing the size of large, general-purpose language models,
with the goal of achieving an optimal balance between practical
deployability and high translation quality in specific machine translation
(MT) scenarios. The broader objectives of the task include:
- fostering research into efficient, accessible, and sustainable
  deployment of LLMs for MT;
- establishing a common evaluation framework to monitor progress in model
  compression across a wide range of languages; and
- enabling meaningful comparisons with state-of-the-art MT systems through
  standardized evaluation protocols that assess not only translation quality
  but also efficiency.
Although the focus is on model compression, the task is closely aligned
with the General MT shared task
<https://www2.statmt.org/wmt25/translation-task.html>, sharing language
directions, test data, and protocols for automatic MT quality evaluation.
Additionally, the task follows the same timeline as the flagship WMT task.
We warmly invite participation from academic teams and industry players
interested in applying existing compression methods to MT or exploring
innovative, cutting-edge approaches.
THE TASK IN A NUTSHELL
- Goal: Reduce the size of a general-purpose LLM while maintaining a
  balance between model compactness and MT performance.
- Languages: The first round will focus on the same language pairs as the
  General MT track.
- Conditions:
  - Constrained: Participants work within a predefined model and language
    setting for directly comparable results.
  - Unconstrained: Participants are free to compress any model across
    language directions of their choice.
- Evaluation Criteria:
  - Translation quality: Automatically measured using the LLM-as-a-judge
    framework from the General MT task
  - Model size: Defined by the memory usage
  - Inference speed: Measured by total processing time over the test set
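To make the inference-speed criterion concrete, here is a rough sketch of timing a system over a whole test set. It is only an assumption about how such a measurement might be done locally; the organizers' actual protocol is described on the task page. The function `total_processing_time` and the `translate` callable are hypothetical names.

```python
import time
from typing import Callable, Iterable


def total_processing_time(
    translate: Callable[[str], str],
    test_set: Iterable[str],
) -> tuple[list[str], float]:
    """Translate every source segment and return (outputs, elapsed seconds).

    `translate` stands in for any MT system under test (hypothetical).
    """
    start = time.perf_counter()
    outputs = [translate(segment) for segment in test_set]
    elapsed = time.perf_counter() - start
    return outputs, elapsed


if __name__ == "__main__":
    # Dummy "system" that upper-cases its input, for illustration only.
    outputs, seconds = total_processing_time(str.upper, ["hello", "world"])
    print(outputs, f"{seconds:.6f}s")
```

Timing the whole loop rather than individual segments matches the "total processing time over the test set" wording and includes any per-segment overhead.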
IMPORTANT DATES (UPDATED)
- Test data released: 19th June 2025 (brought forward from 26th June 2025)
- Translation submission deadline: 3rd July 2025
- System description abstract paper: 10th July 2025
- System description submission: 14th August 2025
WEBSITE: https://www2.statmt.org/wmt25/model-compression.html
ORGANIZERS:
- Marco Gaido, Fondazione Bruno Kessler
- Matteo Negri, Fondazione Bruno Kessler
- Roman Grundkiewicz, Microsoft Translator
- TG Gowda, Microsoft Translator
CONTACTS:
- Marco Gaido - mgaido(a)fbk.eu
- Matteo Negri - negri(a)fbk.eu
🎓 We are happy to announce the next webinar in the CIRCE online
seminar series organized by the CIRCE <https://www.circe-project.eu/>
project in collaboration with DFCLAM University of Siena
<https://www.dfclam.unisi.it/en>, H2IOSC <https://www.h2iosc.cnr.it/>
project and CNR-ILC <https://www.ilc.cnr.it/en/>.
*Dr. Giuliana Regnoli*
/University of Salerno, Italy & University of Regensburg, Germany/
*/Unveiling linguistic bias: Approaches to accent perception and
discrimination/*
📅 *May 26, 2025*
🕓 *4:40 PM – 5:30 PM (CEST)*
*Venue*: Online
*Attendees*: Secondary school teachers, researchers, language instructors
*Summary*: Accent discrimination remains one of the most pervasive forms
of linguistic bias, influencing social perceptions, identity
construction, and attitudes towards language variation. This talk
examines how accents shape linguistic hierarchies and social
interactions, drawing on three research projects that employ distinct
methodologies. First, we will explore how folk linguistic methods, such
as map-drawing tasks, reveal nuanced spatial dimensions of language
attitudes, challenging homogenising conceptualisations of World
Englishes. This will be illustrated through a study on how a
first-generation Indian diasporic community in Germany perceives and
evaluates accent variation in Indian English. We will then turn to
traditional language attitude research methods, focusing on
questionnaire data to investigate overt stigmatisations and highlighting
the importance of scale validation in direct attitude measurement. This
discussion will be grounded in a pilot study on Italian university
students’ direct attitudes towards English in Italy and their
perceptions of Italian English. Finally, we will examine language
attitudes in primary education in Cameroon, emphasising the importance
of understanding children’s language perceptions within broader
ideological frameworks. This analysis draws on data from parental and
children’s questionnaires, as well as semi-structured interviews with
children. By shedding light on early linguistic gatekeeping and its role
in decolonising language education, this study also explores when and
how these beliefs become embedded in society. Taken together, these
projects demonstrate how different methodological approaches can be
employed to investigate attitudes towards accents and linguistic
variation, ultimately providing insights into how we can better
understand and tackle accent discrimination.
*Bio*: Dr. Giuliana Regnoli is assistant professor of English
linguistics at the University of Salerno and a postdoctoral research
fellow at the University of Regensburg. Her research interests include
variationist sociolinguistics, sociophonetics, language attitudes,
perceptual dialectology, and World Englishes. She is currently working
on children's English in Cameroon and Italian university students'
attitudes toward English(es) world-wide.
Upcoming webinars:
- Clara Molina (Monday, June 30, 2025)
- Sender Dovchin (Monday, July 7, 2025)
- Christian Ilbury (Monday, September 22, 2025)
The seminar is free of charge, but participants must register. To access
this and upcoming events, you should create an account on the H2IOSC
Training Environment
<https://h2iosc-training-platform.ilc4clarin.ilc.cnr.it/registration>.
Once logged in with your credentials, choose the course “Language and
Accent Discrimination - Online Seminar Series” and activate it with the
code PbK837GtE. Make sure to have the Teams platform installed.
The recordings of the previous CIRCE Seminars are also available on
the H2IOSC Training Environment. For any inquiry, write to
contact(a)circe-project.eu.
*BAREC Shared Task 2025
<https://urldefense.proofpoint.com/v2/url?u=https-3A__barec.camel-2Dlab.com_…>*
*Arabic Readability Assessment*
*The Third Arabic Natural Language Processing Conference (ArabicNLP 2025)
<https://urldefense.proofpoint.com/v2/url?u=https-3A__arabicnlp2025.sigarab.…>*
@
*EMNLP 2025
<https://urldefense.proofpoint.com/v2/url?u=https-3A__2025.emnlp.org_&d=DwMF…>*
We are excited to announce the BAREC Shared Task
<https://urldefense.proofpoint.com/v2/url?u=https-3A__barec.camel-2Dlab.com_…>
2025
on fine-grained readability classification across 19 levels using the
Balanced Arabic Readability Evaluation Corpus (BAREC), a dataset of over 1
million words. Participants will build models for both sentence- and
document-level classification.
*Task 1: Sentence-level Readability Assessment*
Given an Arabic sentence, predict its readability level on a scale from 1
(i.e., first grade) to 19 (i.e., university level), indicating the degree
of reading difficulty.
*Task 2: Document-level Readability Assessment*
Given a document consisting of multiple sentences, predict its readability
level on a scale from 1 to 19, where the hardest sentence in the document
(i.e., the one with the highest readability level) determines the overall
document readability level.
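The rule stated for Task 2 can be written down in a few lines. The sketch below only illustrates how a document label follows from sentence labels; the function name `document_level` is hypothetical, and a submitted system is not required to aggregate sentence predictions this way.

```python
def document_level(sentence_levels: list[int]) -> int:
    """Document readability level = level of its hardest sentence (1-19)."""
    if not sentence_levels:
        raise ValueError("a document needs at least one sentence")
    for level in sentence_levels:
        if not 1 <= level <= 19:
            raise ValueError(f"level out of range 1-19: {level}")
    # The hardest sentence carries the highest level, so take the maximum.
    return max(sentence_levels)


if __name__ == "__main__":
    # Hypothetical sentence levels for a three-sentence document.
    print(document_level([3, 12, 7]))
```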
For each task, there will be three tracks, allowing different data sources
for training: Strict, Constrained, and Open.
*Important Dates:*
All deadlines are 11:59pm UTC-12 (anywhere on Earth):
- *June 10, 2025:* Release of training, dev and open test data, and
evaluation scripts.
- *July 20, 2025:* Registration deadline and release of test data.
- *July 25, 2025:* End of evaluation cycle (test set submission closes).
- *July 30, 2025:* Final results released.
- *August 15, 2025:* System description paper submissions due.
- *August 25, 2025:* Notification of acceptance.
- *September 5, 2025:* Camera-ready versions due.
*Awards:*
- *Top-performing Systems:*
- We will recognize the top-performing system in each of the six
task-track combinations (2 tasks × 3 tracks), with a *$100* prize
per winning team.
- *Best System Description Papers:*
- We will award one or two prizes for Best System Description Papers.
These will recognize clarity, reproducibility, and insight, regardless of
leaderboard ranking:
- Best Paper: *$250*
- Runner-up or Honorable Mention: *$150*
*Organizers:*
- *Khalid N. Elmadani
<https://urldefense.proofpoint.com/v2/url?u=https-3A__khalid-2Delmadani.gith…>*:
New York University Abu Dhabi
- *Bashar Alhafni
<https://urldefense.proofpoint.com/v2/url?u=https-3A__www.basharalhafni.com_…>*:
New York University Abu Dhabi and Mohamed bin Zayed University of
Artificial Intelligence
- *Hanada Taha-Thomure
<https://urldefense.proofpoint.com/v2/url?u=https-3A__hanadataha.com_&d=DwMF…>*:
Zayed University
- *Nizar Habash
<https://urldefense.proofpoint.com/v2/url?u=https-3A__www.nizarhabash.com_&d…>*:
New York University Abu Dhabi
*Shared Task Website:* https://barec.camel-lab.com/sharedtask2025
*Contact:*
For any questions related to this task, check out the *FAQs
<https://urldefense.proofpoint.com/v2/url?u=https-3A__barec.camel-2Dlab.com_…>*.
Feel free to post your questions on our *Slack workspace
<https://urldefense.proofpoint.com/v2/url?u=https-3A__join.slack.com_t_barec…>*.
You are also welcome to contact the organizers directly at this email
address: barec25.organizers(a)camel-lab.com.
Dear Corpora members,
following our previous Calls for Papers, we would like to inform you of
a small update to the submission policies:
*The maximum length of supplementary material has been extended to 3 pages.*
For any additional updates regarding the conference, please visit the
website: https://clic2025.unica.it/
The full text of the updated CfP can be found below.
Best regards,
The CLiC-it chairs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CLiC-it 2025 – Eleventh Italian Conference on Computational Linguistics
24 – 26 September 2025, Cagliari, Italy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Over the years, CLiC-it has evolved into an important forum for the
Italian community of researchers in Computational Linguistics (CL) and
Natural Language Processing (NLP). CLiC-it aims to promote and
disseminate high-quality, original research covering different aspects
of automatic language processing, involving both written and spoken
language. Furthermore, it seeks to showcase cutting-edge theoretical
findings, experimental methodologies, technologies, and application
perspectives.
The spirit of the conference is inclusive. Recognizing the multifaceted
nature of language phenomena and the need for interdisciplinary
expertise, CLiC-it aims to bring together researchers from different
fields including Computational Linguistics and Natural Language
Processing, Linguistics, Cognitive Science, Machine Learning, Computer
Science, Knowledge Representation, Information Retrieval, and Digital
Humanities. CLiC-it welcomes contributions focusing on all languages,
with a particular emphasis on Italian.
CLiC-it 2025 will be held in Cagliari, from the 24th to the 26th of
September. CLiC-it is organised by the Italian Association of
Computational Linguistics (AILC — http://www.ai-lc.it/).
➢ CONFERENCE TOPICS
CLiC-it 2025 aims to have a broad technical program. Relevant topics for
the conference include, but are not limited to (in alphabetical order):
Computational Historical Linguistics
Computational Social Science and Cultural Analytics
Dialogue and Interactive Systems
Discourse and Pragmatics
Ethics and NLP
Generation
Handwritten Text Recognition
Information Extraction
Information Retrieval and Text Mining
Interpretability and Analysis of Models for NLP
Language Grounding to Vision, Robotics and Beyond
Large Language Models
Linguistic Diversity
Linguistic Theories, Cognitive Modeling, and Psycholinguistics
Machine Learning for NLP
Machine Translation
Multilingualism and Cross-Lingual NLP
NLP Applications
NLP for the Humanities
Phonology, Morphology, and Word Segmentation
Pragmatics and Creativity
Question Answering
Resources and Evaluation
Semantics: Lexical, Sentence-level Semantics, Textual Inference,
and Other Areas
Sentiment Analysis, Stylistic Analysis, and Argument Mining
Speech and Multimodality
Summarization
Syntax: Tagging, Chunking and Parsing
➢ RESEARCH COMMUNICATION
CLiC-it 2025 adopts a parallel submission policy for outstanding papers
accepted in 2024 and 2025 by major publication venues, namely the major
international CL conferences (workshops excluded) or international
journals. These contributions can be submitted to CLiC-it 2025 as short
research communications. Research communications will not be published
in the conference proceedings; they serve primarily to promote the
dissemination of high-quality research within the Italian CL community.
Submitted research communications must be in the scope of the CLiC-it
2025 conference.
The authors of papers that meet the above criteria are invited to submit
a written (maximum) one-page abstract of the original paper, including
the paper’s title and authors as well as a pointer to the original
conference or journal where the paper was published. If needed, research
communications will undergo a selection process overseen by the
conference chairs. Since these papers have already been reviewed, the
selection criteria will primarily consider their original publication
venue. Priority will be granted to papers that align most closely with
the conference program, ensuring a balanced representation across
various conference topics. The research communication papers will be
presented at the conference either orally or as a poster according to
the number of submissions received.
➢ PAPER SUBMISSION
Submitted papers must describe substantial, original, completed, and
unpublished work. Wherever appropriate, concrete evaluation and analysis
should be included.
CLiC-it 2025 allows for a multiple submission policy. In case of
acceptance of the paper in other venues, the authors must communicate
this information to the CLiC-it 2025 Chairs as soon as possible.
Papers may consist of at least six (6) and no more than eight (8) pages
of content and up to three (3) pages of references.
*UPDATE* Supplementary material is also allowed, but it should not
exceed *three (3) pages* in length. Authors are reminded that all
relevant content should be included in the main text of the paper and
that reviewers are not required to evaluate material presented in the
Appendix. In case additional space is needed (e.g. to include prompts,
examples, etc.), external links can be used. Please also note that
sections on limitations and ethical considerations are not mandatory; if
included, they will count toward the page limit.
Upon acceptance, final versions of the papers will be given one
additional page of content so that reviewers’ comments can be taken into
account.
Papers will be evaluated according to the following criteria:
soundness of approach
relevance to computational linguistics
novelty and clarity of relation with related work
quality of presentation
quality of evaluation (if applicable)
verifiability and ability to replicate (if applicable)
Papers can be either in English or Italian, with the abstract in
English. Accepted papers will be published on-line and will be presented
at the conference either orally or as a poster.
All accepted papers must be presented at the conference to appear in the
proceedings.
Reviewing will NOT be blind, so there is no need to remove author
information from manuscripts.
The required template for CLiC-it submissions must be compatible with
CEUR (https://ceur-ws.org/). You can download the conference-adapted
version at the following links:
LaTeX template
https://clic2025.unica.it/wp-content/uploads/2025/02/CLiC-it-2025-template.…
Word template
https://clic2025.unica.it/wp-content/uploads/2025/02/CLiC_it_2025_template.…
Should you encounter any issues with the compilation (as the CEUR
template has historically presented some challenges and is not
modifiable without risking exclusion from the proceedings), we provide a
read-only Overleaf template:
https://www.overleaf.com/read/hzyckyjzwhwb#06b27c
This template can be accessed and cloned to help resolve any technical
difficulties.
Papers and research communications must be submitted through the START
platform using the following link: https://softconf.com/p/clic-it2025
For research communications, the appropriate track should be selected.
➢ AWARDS
To acknowledge the contribution of young researchers to the field, the
title of “best paper” will be awarded to outstanding papers, provided
that a Master’s or PhD student is the first author and presents the work
at the conference. Recipients of this award will be invited to submit an
extended version of their papers to the Italian Journal of Computational
Linguistics (IJCoL).
To recognise excellence in student research as well as promote awareness
of our field, AILC is also conferring the “Emanuele Pianta” prize for
the best Master Thesis (Laurea Magistrale) in Computational Linguistics
submitted at an Italian University. The prize consists of 500 Euros plus
free membership to AILC for one year and free registration to the
upcoming CLiC-it.
➢ IMPORTANT DATES
16/06/2025 [EXTENDED from 09/06/2025] – Paper submission deadline:
regular papers and research communications
21/07/2025 – Notification to authors of reviewing/selection outcome
04/08/2025 – Camera ready version of accepted papers
24-26/09/2025 – CLiC-it 2025 Conference, Cagliari
➢ PEOPLE
Conference Chairs:
Cristina Bosco (University of Torino)
Elisabetta Jezek (University of Pavia)
Marco Polignano (University of Bari)
Manuela Sanguinetti (University of Cagliari)
Senior Program Committee:
Elisa Bassignana (IT University of Copenhagen)
Pierluigi Cassotti (University of Gothenburg)
Simone Conia (University of Rome “La Sapienza”)
Elisa Di Nuovo (Joint Research Centre European Commission – Ispra)
Claudiu Daniel Hromei (University of Rome “Tor Vergata”)
Antonio Origlia (University of Naples “Federico II”)
Ludovica Pannitto (University of Bologna)
Beatrice Savoldi (Fondazione Bruno Kessler)
Gabriele Sarti (University of Groningen)
Lucia Siciliani (University of Bari)
Irene Siragusa (University of Palermo)
Rossella Varvara (University of Turin – University of Pavia)
Alessandro Vietti (University of Bolzano)
Local Organizing Committee:
Maurizio Atzori (DMI, University of Cagliari)
Andrea Loddo (DMI, University of Cagliari)
Alessandro Pani (DMI, University of Cagliari)
Alessandra Perniciano (DMI, University of Cagliari)
Luca Zedda (DMI, University of Cagliari)
Web chairs:
Maurizio Atzori
Andrea Loddo
➢ FURTHER INFORMATION
Mail: clicit2025cagliari(a)gmail.com
Key Deadlines:
Abstract submission for posters: June 15
Registration: June 30
Registrations are still open for WALP 2025! Places are filling up quickly—register now to secure your on-site accommodation.
Dear Colleagues,
WALP 2025 will bring together leading academic and industrial experts from all around the world working at the forefront of atomic layer processing (ALP), aiming to stimulate discussions on recent advances and emerging directions in ALP, particularly those driven by industrial and societal needs. The workshop will cover experimental, computational, and AI/ML-driven research in the development of ALP techniques, materials and their applications in semiconductor CMOS, energy storage (batteries, supercapacitors), and energy conversion (fuel cells, photovoltaics) technologies. WALP 2025 will host industry talks on ALP method development and scale-up for commercial energy and nanoelectronics applications.
WALP 2025 features a distinguished lineup of speakers from academia and industry, including
* Mikko Ritala (University of Helsinki, Finland)
* Louis Piper (WMG, University of Warwick, UK)
* Fred Roozeboom (University of Twente, the Netherlands)
* Anjana Devi (TU Dresden, Germany)
* Jeff Elam (Argonne National Lab, US)
* Seán Barry (Carleton University, Canada)
* Riikka Puurunen (Aalto University, Finland)
* Ralf Tonner-Zech (University of Leipzig, Germany)
* Richard Potter (University of Liverpool, UK)
* Jennifer D'Souza (Leibniz Universität Hannover, Germany)
* Industry talks by Merck Electronics KGaA, BioLogic Ltd., Schrödinger, Inc., Oxford Instruments Plasma Technologies, Entalpic, Hitachi Energy, Forge Nano, and ATLANT 3D.
WALP 2025 offers an excellent opportunity to network, exchange ideas, and explore collaborative research prospects in an inspiring academic and industrial setting. Delegates have the option to submit abstracts for poster presentations, with select submissions considered for short contributed talks.
The registration is still open, and the registration fees for academics are £250 (excluding accommodation) and £350 (including on-site accommodation for the nights of July 21st and 22nd). Registration fees cover all meals and refreshments during the conference and poster session. There is a limited number of bursaries available for students.
Please visit the WALP 2025 webpage for detailed information, registration and abstract submission: https://warwick.ac.uk/fac/sci/chemistry/chemevents/walp2025/
We gratefully acknowledge the valuable financial support to WALP 2025 by Merck (EMD) Electronics, Schrödinger, BioLogic, and Royal Society of Chemistry.
We look forward to welcoming you to WALP 2025.
Best wishes,
Bora Karasulu
Also on behalf of Dr. Adrie Mackus and Prof. Erwin Kessels
WALP 2025 is jointly organised by Warwick Chemistry and the Eindhoven University of Technology (TU/e, Netherlands).
SEMANTiCS 2025 EU
21st International Conference on Semantic Systems
Vienna, Austria
September 3 - 5, 2025
Follow us on *Twitter/X* <https://x.com/SemanticsConf>, *LinkedIn*
<https://www.linkedin.com/groups/7496190/?highlightedUpdateUrn=urn%3Ali%3Agr…>,
and *Bluesky* <https://bsky.app/profile/semantics-conf.bsky.social>.
Call for Posters & Demos
The Posters & Demos Track provides a platform for researchers to showcase
their latest findings, ongoing projects, and cutting-edge work in progress.
These include submissions on innovative applications, latest results,
unpublished ideas, prototypes of semantic technologies and their use in
various domains as well as applications, use cases, or pieces of code that
may attract developers and potential research or business partners. This
also concerns new datasets made publicly available.
The Posters & Demos Track offers an informal setting that promotes
engagement and dialogue between presenters and attendees. These discussions
can provide valuable feedback for presenters' future work while allowing
participants to gain insight into emerging research trends and network with
other researchers.
*The submission deadlines for the Posters & Demos Track have been extended
as follows:*
- *Paper Submission Deadline: July 4, 2025*
- *Notification of Acceptance: July 21, 2025*
- *Camera-Ready of Paper Deadline: July 28, 2025*
*All deadlines are set for 11:59 pm, Anywhere On Earth time (UTC-12)*
*Submission via Easychair on*
*https://easychair.org/conferences/?conf=semantics2025*
<https://easychair.org/conferences/?conf=semantics2025>.
Proceedings of SEMANTiCS 2025 EU will be made available open access by
*CEUR-WS.org*.
Topics of Interest
We welcome contributions in the context of semantic-based research and
systems, which address – but are not limited to – the topics of the
Research Track <https://2025-eu.semantics.cc/page/cfp_rev_rep>.
Additionally, we encourage submissions of visionary ideas, position
statements, negative results, and unconventional ideas. Demos should
showcase innovative implementations and technologies from both academia
and industry. We also very much encourage submissions from industry, but
they should be focused on presenting a novel solution to a specific
problem and not be in the nature of an advertisement or commercial
product description.
Author Guidelines and Submission
Poster and demo submissions should consist of a paper that describes the
work, its contribution to the field or innovative aspects.
- Poster and demo submissions are at most 5 pages long, including
  references.
- No double-blind submissions required.
- Submissions must be either in PDF or HTML.
- Submissions must be formatted in the style of CEUR-ART
  (https://ceur-ws.org/HOWTOSUBMIT.html). An Overleaf page for LaTeX users
  is available.
- For demos, we ask authors to include links enabling the reviewers to
  test the application or review the component. The absence of a pointer
  affects the overall rating of the contribution.
- Submissions must be original and must not have been submitted for
  publication elsewhere.
- At least one author of each accepted paper must register for the
  conference and present the paper.
Posters and Demos Track Chairs
Ivan Heibi
Diego Collarana
Kind Regards,
On behalf of the organising committee.
=========================
Dr. Kossi Amouzouvi
ScaDS.AI Dresden/Leipzig, TU Dresden
We are excited to announce the 2nd edition of the Open Language Data Initiative shared task at WMT25, co-located with EMNLP 2025.
**TASK DESCRIPTION**
The primary goal of this shared task is to expand OLDI’s open datasets to more languages. We are soliciting contributions to the following:
- The MT evaluation dataset FLORES+
- The MT Seed dataset
- Other high-quality, massively-parallel and open-source datasets
Contributions may consist of either the addition of entirely new languages, varieties or dialects to the above datasets, or substantial improvements to existing datasets. To describe and publicise their contributions, task participants will be asked to submit a 4-6 page paper to be presented at the WMT 2025 conference.
**IMPORTANT DATES**
All dates follow WMT/EMNLP.
- Paper and data submission deadline: 14 August
- Notification of acceptance: 13 September
**MORE INFORMATION**
- Shared task website: https://www2.statmt.org/wmt25/open-data.html
- OLDI website: https://oldi.org/
Dear colleagues,
We are pleased to announce the first call for papers of the
*1st Workshop on Multilingual Data Quality Signals at COLM 2025*
Important information:
🗓️ CfP Deadline: June 23, Workshop: October 10
📍 Montréal, Canada
🌐 https://wmdqs.org
Scope
Recent research has shown that large language models (LLMs) not only need large quantities of data, but also need data of sufficient quality. Ensuring data quality is even more important in a multilingual setting, where the amount of acceptable training data in many languages is limited. Indeed, for many languages even the fundamental step of language identification remains a challenge, leading to unreliable language labels and thus noisy datasets for underserved languages.
In response to these challenges, we will be holding the first Workshop on Multilingual Data Quality Signals (WMDQS) in tandem with COLM. We invite the submission of long and short research papers related to data quality in multilingual data.
Even though most previous work on data quality has been targeted at LLM development, we believe that research in this area can also benefit other research communities in areas such as web search, web archiving, corpus linguistics, digital humanities, political sciences and beyond. We therefore encourage submissions from a wide range of disciplines.
WMDQS will also include a shared task on language identification for web text. We invite participants to submit novel systems which address current problems with language identification for web text. We will provide a training set of annotated documents sourced from Common Crawl to aid development.
Topics
We welcome submissions of (1) original research papers, (2) review/opinion papers, (3) online systems on the topics listed below, and (4) extended abstracts. We especially welcome work-in-progress projects and all novel ideas covering research in multilinguality, underserved/low-resource languages, under-represented linguistic communities and all types of work covering data quality signals. Suggested areas include:
- Data pipelines for data annotation and data filtering
- Undesirable content detection in a multilingual setting
- Multilingual or language independent content ranking
- Human annotation platforms and systems
- Multilingual tokenization mechanisms
- Small language models and embeddings
- Linguistic studies in underserved languages
- Corpus creation and curation methods, especially for underserved languages
- Machine translation
- Digital humanities
- Historical and constructed languages
Shared task
The lack of training data, especially high-quality data, is the root cause of poor language model performance for many languages. One obstacle to improving the quantity and quality of available text data is language identification (LangID or LID). LangID remains far from solved for many languages. Several of the commonly used LangID models were introduced in 2017 (e.g. fastText and CLD3). The aim of this shared task is to encourage innovation in open-source language identification and improve accuracy on a broad range of languages.
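For readers new to the problem, the following is a toy sketch of language identification by character-trigram profile matching. It is not the shared task baseline and not how fastText or CLD3 work internally; every name here (`char_ngrams`, `train`, `identify`, `TOY_SAMPLES`) is hypothetical, and the two-language training sample is far too small for real use.

```python
import math
from collections import Counter

# Tiny illustrative training sample (an assumption, not shared-task data).
TOY_SAMPLES = {
    "eng": ["the quick brown fox jumps over the lazy dog",
            "this is a short sentence in english"],
    "ita": ["la volpe veloce salta sopra il cane pigro",
            "questa è una breve frase in italiano"],
}


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams of a lower-cased, padded string."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def train(samples: dict[str, list[str]]) -> dict[str, Counter]:
    """Build one n-gram profile per language label from example texts."""
    profiles = {}
    for label, texts in samples.items():
        profile = Counter()
        for text in texts:
            profile.update(char_ngrams(text))
        profiles[label] = profile
    return profiles


def identify(text: str, profiles: dict[str, Counter]) -> str:
    """Return the label whose profile is most similar to the text."""
    grams = char_ngrams(text)
    return max(profiles, key=lambda label: cosine(grams, profiles[label]))


if __name__ == "__main__":
    profiles = train(TOY_SAMPLES)
    print(identify("the dog is in the house", profiles))
    print(identify("il cane è nella casa", profiles))
```

Approaches like this degrade quickly on short, noisy or code-switched web text and on closely related languages, which is part of why the task asks for new systems.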
All accepted authors will be invited to contribute a larger paper, which will be submitted to a high-impact NLP venue.
Important dates for the Workshop:
Workshop paper submission deadline: June 23, 2025
Workshop paper acceptance notification: July 24, 2025
Workshop: October 10, 2025
Important dates for the Shared Task:
1st Deadline to contribute annotations: July 7, 2025
1st Annotations released (train split): July 14, 2025
Abstract Deadline: July 21, 2025
Decision Notification: July 24, 2025
Camera Ready Deadline: September 21, 2025
(All deadlines are 23:59 AoE.)
Organizers:
For any questions, please drop a mail to wmdqs-pcs(a)googlegroups.com
Program Chairs:
Pedro Ortiz Suarez (Common Crawl Foundation)
Sarah Luger (MLCommons)
Laurie Burchell (Common Crawl Foundation)
Kenton Murray (Johns Hopkins University)
Catherine Arnett (EleutherAI)
Organizing Committee:
Thom Vaughan (Common Crawl Foundation)
Sara Hincapié (Factored)
Rafael Mosquera (MLCommons)