In this newsletter:
LDC data and commercial technology development
New publications:
Chinese Sentence Pattern Structure Treebank<https://catalog.ldc.upenn.edu/LDC2025T06>
IWSLT 2022-2023 Shared Task Training, Development and Test Set<https://catalog.ldc.upenn.edu/LDC2025S05>
KAIROS Schema Learning Complex Event Annotation<https://catalog.ldc.upenn.edu/LDC2025T07>
________________________________
LDC data and commercial technology development
For-profit organizations are reminded that an LDC membership is a prerequisite for obtaining a commercial license to almost all LDC databases. Non-member organizations, including non-member for-profit organizations, cannot use LDC data to develop or test products for commercialization, nor can they use LDC data in any commercial product or for any commercial purpose. LDC data users should consult corpus-specific license agreements for limitations on the use of certain corpora. Visit the Licensing<https://www.ldc.upenn.edu/data-management/using/licensing> page for further information.
________________________________
New publications:
Chinese Sentence Pattern Structure Treebank<https://catalog.ldc.upenn.edu/LDC2025T06> was developed at Beijing Normal University<https://english.bnu.edu.cn/> and Peking University<https://english.pku.edu.cn/>. It contains 5,016 sentences and 119,627 tokens syntactically annotated following the concept of sentence constituent analysis, which emphasizes sentence pattern structure. The source data consists of 27 chapters extracted from modern Mandarin and ancient Chinese works. There are three annotation layers: lexical sense and structural mode for dynamic words; syntactic structure for clauses; and inter-clause relations within complex sentences and sentence clusters. These structures can be visualized using the Jbw-viewer tool<https://github.com/bnucip/jbwviewer>, which is included in the release.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
*
IWSLT 2022 - 2023 Shared Task Training, Development and Test Set<https://catalog.ldc.upenn.edu/LDC2025S05> was developed by LDC and contains 210 hours of Tunisian<https://catalog.ldc.upenn.edu/LDC2025S05> Arabic conversational telephone speech, transcripts, English translations, speaker metadata, and documentation. This material constitutes the training, development, and test data used in the International Conference on Spoken Language Translation (IWSLT) Dialectal Speech Translation task (2022)<https://iwslt.org/2022/dialect> and the Dialectal and Low-resource track (2023)<https://iwslt.org/2023/low-resource>.
The telephone speech was collected by LDC in 2016-2017 from native speakers of Tunisian Arabic in Tunis. Speakers were recruited to make telephone calls to people in their social networks from a variety of noise conditions and handsets. Transcripts are orthographic following Buckwalter<https://catalog.ldc.upenn.edu/LDC2004L02> transliteration and cover 175 hours of the collected speech. IPA transcripts were added to a subset of the data. All transcribed segments were translated into English.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
*
KAIROS Schema Learning Complex Event Annotation<https://catalog.ldc.upenn.edu/LDC2025T07> was developed by LDC to support the DARPA KAIROS program. It contains English and Spanish text, audio, video, and image data labeled for 93 real-world complex events with event, relation, and argument annotations linking to document provenance. Source data was collected from the web; 3431 root web pages were collected and processed, yielding 1919 text data files, 24019 image files, 1472 video files, and 16 audio files.
The DARPA KAIROS (Knowledge-directed Artificial Intelligence Reasoning Over Schemas) program aimed to build technology capable of understanding and reasoning about complex real-world events in order to provide actionable insights to end users. KAIROS systems utilized formal event representations in the form of schema libraries that specified the steps, preconditions, and constraints for an open set of complex events; schemas were then used in combination with event extraction to characterize and make predictions about real-world events in a large, multilingual, multimedia corpus.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
Dear CLIN enthusiasts,
We are extending the submission deadline for CLIN abstracts by one week. The new, final deadline is June 20th. Below you can find the original call for abstracts with a modified date.
Website: https://clin35.ccl.kuleuven.be/
We invite submissions for CLIN35, the 35th edition of the Computational Linguistics in the Netherlands (CLIN) conference, which will take place in Leuven on September 12th, 2025.
Abstracts describing theoretical or applied research in any area of computational linguistics and natural language processing are welcome. We especially encourage submissions related to the Dutch language, but contributions on other languages and multilingual approaches are equally welcome. Abstracts must be written in English and should not exceed 500 words.
Submissions should include:
* Name and affiliation of each author
* Contact details
* Presentation title and short abstract (max. 500 words)
* Keywords
* Your presentation format preference (We will do our best to accommodate your preference but may need to make changes to provide a well-balanced program)
Abstracts must be submitted via the form on the website<https://clin35.ccl.kuleuven.be/call-for-abstracts> by Friday, 20th of June 2025. Notifications of acceptance will be sent out by Friday, 4th of July 2025. Accepted abstracts will be presented at the conference as oral or poster presentations. Authors with accepted abstracts will also have the opportunity to submit a full paper after the conference for publication in the CLIN Journal<https://www.clinjournal.org/clinj/>.
Please share this call with your interested colleagues and network! For any questions you can reach us at this email address (clin35(a)kuleuven.be<mailto:clin35@kuleuven.be>).
We look forward to your submissions and to welcoming you to CLIN35!
CLIN35 local organizers
________________________________
******************************************************
********* EVALITA 2026: Call for tasks *********
******* NEW DEADLINES and TIMELINE ******
******************************************************
EVALITA 2026 is an initiative of AILC (Associazione Italiana di Linguistica
Computazionale, https://www.ai-lc.it/).
As in the previous editions (https://www.evalita.it/), EVALITA 2026 will be
organized along a few selected tasks, which provide participants with
opportunities to discuss and explore both emerging and traditional areas of
Natural Language Processing and Speech. Participation is encouraged for teams
working in both academic institutions and industrial organizations.
TASK PROPOSAL SUBMISSION
Task proposals should be no longer than 4 pages and should include:
- task title and acronym;
- names and affiliation of the organizers (minimum 2 organizers);
- brief task description, including motivations and state of the art;
- explanation of the international relevance of the task;
- description and examples of the data, including information about their
  availability, development stage, and issues concerning privacy and data
  sensitivity. The examples are mandatory because they are intended to give
  potential participants an idea of what the task data will look like, how
  it’ll be formatted, etc.
- expected number of participants and attendees;
- names and contact information of the organizers.
We also accept the re-annotation/expansion of datasets from previous years
and previous challenges with new annotation levels, and texts from publicly
available corpora. However, test annotations must be new and unpublished,
as participants must not have access to the test data annotations until the
end of the EVALITA campaign. For new tasks, organizers must specify in the
proposal why it would attract a reasonable number of participants, and why
it is needed. For re-runs, organizers must describe the element of novelty
from previous challenges.
In submitting your proposal, please bear in mind that we strongly encourage:
- tasks that pose non-trivial challenges and stimulate the creation of
  innovative systems (i.e., that integrate linguistic insights or external
  knowledge sources), rather than being easily addressed by off-the-shelf LLM
  prompting techniques;
- tasks focused on multimodality, e.g., considering both textual and
  visual or any other modality;
- tasks characterized by different levels of complexity, e.g., with a
  straightforward main subtask and one or more sophisticated additional
  subtasks;
- tasks that provide competitive baselines (e.g., small-scale LLMs in
  zero-shot setups), which participants are expected to improve upon, in
  order to encourage the design of advanced solutions;
- application-oriented tasks, that is, tasks that showcase a clearly
  defined end-user application;
- multilingual tasks, i.e. with data both in Italian and in other
  languages;
- industrial tasks, i.e. tasks with real data provided by companies.
The organizers of the accepted tasks should take care of planning,
according to the scheduled deadlines (see below):
- the development and distribution of datasets needed for the contest,
  i.e. data for training and development, and data for testing; the scorer to
  be used to evaluate the submitted systems should be included in the release
  of development data;
- the development of task guidelines, where all the instructions for
  participation are made clear, together with a detailed description of the
  data and the metrics applied to evaluate the participants' results;
- the collection of participants' results;
- the evaluation of participants' results according to standard metrics
  and baseline(s);
- the solicitation of participation and submissions;
- the reviewing process of the papers describing the participants'
  approach and results (according to the template to be made available by the
  EVALITA 2026 chairs);
- the production of a paper describing the task (according to the template
  to be made available by the EVALITA 2026 chairs).
*** Email your proposal in PDF format to evalitacampaign(a)gmail.com with
"EVALITA 2026 TASK Proposal" as the subject line by the submission
deadline: July 28th 2025. ***
Please feel free to contact the EVALITA 2026 chairs at
evalitacampaign(a)gmail.com in case of any questions or suggestions.
Deadlines for task proposals:
- July 28th 2025 (extended from July 21st 2025): submission of task proposals
- August 7th 2025 (extended from July 31st 2025): notification of task
  proposal acceptance
Timeline of EVALITA 2026:
- 22nd September 2025: development data available to participants
- 3rd - 17th November 2025: evaluation windows
- 28th November 2025: assessments returned to participants
- 15th December 2025: final reports (from participants) due to task
  organizers
- 22nd December 2025: final reports (from task organizers) due to EVALITA
  chairs
- 19th January 2026: review deadline
- 2nd February 2026: camera-ready version deadline
- 26th - 27th February 2026: final workshop in Bari
EVALITA 2026 CHAIRS
Francesco Cutugno (Università di Napoli)
Alessio Miaschi (Istituto di Linguistica Computazionale “A. Zampolli” - CNR)
Alessio Palmero Aprosio (Università di Trento)
Giulia Rambelli (Università di Bologna)
Lucia Siciliani (Università di Bari)
Marco Antonio Stranisci (Università di Torino)
FURTHER INFORMATION
Website: https://www.evalita.it/campaigns/evalita-2026/call-for-tasks/
Mail: evalitacampaign(a)gmail.com
Marco,
UNITO <https://www.unito.it/persone/mstranis> and aequa-tech
<https://aequa-tech.com/>
The UKP Lab at the Department of Computer Science, Technical University Darmstadt, Germany, is looking for
*** two fully funded PhD Students and/or Postdocs ***
for an exciting project in machine-generated text detection. This is a unique opportunity to join the UKP Lab at the intersection of AI Safety, Natural Language Processing and Machine Learning. If you're excited about shaping the future of Large Language Models, AI Agents, and human-AI interaction, building novel prototypes, and publishing at top-tier NLP, ML and AI venues, we’d love to hear from you.
🔗 More information:
https://www.informatik.tu-darmstadt.de/ukp/ukp_home/jobs_ukp/2025_phd_ukp.e…
📩 Apply here:
https://careers.ukp.informatik.tu-darmstadt.de/ukprecruitment
📅 Application deadline: June 29th, 2025
--------------------------------------------------------------------
Prof. Dr. Iryna Gurevych
UKP Lab
Technical University Darmstadt, Germany
http://www.ukp.tu-darmstadt.de/
Third call for papers: Sixth Workshop on Resources for African
Indigenous Languages (RAIL)
Co-located with DHASA 2025
https://sadilar.org/rail-2025/
RAIL Workshop date: 10 November 2025
DHASA Conference dates: 10-14 November 2025
Venue: CSIR International Convention Centre.
The sixth RAIL workshop website: https://sadilar.org/rail-2025/
DHASA website: https://digitalhumanities.org.za/
The sixth Resources for African Indigenous Languages (RAIL) workshop
will be co-located with the Digital Humanities Association of Southern
Africa (DHASA) 2025 conference at the CSIR International Convention
Centre in Pretoria, South Africa, on 10 November 2025. The RAIL
workshop is an interdisciplinary platform for researchers working on
African indigenous language resources such as natural language
processing (NLP) tools, Human Language Technologies (HLT), data
collections, and annotations. This workshop aims to foster a
scientific community of practice that focuses on computational
linguistic tools and data that are designed for or applied to the
indigenous languages of Africa.
Many African languages are under-resourced while only a few are
considered to be somewhat better resourced. These languages often share
interesting properties such as writing systems, making them different
from most high-resourced languages. From a computational perspective,
these languages lack enough corpora to undertake high-level development
of NLP and HLT tools, which in turn impedes the development of African
languages in these areas. During previous workshops, it was noted that
the problems and solutions presented were not only applicable to
African languages but were also relevant to many other low-resource
languages across the world. Because these languages share similar
challenges, this workshop provides researchers with opportunities to
work collaboratively on issues of language resource development and
learn from each other.
The RAIL workshop has several aims. First, the workshop brings together
researchers who work on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Language resources in the age of large language
models” as its theme, but submissions on any topic related to
properties of African indigenous languages (including related
non-African languages) may be accepted. Suggested topics include (but are
not limited to) the following:
* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous
languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African
indigenous languages
* Revealing unknown or unpublished existing resources for African
indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African
indigenous language resources
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content plus additional pages of references. The
final camera-ready versions of accepted long papers are allowed one
additional page of content (up to 9 pages) so that reviewers’ feedback
can be incorporated. Papers should be formatted according to the DHASA
style sheet, which is provided on the Journal of the Digital Humanities
Association of Southern Africa website
(https://upjournals.up.ac.za/index.php/dhasa/about). Reviewing is
double-blind, so make sure to anonymise your submission (e.g., do not
provide author names, affiliations, project names, etc.). Limit the
number of self-citations (anonymised citations should not be used). The
RAIL workshop follows the DHASA submission requirements.
Please submit papers in PDF format (the submission link will be
available soon). Accepted papers will be published in proceedings
linked to the DHASA conference.
Important dates:
Submission deadline: 14 July 2025
Date of notification: 16 September 2025
Camera ready copy deadline: 24 October 2025
Workshop: 10 November 2025
DHASA conference: 10 November 2025-14 November 2025
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
________________________________
Third call for papers: DHASA Conference 2025
https://dh2025.digitalhumanities.org.za
Theme: The role of humanities in digital humanities and artificial
intelligence
The Digital Humanities Association of Southern Africa (DHASA) is
pleased to announce its fifth conference, focusing on the theme “The
role of humanities in digital humanities and artificial intelligence”.
In a region where the field of Digital Humanities is still relatively
underdeveloped, this conference aims to address this gap and foster
growth and collaboration in the field. The conference offers an
opportunity for researchers interested in showcasing their work in the
broad field of Digital Humanities to come together. By doing so, the
conference provides a comprehensive overview of the current state of the
art in Digital Humanities, particularly within the Southern Africa
region. As such, we welcome submissions related to Digital Humanities
research conducted by individuals from Southern Africa or research
focused on the geographical area of Southern Africa in the broad sense.
Furthermore, the conference serves as a platform for information
sharing and networking among researchers passionate about Digital
Humanities. By bringing together experts working on Digital Humanities
in Southern Africa or with a focus on Southern Africa, we aim to
promote collaboration and facilitate further research in this dynamic
field. In addition to the main conference, affiliated workshops and
tutorials will be organised, providing researchers with valuable
insights into novel technologies and tools. These supplementary events
are designed for researchers interested in specific aspects of Digital
Humanities or seeking practical information to enter or advance their
knowledge in the field.
The DHASA conference welcomes interdisciplinary contributions from
researchers in various domains of Digital Humanities, including, but
not limited to, language, literature, visual art, performance and
theatre studies, media studies, music, history, sociology, psychology,
language technologies, library studies, philosophy, methodologies,
software and computation, AI, and more. Our goal is to cultivate an
inclusive scientific community of practice within Digital Humanities.
Suggested topics include the following:
* The role of AI in digital humanities, the role of Digital Humanities
in shaping AI, and the broader role of the humanities in both AI and DH
projects;
* Digital archives and the preservation of marginalised voices;
* Intersectionality and the digital humanities: exploring the
intersections of race, gender, sexuality, culture, and class in digital
research and activism;
* Activism and social change through digital media: how digital
humanities tools and methodologies can be used to promote inclusion;
* Engaging marginalised communities in the creation and use of digital
tools, resources, and AI;
* Exploring the role of digital humanities in decolonising knowledge
and promoting indigenous perspectives;
* The ethics of data collection and analysis in digital humanities and
AI research;
* The role of digital humanities and AI in promoting inclusive and
equitable pedagogy;
* Digital humanities and inclusion in the context of African and global
perspectives and international collaborations;
* Critical approaches to digital humanities and inclusion: examining
the limitations and possibilities of digital tools and methodologies in
promoting inclusion;
* Collaborative digital humanities projects with non-profit
organisations, community groups, and cultural institutions;
* Development of digital and AI tools for supporting digital
humanities;
* Novel utilisation of digital and AI tools for performing digital
humanities research;
* The role of digital humanities in the classroom: reimagining literacy
and AI fluency;
* Digital humanities data and project management;
* The role of librarians in digital humanities projects; and
* Any other digital humanities-related topic that serves the Southern
African community.
Submission Guidelines
The DHASA conference 2025 asks for three types of submissions:
* Long papers: Authors may submit long papers with a maximum of 8
content pages and unlimited pages for references and appendices. The
final versions of accepted long papers will be granted an additional
page (leading to a total of up to 9 content pages) to incorporate
reviewers' comments. Long papers accepted for the conference will be
presented in 30-minute time slots (which includes 10 minutes for
questions).
* Short papers: Authors may submit short papers with a maximum of 5
content pages and unlimited pages for references and appendices. The
final versions of accepted short papers will be allowed an extra page
(leading to a total of up to 6 content pages) to accommodate reviewers'
comments. Short papers accepted for the conference will be presented in
15-minute time slots (which includes 5 minutes for questions).
* Executive summaries: Authors can submit an executive summary for work
in progress, limited to 1 page. Executive summaries accepted for the
conference will be presented as posters during a dedicated poster
presentation slot.
All accepted long and short paper submissions that are presented at the
conference will be published in the JDHASA journal, see
https://upjournals.up.ac.za/index.php/dhasa. In addition, the executive
summaries for the poster presentations will be published in a book of
executive summaries before the conference.
We particularly encourage student submissions where the first author is
a student.
All submissions should adhere to the ACL style guide:
https://acl-org.github.io/ACLPUB/formatting.html
Submissions should be in PDF format. Submissions that do not
adhere to the prescribed style guide will be rejected.
Follow this link to go to the submission platform:
https://dh2025.digitalhumanities.org.za/submission/
Authors are encouraged to upload their datasets to the SADiLaR
repository: https://repo.sadilar.org/. In case of difficulties
uploading the datasets, please reach out to Benito Trollip
(benito.trollip(a)nwu.ac.za).
Important dates
Submission deadline: 14 July 2025
Date of notification: 16 September 2025
Camera-ready copy deadline: 24 October 2025
Conference: 10 November 2025 - 14 November 2025
Conference venue: CSIR ICC, Pretoria, South Africa
Co-located events
Several co-located events are currently being prepared, including
workshops and tutorials. These will be updated on the conference
website.
Organising Committee
Aby Louw, Council for Scientific and Industrial Research
Andiswa Bukula, South African Centre for Digital Language Resources
Avi Moodley, Council for Scientific and Industrial Research
Franco Mak, Council for Scientific and Industrial Research
Franziska Pannach, Rijksuniversiteit Groningen
Ilana Wilken, Council for Scientific and Industrial Research
Johannes Sibeko, Nelson Mandela University
Juan Steyn, South African Centre for Digital Language Resources
Laurette Marais, Council for Scientific and Industrial Research
Marissa Griesel, South African Centre for Digital Language Resources
Menno van Zaanen, South African Centre for Digital Language Resources
Privolin Naidoo, Council for Scientific and Industrial Research
Sthembiso Mkhwanazi, Council for Scientific and Industrial Research
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
________________________________
WINLP 2025 WORKSHOP
The Widening NLP (WiNLP) workshop aims to foster an inclusive
environment that highlights the contributions of researchers from
underrepresented groups in NLP. Anyone who self-identifies as being from
an underrepresented background--based on gender, ethnicity, nationality,
sexual orientation, disability, or otherwise--is encouraged to submit.
In 2025, WiNLP will continue placing emphasis on access, disability, and
diversity across scientific backgrounds, disciplines, training, and
underrepresented languages.
Our annual Widening Natural Language Processing Workshop (WiNLP) will be
held in conjunction with EMNLP 2025 in Suzhou, China. Since EMNLP is
anticipating a hybrid format for their conference, we also anticipate
our workshop will be hybrid, with both online and in-person attendees.
The one-day workshop will occur during EMNLP's workshop period with an
exact date to be announced soon.
The full-day event includes invited talks, oral presentations, and
poster sessions. The workshop provides an excellent opportunity for
junior members in the community to showcase their work and connect with
senior mentors for feedback and career advice. It also offers
recruitment opportunities with leading industrial labs. Most
importantly, the workshop will provide an inclusive and accepting
space, and work to lower structural barriers to joining and
collaborating with the NLP community at large.
Submission guidelines are available at:
https://www.winlp.org/call-for-submissions-2025/
PRE-SUBMISSION MENTORSHIP PROGRAM
WiNLP offers an optional pre-submission mentorship program to help
authors improve the quality of their writing and presentation before
final submission. The program focuses on enhancing the clarity and
structure of the paper, not critiquing the research content.
* Submission: Authors must submit a draft of their paper via the
designated Google Form (https://forms.gle/J33K2ea6VruN82ke9) by June 20,
2025. The draft should adhere to the same formatting and length
guidelines as final submissions.
* Mentor Assignment: Organizers will check the draft for compliance
with formatting requirements before assigning a mentor. The mentor will
not be involved in reviewing the final submission.
* Feedback: Mentors will provide feedback by July 18, 2025, offering
suggestions to improve writing and presentation. Authors are encouraged
to incorporate this feedback before the final submission deadline.
* Non-Anonymous: The mentorship process is not anonymized.
* Final Submission: Authors who participate in the mentorship program
should submit their final paper as a new submission via OpenReview by
August 1, 2025 to be considered for the WiNLP workshop. Participation in
the mentorship program is not a prerequisite for submitting a paper to
WiNLP.
TRAVEL SUPPORT
WiNLP offers a limited number of travel grants to support one author per
accepted submission. Grants may cover expenses such as registration,
travel, lodging, or visa costs. Funded authors may choose to attend
virtually if preferred.
* Travel grant application deadline: September 26, 2025
* Notification: October 6, 2025
* Eligibility: One author per accepted submission is eligible. The
funded author must be identified in the travel grant application form.
Additional funding for virtual attendance by other authors may be
considered if surplus funds are available, but in-person attendance for
additional authors is not guaranteed. Travel expenses are handled via
reimbursement (primarily through USD check or PayPal). Authors unable to
front travel costs should contact the organizers early to discuss
alternatives.
Authors are encouraged to explore local funding options (e.g.,
institutional support) to maximize the reach of WiNLP's limited funds.
We recommend additional student authors keep an eye out for the EMNLP
call for student volunteers or call for D&I subsidies as opportunities
for further funding.
IMPORTANT DATES
All deadlines are 11:59 PM UTC-12:00 "Anywhere on Earth"
* Pre-submission mentoring deadline: June 20, 2025
* Pre-submission feedback returned: July 18, 2025
* Paper submission deadline: August 1, 2025
* Acceptance notifications: September 15, 2025
* Camera-ready deadline: October 1, 2025
* Travel grant applications due: September 26, 2025
* Travel grant notifications: October 6, 2025
CONTACT INFORMATION
Website: https://www.winlp.org/call-for-submissions-2025/
Twitter: @winlpworkshop [1]
Facebook: Widening NLP [2]
LinkedIn: Widening NLP [3]
E-mail: winlp-chairs(a)googlegroups.com
Links:
------
[1] https://twitter.com/WiNLPWorkshop
[2] https://www.facebook.com/WideningNLP
[3] https://www.linkedin.com/company/winlp
[CFP] - (R2LM) From Rules to Language Models: Comparative Performance Evaluation @ RANLP 2025 (Varna, Bulgaria) - 11-13 September 2025
https://r2lm2025.github.io/R2LM/
Workshop Description
Deep learning (DL) and large language models (LLMs) have driven major advances in natural language processing (NLP), enabling impressive performance across many tasks. However, they continue to face key challenges in handling complex linguistic phenomena such as multiword expressions, long-context reasoning, and robustness to adversarial inputs. In parallel, concerns remain about the scalability, interpretability, and domain adaptability of these models, particularly in applications requiring high precision, such as grammar checking, legal analysis, or medical NLP. These limitations have sparked renewed interest in rule-based and knowledge-based approaches, which often offer better explainability and remain competitive, especially in low-resource or high-stakes scenarios.
Our workshop aims to gather contributions that deal with the following topics:
• Role of rule-based and knowledge-based NLP methods in modern applications
• Comparative analysis of rule-based, machine-learning, deep-learning and large language models for different NLP tasks
• Emerging trends in NLP research beyond deep learning and Large Language Models
• Limitations and performance bottlenecks in scalability and accuracy of deep learning models
Submission Details
• Long papers: up to 8 pages (excluding references)
• Short papers: up to 4 pages (excluding references)
• Format: ACL-style (LaTeX or MS Word)
• Submission portal and template info available on the RANLP 2025 website
Important dates
Paper Submission Deadline: 6 July 2025
Notification of Acceptance: 31 July 2025
Workshop date: 11, 12 or 13 September 2025
Organising Committee:
Alicia Picazo-Izquierdo, University of Alicante, Spain
Ernesto Luis Estevanell-Valladares, University of Alicante, Spain
Rafael Muñoz Guillena, University of Alicante, Spain
Ruslan Mitkov, Lancaster University, UK
Raúl García Cerdá, University of Alicante, Spain
The Marseille Computer Science and Systems Laboratory (https://www.lis-lab.fr/) is seeking a candidate for a three-year thesis grant as part of the ANR Cre@lame project, in collaboration with the University of Turku in Finland.
The subject concerns the modeling of the literary writing and revision process carried out by authors. The starting point is an already written text, which is to be revised in the manner of an author. The task is framed as predicting edit operations: the model takes the original text as input and outputs the edit operations to apply. These can concern the lexicon, syntax or textual organization.
The thesis is structured around three directions.
The first is the nature of the object produced by the prediction process, which could take the form of a sequence of edit operations or a more complex form, such as a graph. The prediction model itself will depend largely on the nature of the predicted object.
The second concerns data. Revision data, which associates revision operations with a text, is not very common in general, and data on literary revision is rarer still. We will rely on all available data and, possibly, produce additional data using language models, in order to train the revision models.
The final direction concerns evaluation. Given an original text and a revised version, how can we judge the quality of the latter? And how can we assess whether the changes made to the original text are consistent with the objectives of the revision process?
We are looking for candidates with a strong background in machine learning, mainly in deep learning, as well as knowledge of natural language processing.
Application deadline: June 22
Contacts: Patrice Bellot (patrice.bellot(a)univ-amu.fr), Christophe Leblay (chrleb(a)utu.fi) and Alexis Nasr (alexis.nasr(a)univ-amu.fr)