Apologies for cross-posting.
----------------------------------------
*The International Conference on Spoken Language Translation*
*ACL – 22nd IWSLT 2025 – First Call for Participation*
*31 July-1 August 2025 - Vienna, Austria*
http://iwslt.org
The International Conference on Spoken Language Translation (IWSLT)
<https://iwslt.org/> is the premier annual conference for all aspects of
Spoken Language Translation. Every year, the conference organises and
sponsors open evaluation campaigns around key challenges in simultaneous
and consecutive translation, under real-time/low latency or offline
conditions and under low-resource or multilingual constraints. System
descriptions and results from participants’ systems and scientific papers
related to key algorithmic advances and best practices are presented.
IWSLT is the venue of SIGSLT <https://iwslt.org/sigslt/>, the Special
Interest Group on Spoken Language Translation <https://iwslt.org/sigslt/>
of ACL <https://www.aclweb.org/portal/>, ISCA <https://www.isca-speech.org/>
and ELRA <https://www.elra.info/>. With a track record of 21 years, IWSLT
benchmarks and proceedings serve as a reference for all researchers and
practitioners working on speech translation and related fields.
The 22nd edition of IWSLT will be run as a hybrid ELRA
<https://www.elra.info/>/ACL <https://www.aclweb.org/portal/> event,
co-located with ACL 2025 <https://2025.aclweb.org/> from 31 July to 1
August 2025.
*Important Dates*
*January 1, 2025*: Release of shared task training and dev data
*March 15, 2025*: Scientific paper submission deadline
*April 1-15, 2025*: Evaluation period
*April 21, 2025*: System description paper submission deadline
*May 15, 2025*: Notification of acceptance
*June 1, 2025*: Camera-ready deadline (all papers)
*July 31-Aug 1, 2025*: IWSLT conference
Evaluation
IWSLT 2025 features shared tasks <https://iwslt.org/2025/#shared-tasks>
that address the following focus areas:
- High-resource ST: Offline track, Simultaneous track, Subtitling track
- Low-resource ST: Low-resource and Indic (multilingual) tracks
- Instruction-following Speech Processing track: Technical domain ST, ASR,
Summarization, and QA
Training and development data for each shared task will be prepared and
released by the respective organisers (for further information on this
initiative, please refer to the IWSLT website <https://iwslt.org/2025/>).
Participants will receive instructions about how to submit their runs. In
addition, participants have the opportunity to present their work through
a system paper that will be published in the ACL Proceedings.
Conference
IWSLT also invites submissions of scientific papers to be published in the
ACL Proceedings and presented either in oral or poster format. The
conference selects high-quality, original contributions on theoretical and
practical issues of spoken language translation research, technologies and
applications. Submissions will be accepted directly through the IWSLT
submission site (to be announced on the website <https://iwslt.org/2025/>).
We will also accept commitments of submissions with reviews from the ACL
Rolling Review.
Additionally, to foster cross-pollination of ideas, the conference also
invites the presentation of papers on speech translation recently published
elsewhere. Please note that this is for non-archival presentation of papers
relevant to speech translation already published in other venues (e.g.,
Findings for the *ACL, speech, NLP or MT conferences). Submissions for this
category will be accepted through a dedicated form (to be announced on the
website <https://iwslt.org/2025/>). Papers will be checked for relevance to
IWSLT, and assigned either oral or poster presentation slots if selected.
Contact
Please email iwslt-evaluation-campaign(a)googlegroups.com if you have any
questions related to the shared tasks.
Thanks,
Marine, Marcello, Alex, Jan, Sebastian, Elizabeth, Atul
(IWSLT organisers)
Dear all,
On 5 and 6 December, the 22nd International Workshop on Treebanks and Linguistic Theories (TLT 2024) is being hosted at University of Hamburg. The workshop will be held in hybrid form and you are welcome to join us online!
Keynote talks:
December 5: 10:00-11:00 h
Anna Nedoluzhko (Charles University Prague)
Multilingual Coreference and Treebanking: Benefits of Interaction<https://www.korpuslab.uni-hamburg.de/en/tlt2024/program/_boxes/abstract-ann…>
December 6: 12:00-13:00 h
Marcel Bollmann (Linköping University)
Increasing language diversity in NLP: Insights from CreoleVal<https://www.korpuslab.uni-hamburg.de/en/tlt2024/program/_boxes/abstract-mar…>
On Friday, December 6, 14:30-16:30 h, there will be a discussion panel on "Treebanks and linguistic annotation in the area of LLMs".
Panelists: Marcel Bollmann (Linköping University), Daniel Dakota (Indiana University), Sandra Kübler (Indiana University), Anna Nedoluzhko (Charles University Prague), Juri Opitz (Universität Zürich)
Find the full workshop programme on our website: https://www.korpuslab.uni-hamburg.de/en/tlt2024/program.html
If you would like to participate, please register via this form and we will send you the Zoom link in advance of the workshop: https://www.korpuslab.uni-hamburg.de/en/tlt2024/registration.html
Please note that, for security reasons, University of Hamburg allows Zoom conferencing only via the Zoom app, so joining via browser will not work.
Do not hesitate to contact us via tlt2024.gw(a)uni-hamburg.de<mailto:tlt2024.gw@uni-hamburg.de> if you have any further questions.
The workshop is endorsed by ACL SIGPARSE<https://www.sigparse.org/> and we would like to thank SFB 1102<https://sfb1102.uni-saarland.de/> for their financial support.
Best,
TLT 2024 Program Chairs
---------------------------
Prof. Dr. Heike Zinsmeister (sie/ihr)
Linguistik des Deutschen / Korpuslinguistik
Universität Hamburg, Institut für Germanistik, Raum C7012
Von-Melle-Park 6, Postfach #15, D-20146 Hamburg
Tel.: 040 42838-7119
heike.zinsmeister(a)uni-hamburg.de
http://www.slm.uni-hamburg.de/germanistik/personen/zinsmeister.html
On behalf of Prof. Omer Bobrowski and Prof. Primoz Skraba.
An exciting PhD opportunity at the intersection of Machine Learning, Mathematics and model interpretability is offered at the Centre for Probability, Statistics and Data Science at Queen Mary University of London.
Project description
This PhD position is part of the “Erlangen Programme for AI,” a prestigious multi-university initiative focused on developing a rigorous mathematical foundation for Artificial Intelligence. The project emphasizes the integration of concepts from topology, geometry, and probability, with the overarching goal of enhancing the interpretability, robustness, and generalization of AI models.
Understanding Deep Neural Networks
DNNs represent a cutting-edge approach in machine learning and AI, but there remains a significant gap in understanding the intrinsic mechanisms behind their powerful performance. This research aims to combine topological and geometric tools with probabilistic analysis to unveil hidden structures in neural networks. By investigating how these structures arise during training, how information flows through layers, and what vulnerabilities exist, we expect to gain insights that will drive future advancements in model design, optimization, and resilience.
Understanding Large Language Models
LLMs have been shown to capture (encode) both the semantics and structure (grammar) of language within their learned parameters. However, the methods used to access this knowledge (decoding) remain basic, typically involving the representation of textual objects (e.g., words, sentences) as continuous vectors in Euclidean space. This project aims to leverage geometry and topology to explore the internal representations and latent spaces within LLM parameters in ways that go beyond simple vector analysis. We will develop advanced methods for decoding meaning and structure from LLMs, enabling richer and more diverse access to the linguistic knowledge they encode, and test them on a range of linguistic tasks (polysemy, cross-lingual transfer, among others). This approach holds the potential for breakthroughs in both AI theory and practical applications.
Deadline is Wednesday, January 29, 2025
Further details can be found here:
https://www.findaphd.com/phds/project/mathematical-foundations-for-ai/?p177…
LaTeCH-CLfL 2025:
The 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
to be held on May 3rd or 4th, 2025 in conjunction with NAACL 2025 <https://2025.naacl.org/> in Albuquerque, NM.
https://sighum.wordpress.com/latech-clfl-2025/
Second Call for Papers (with apologies for cross-posting)
Organisers: Diego Alves, Yuri Bizzoni, Stefania Degaetano-Ortlieb, Anna Kazantseva, Janis Pagel, Stan Szpakowicz
LaTeCH-CLfL 2025 is the ninth in a series of meetings for NLP researchers who work with data from the broadly understood arts, humanities and social sciences, and for specialists in those disciplines who apply NLP techniques in their work. The workshop continues a long tradition of annual meetings. The SIGHUM Workshops on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH) ran ten times in 2007-2016. The five Workshops on Computational Linguistics for Literature (CLfL) took place in 2012-2016. The first eight joint workshops (LaTeCH-CLfL) were held in 2017-2024.
Topics and content
In the Humanities, Social Sciences, Cultural Heritage and literary communities, there is increasing interest in, and demand for, NLP methods for semantic and structural annotation, intelligent linking, discovery, querying, cleaning and visualization of both primary and secondary data. This is even true of primarily non-textual collections, given that text is also the pervasive medium for metadata. Such applications pose new challenges for NLP research: noisy, non-standard textual or multi-modal input, historical languages, vague research concepts, multilingual parts within one document, and so on. Digital resources often have insufficient coverage; resource-intensive methods require (semi-)automatic processing tools and domain adaptation, or intense manual effort (e.g., annotation).
Literary texts bring their own problems, because navigating this form of creative expression requires more than the typical information-seeking tools. Examples of advanced tasks include the study of literature of a certain period, author or sub-genre, recognition of certain literary devices, or quantitative analysis of poetry.
NLP methods applied in this context not only need to achieve high performance, but are often applied as a first step in research or scholarly workflow. That is why it is crucial to interpret model results properly; model interpretability might be more important than raw performance scores, depending on the context.
More generally, there is a growing interest in computational models whose results can be used or interpreted in meaningful ways. It is, therefore, of mutual benefit that NLP experts, data specialists and Digital Humanities researchers who work in and across their domains get involved in the Computational Linguistics community and present their fundamental or applied research results. It has already been demonstrated how cross-disciplinary exchange not only supports work in the Humanities, Social Sciences, and Cultural Heritage communities but also promotes work in the Computational Linguistics community to build richer and more effective tools and models.
Topics of interest include, but are not limited to, the following:
• adaptation of NLP tools to Cultural Heritage, Social Sciences, Humanities and literature;
• automatic error detection and cleaning of textual data;
• complex annotation schemas, tools and interfaces;
• creation (fully- or semi-automatic) of semantic resources;
• creation and analysis of social networks of literary characters;
• discourse and narrative analysis/modelling, notably in literature;
• emotion analysis for the humanities and for literature;
• generation of literary narrative, dialogue or poetry;
• identification and analysis of literary genres;
• interpretability of large language model output for DH-related tasks (explainable AI);
• linking and retrieving information from different sources, media, and domains;
• low-resource and historical language processing;
• modelling dialogue literary style for generation;
• modelling of information and knowledge in the Humanities, Social Sciences, and Cultural Heritage;
• profiling and authorship attribution;
• search for scientific and/or scholarly literature;
• work with linguistic variation and non-standard or historical use of language.
Information for authors
We invite papers on original, unpublished work in the topic areas of the workshop. In addition to long papers, we will consider short papers and system descriptions (demos). We also welcome position papers.
• Long papers, presenting completed work, may consist of up to eight (8) pages of content plus additional pages of references (just two if possible -:). The final camera-ready versions of accepted long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account.
• A short paper / demo, presenting work in progress or the description of a system, may consist of up to four (4) pages of content plus additional pages of references (one if you can). Upon acceptance, short papers will be given five (5) content pages in the proceedings.
• A position paper — clearly marked as such — should not exceed eight (8) pages including references.
All submissions are to follow the *ACL paper styles (for LaTeX / Overleaf and MS Word) available at https://github.com/acl-org/acl-style-files. Papers should be submitted electronically, only in PDF, via the LaTeCH-CLfL 2025 submission website on the SoftConf pages (we will publish the link as soon as we have it).
Reviewing will be double-blind. Please do not include the authors’ names and affiliations, or any references to Web sites, project names, acknowledgements and so on — anything that immediately reveals the authors’ identity. Self-references should be kept to a reasonable minimum, and anonymous citations cannot be used.
Accepted papers will be published in the workshop proceedings available as usual in the ACL Anthology.
Important dates (tentative)
Workshop paper due: January 30, 2025
Notification of acceptance: March 1, 2025
Camera-ready papers due: March 10, 2025
Workshop date: May 3rd or 4th, 2025
More on the organizers
Diego Alves, Language Science and Technology, Saarland University
Yuri Bizzoni, Center for Humanities Computing / School for Communication and Culture, Århus University
Stefania Degaetano-Ortlieb, Language Science and Technology, Saarland University
Anna Kazantseva, National Research Council Canada
Janis Pagel, Department of Digital Humanities, University of Cologne
Stan Szpakowicz, School of Electrical Engineering and Computer Science, University of Ottawa
Contact
latech-clfl(a)googlegroups.com <mailto:latech-clfl@googlegroups.com>
Dear colleagues,
Following several requests for a further extension of the submission deadline, we are extending it to 5 January 2025.
You will find all the information about the international conference PAC 2025 (Phonology of Contemporary English) - Spoken English Varieties - Perception and Representations, which will be held in Aix-en-Provence from 18 to 20 June 2025, on the website: https://pac2025.sciencesconf.org/?lang=fr
The conference will be preceded on the morning of Wednesday 18 June by a workshop on perception experiments, and followed on the afternoon of Friday 20 June by a workshop on depositing and sharing data.
Important dates:
Conference: 18-20 June 2025, Laboratoire Parole et Langage, Aix-en-Provence, France.
Abstract submission deadline: 5 January 2025 (extended from 6 December 2024)
Notification: 29 January 2025
Please circulate as widely as possible and we apologise in advance for any duplication.
Sincerely,
The PAC 2025 Organising Committee.
*******************
Anne Przewozny-Desriaux
English linguistics - Phonology - Sociolinguistics
CLLE | CNRS UMR 5263
https://clle.univ-tlse2.fr/
Université Toulouse-Jean Jaurès, France
PAC programme & LVTI project
https://www.pacprogramme.net/
New Perspectives on English Word Stress
https://edinburghuniversitypress.com/book-new-perspectives-on-english-word-…
The Corpus Phonology of English: Multifocal Analyses of Variation
https://edinburghuniversitypress.com/book-the-corpus-phonology-of-english-h…
GermEval: Call for Shared Task Proposals
We cordially invite proposals for the GermEval Shared Task 2025 co-located with Konvens 2025 in Hildesheim, Germany (September 9-12), https://konvens-2025.hs-hannover.de/ .
Background
GermEval is a series of shared task evaluation campaigns that focuses on Natural Language Processing for the German language. It has been running since 2014 and is traditionally co-located with Konvens.
Previous shared tasks
Previous shared tasks have been devoted to:
* Named entity recognition (2014)
* Lexical Substitution (2015)
* Aspect-based Sentiment in Social Media Customer Feedback (2016)
* Offensive Language (2018, 2019)
* Hierarchical classification of blurbs (2019)
* Lemmatization of German web and social media texts (2019)
* Text Complexity (2022)
* Speaker Attribution (2023)
* Sexism Detection in German Online News Fora (2024)
See also https://germeval.github.io/tasks/ for details.
Time schedule
Proposal submission deadline: December 15, 2024
Notification: December 20, 2024
Extended submission period: January 4 - February 14, 2025
Extended notification: One week after submission
Workshop: September 10, 2025
The early deadline facilitates a quick start of the organization of a shared task. Later submission is possible until February 14. We will try to review the proposal in that case within one week.
While fixing the exact timeline for the shared task is up to the task organizers, we propose the following tentative schedule:
Trial data ready: March 8, 2025
Training data ready: April 12, 2025
Test data ready: May 17, 2025
Evaluation start: June 16, 2025
Evaluation end: June 27, 2025
Paper submission due: July 11, 2025
Camera ready due: August 15, 2025
Submission guidelines
For GermEval, proposals are invited for any shared task involving natural language processing in the context of the German language. For the detailed questionnaire to be submitted, please see: https://gscl.org/germeval
Proposals should be submitted by e-mail to info.konvens2025(a)gscl.org<mailto:info.konvens2025@gscl.org>, from the email of the contact person, with the subject "GermEval-2025 Shared Task Proposal". Proposals should make clear that a schedule similar to that suggested above can be implemented, especially for the preparation of resources (trial/training/development/test data sets). Proposals should be 2 to 3 pages long, in ACL - PDF format. We encourage proposers to include appendices with examples that help in understanding the data and type of evaluation.
Proposals will be reviewed by the KONVENS Program Committee. Proposers might be contacted during the reviewing period to provide further information. Potential proposers should feel free to indicate the intention to submit by emailing info.konvens2025(a)gscl.org<mailto:info.konvens2025@gscl.org>. We are also happy to answer your questions about the process, or to solicit early feedback for a proposal.
Proceedings
Peer-reviewed workshop papers can be published in the KONVENS 2025 Proceedings. A camera-ready version of all accepted papers should be available by August 22, 2025 at the latest.
Prof. Dr. Christian Wartena
Hochschule Hannover
Fakultät III - Medien, Information und Design
Abt. Information und Kommunikation
Lehrgebiet Sprach- und Wissensverarbeitung
Expo Plaza 12
30539 Hannover
e-mail: christian.wartena(a)hs-hannover.de<mailto:christian.wartena@hs-hannover.de>
More information on our workshop website:
https://www.ids-mannheim.de/home/lexiktagungen/llm-fails
Dear list members,
we would like to invite you to submit abstracts to our workshop "LLM
fails – Failed experiments with Generative AI and what we can learn from
them", taking place on April 8-9, 2025 at the Leibniz Institute for the
German Language, Mannheim, Germany.
If the extended short papers are positively reviewed, there is an
opportunity to publish them in a special issue of the Journal for
Language Technology and Computational Linguistics.
Further information (automatic English translation):
Failed experiments typically have no place in scientific discourse; they
are discarded and not published. We believe this leads to a loss of
potential knowledge. After all, a systematic reflection on the reasons
for failure allows for the questioning and/or improvement of methods
used. Furthermore, when previously failed experiments are repeated and
succeed, explicit progress can be determined. Thus, the discussion and
documentation of failures creates added value for the scientific
community from the perspective of methodological reflection. This is
even more relevant in a field like research into and with Generative
Artificial Intelligence (AI), which cannot look back on decades of
tradition and where best practices are still being negotiated.
This workshop focuses on linguistic and NLP experiments with Generative
AI that did not yield the desired results, such as but not limited to:
• Using Generative AI as a Named-Entity Recognizer
• Using Generative AI for automatic transcription of spoken language
data
• Using Generative AI for the creation of dictionary entries
• Using Generative AI for the detection of language change phenomena
The contribution should clarify how this failure can contribute to
knowledge gain regarding the work with Generative AI.
Unpublished proposals can be submitted anonymously as an abstract
(500-750 words) in either German or English to the following email
address by December 11, 2024:
llmfails(at)ids-mannheim.de
The organization team will decide on the acceptance of contributions by
December 16, 2024. If a contribution is accepted, a short paper (4-6
pages without references) in English will be requested by February 15,
2025. The short papers will undergo double-blind peer review and will be
published in a special issue of the Journal for Language Technology and
Computational Linguistics and archived at ACL Anthology.
Best,
Annelen Brunner, Christian Lang, Ngoc Duyen Tanja Tu
(Organising committee)
--
Dr. Ngoc Duyen Tanja Tu
Leibniz-Institut für Deutsche Sprache
Abteilung Grammatik
Tel: +49 621-1581-242
I have a student who is interested in tracing the development of the
English novel from its origins to the present day (or at least to the
start of the twentieth century), and I'm trying to gather information
about relevant corpora covering this text type and period.
We know about the European Literary Text Collection (ELTeC,
https://www.distant-reading.net/eltec/) which will be very useful for
the later end of the timescale. We also know it is possible to assemble
a corpus from Project Gutenberg, archive.org, Oxford Text Archive, etc.,
but would be interested in re-using any corpora that people might
already have made, which aim to be representative of particular periods
within this genre.
The student has some flexibility with her research question, so while
the original idea of 'English novels' was probably 'novels in English
from Great Britain and Ireland', other related areas such as US novels
might be interesting as well.
Any tips and suggestions gratefully received. If we get a number of
interesting direct emails, I'll be happy to summarize the results to the
list.
Best wishes,
Martin
--
Senior Researcher in Corpus Linguistics
Faculty of Linguistics, Philology and Phonetics, University of Oxford
National Co-ordinator, CLARIN-UK
martin.wynne(a)ling-phil.ox.ac.uk
https://orcid.org/0000-0002-4155-0530
(Apologies for cross-posting)
Dear colleagues,
We invite participants to a three-day winter school on large-scale NLP
research, with special emphasis on pretraining data quality and
multilingual evaluation of large language models (LLMs). The school
will provide lectures and space for discussion by the following invited
speakers:
- Jenia Jitsev (Jülich Supercomputing Centre)
- Marianna Nezhurina (LAION)
- Guilherme Penedo (Hugging Face)
- Gema Ramírez-Sánchez (Prompsit Language Engineering)
- Anna Rogers (IT University of Copenhagen)
- Ahmet Üstün (Cohere AI)
- other international experts to be confirmed.
The winter school is organized as a collaboration between the Horizon
Europe project High-Performance Language Technologies (HPLT,
https://hplt-project.org/) and the Nordic Language Processing Laboratory
(NLPL). The event will be held ‘in real life’ on February 3–5, 2025, in
Norway. For additional information, please see:
http://wiki.nlpl.eu/Community/training
There is no participant fee for the winter school, and HPLT will provide
free bus transfer between the Oslo airport and the conference hotel
(about two hours north of Oslo, with skiing facilities just outside the
door). Participants will need to cover their own travel to Oslo and
accommodation at the hotel (NOK 3855 for two nights in a single room,
including all meals and conference facilities).
We kindly invite expressions of interest in participation in the winter
school. Please register through the on-line form linked from the
above overview page. We will process requests for participation on a
first-come, first-served basis, with an eye toward regional balance.
Participation will be confirmed in three batches, one on December 6,
another one on December 13, and finally after the closing date for
registration, which is Friday, December 20, 2024.
Welcome to Skeikampen in February 2025!
Andrey Kutuzov & Stephan Oepen (for the organizing team)
--
Andrey
Language Technology Group (LTG)
University of Oslo