This is the first call for participation in the 18th MT Marathon,
which will take place in Helsinki on August 25-29, 2025.
The eighteenth edition of the MT Marathon will be organized by the Language Technology Research group at the University of Helsinki, Finland, with sponsorship from EAMT.
Each Machine Translation Marathon is a week-long gathering of machine
translation researchers, developers, students and users featuring:
- MT Lectures and Labs covering the basics and tutorials.
- Keynote Talks from experienced researchers and practitioners.
- Presentations of research and open source tools related to MT.
- Hacking Projects to advance tools or research in one week or start new collaborations.
Details can be found on the event page: https://blogs.helsinki.fi/language-technology/mt-marathon-2025/
** Registration **
The registration is free of charge for EAMT members. To register, use the following link:
https://forms.gle/uvrZuWpeSbcmJozK7. The registration form will remain open until the start of the event; however, please register as soon as possible if you plan to attend to help us with the planning.
** Programme **
The event will include a poster session, labs, and lessons from experts in the field, including:
- Ayodele Awokoya, McPherson University, University of Ibadan, Masakhane,
- Wilker Aziz, University of Amsterdam,
- Marta Costa-Jussa, Meta AI,
- Barry Haddow, University of Edinburgh,
- Amit Moryossef, University of Zürich,
- Sara Papi, FBK Trento,
- Jörg Tiedemann, University of Helsinki,
- Marco Turchi, Zoom.
The programme is still under construction. For up-to-date information about invited speakers and the topics that will be covered by talks and labs, have a look at the event page here: https://blogs.helsinki.fi/language-technology/mt-marathon-2025/
The event will also include a poster session where participants will be invited to present their own work in machine translation.
** Call for project proposals **
As always, project topics will be finalized on the first day of the Marathon, but in the past it has proved useful to announce and refine project proposals earlier. If you have an idea of what you'd like to implement in a small team of fellow participants, or if you just want to peek at what is going to be proposed, have a look at or edit the live document linked here: https://docs.google.com/document/d/1A4Iy_iOVvYHKAwnSV2ZGIPru7t-jeMauCQd6i9G… .
Second call for papers: *BriGap-2, Bridges and Gaps between Formal and
Computational Linguistics* (an IWCS 2025 workshop)
(with our apologies for cross-posting)
Venue: IWCS 2025 (https://iwcs2025.github.io/), Düsseldorf, Germany
Date: *September 24th, 2025* (main conference: 22nd-23rd)
Workshop website: https://brigap-workshop.github.io/
BriGap-2 is a venue for linguists and NLP scientists to meet: what fruitful
interactions can we have? How do we build upon each other’s work?
* Description *
In recent years, the natural language processing (NLP) community has
shifted its focus towards engineering questions. This state of affairs is
in no small part due to the recent technical advances that have transformed
NLP as a field. In the current large language model (LLM) era, much of what
was deemed near impossible to achieve a few years prior is now taken for
granted and it stands to reason that mapping how far ahead new
computational models have advanced the field has become a central topic for
the NLP community. Hence, the current ongoing discourse in NLP focuses more
on what can be achieved through language rather than studying language for
its own sake. It thus seems that computational and formal linguistics are
now separate domains, and that the former is no longer rooted in the latter.
To what extent are these traditions truly divorced, and what fruitful
bridges can be (re)built? To answer these questions, the second iteration
of the workshop on Bridges and Gaps between Formal and Computational
Linguistics (BriGap-2) intends to provide a space for formal linguists,
computational linguists, and NLP scientists to exchange their perspectives
on how their different domains of research can build upon one another.
* Workshop topics *
- investigation of the linguistic properties of machine learning models,
- linguistic representations, vector space semantics, and their relations
with theoretical concepts such as compositionality,
- use of information-theoretical and computational methods for linguistic
inquiry,
- formal distributional semantics and neural-symbolic integration for NLP,
- formal grammars, symbolic structures and their applications for
computational linguistics and NLP,
- trends in the history of computational linguistics and NLP,
- …
* Invited speakers *
- Anna ROGERS, IT University of Copenhagen
- Kees VAN DEEMTER, Universiteit Utrecht
* Submission details *
The workshop accepts both archival (original and unpublished research) and
non-archival (work-in-progress, dissemination of research published or
accepted elsewhere, etc.) submissions in either short (up to 4 pages) or
long (up to 8 pages) format. Camera-ready versions of papers will be given
one additional page of content so that reviewers’ comments can be taken
into account.
Each submission should mention whether it targets archival or non-archival
status. Archival papers accepted at BriGap-2 will be indexed in the ACL
Anthology.
Please use the ACL style templates available here:
https://github.com/acl-org/acl-style-files
The submissions need to be done in PDF format via OpenReview, using the
following link: https://openreview.net/group?id=IWCS/2025/Workshop/BriGap-2
* Important dates *
- Submission deadline: *Friday, June 6th 2025*
- Notification of acceptance: Friday, August 1st 2025
- Workshop: *September 24th, 2025* (main conference: 22nd-23rd)
* Contact *
For questions, please send an email to brigapworkshop(a)gmail.com or contact
one of the workshop chairs:
- Timothée Bernard, Université Paris Cité, timothee.bernard(a)u-paris.fr
- Timothee Mickus, University of Helsinki, timothee.mickus(a)helsinki.fi
- Grégoire Winterstein, Université du Québec à Montréal,
winterstein.gregoire(a)uqam.ca
----------------------------
HealTAC 2025
June 16-18th, 2025, Glasgow (UK)
https://healtac2025.github.io/
----------------------------
1) Call for contributions – deadline 28 March
2) Keynotes, panels and workshop
3) Registration fees
4) Key dates
----------------------------
----------------------------------------
Call for contributions - reminder
----------------------------------------
The 8th Healthcare Text Analytics Conference (HealTAC 2025) invites contributions that address any aspect of healthcare text analytics. We invite submissions in the form of extended abstracts that describe either methodological or application work that has not been previously presented in a conference. Submissions (up to 2 pages) should be prepared based on a template that is available at the conference web site.
We also invite PhD and fellowship project submissions that describe ongoing PhD research (any stage) or a planned fellowship application. The conference will provide an opportunity to receive constructive feedback from a panel of experts.
Deadline for all submissions is March 28th, 2025.
As in previous years, there will be a post-conference call to submit a journal length paper for further peer review and publication in Frontiers in Digital Health.
----------------------------
Programme
----------------------------
We are delighted to announce keynotes by Dr Jason Fries from Stanford University and Dr Alison O'Neil from Canon Medical Research, and panels on "Opportunities and challenges in LLMs for health research: social inequalities, bias detection, and mitigation" and "Challenges in AI deployment within NHS" (industry forum).
A pre-conference workshop on June 16th will focus on "NLP in mental healthcare and research" (https://healtac2025.github.io/workshop/).
----------------------------
Registration fees
----------------------------
Due to generous support from Health Data Research UK, CogStack, Frontiers, University of Glasgow, Research Data Scotland and Healtex, we will keep the registration fee low as before: an early registration fee for students is expected to be £100 and for others £200, and will include the full 3-day programme, lunches and the conference dinner.
----------------------------
Key dates
----------------------------
Deadline for all contributions: March 28th 2025
Notification of acceptance: April 18th 2025
Early-bird registration: by May 16th 2025
Pre-conference workshop: June 16th 2025
Conference: June 17-18th 2025
Follow the conference announcements on social media at #HEALTAC2025
We are looking forward to welcoming you to HealTAC 2025.
DepLing 2025, Ljubljana, August 26-29
deadline April 15
We are pleased to announce the 8th International Workshop on Dependency Grammar (DepLing 2025), which will bring together researchers interested in dependency-based approaches in linguistics and natural language processing. Dependencies, directed labeled graph structures representing hierarchical relations between morphemes, words or semantic units, have now become the standard representation of syntactic resources and NLP technologies. DepLing has become the central event for people discussing the linguistic significance of these structures, their theoretical and formal foundations, their processing, and their use in NLP tools.
The workshop is part of SyntaxFest 2025 and will be hosted by University of Ljubljana in Slovenia on August 26-29, 2025.
Link to DepLing 2025: https://depling.org/depling2025/
Link to SyntaxFest 2025: https://syntaxfest.github.io/
-----------------------------
SELECTED TOPICS OF INTEREST
-----------------------------
Topics include but are not limited to:
The use of dependency structures in theoretical linguistics; a.o.:
The use of syntactic trees to model syntactic relations;
The use of semantic, valency-based or predicate-argument graph structures;
The use of dependency-like structures to model semantic and pragmatic phenomena related to information structure;
The use of dependency-like structures beyond the sentence (e.g., to model discourse phenomena);
The elaboration of formal lexicons for dependency-based syntax and semantics, including descriptions of collocations and paradigmatic relations;
The use of dependency in the field of linguistic universals, and typology.
Historical and epistemological foundations of dependency grammar; a.o.:
The definition of the very notion of dependency;
The development and the use of dependency-based diagrams;
Dependency grammar and its relation to other formalisms;
The use of dependency-like concepts in the history of grammar and linguistics.
The use of dependency structures in corpus linguistics; a.o.:
Corpus annotation and development of dependency-based treebanks and other linguistic resources of written and spoken texts;
Recent advances in dependency-based parsing, and text generation;
Cross-lingual dependency parser evaluation, with particular emphasis on intrinsic evaluation metrics.
The relation between dependency-based grammar and other fields of science, such as, e.g., the psycholinguistic relevance of dependency grammar.
-----------------------------
INVITED SPEAKER
-----------------------------
Daniel Zeman, Inst. of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague
-----------------------------
IMPORTANT DATES
-----------------------------
* Paper submission deadline: 15 April 2025
* Notification of acceptance: 2 June 2025
* Camera-ready papers: 16 June 2025
* Early bird registration: June 2025
* Conference dates: 26 to 29 August 2025
-----------------------------
DepLing 2025 WORKSHOP CHAIRS
-----------------------------
* Sylvain Kahane, Paris Nanterre University
* Eva Hajičová, Charles University, Prague
We need your help to preserve indigenous languages!
Due to the overwhelming success of previous workshops like LoResMT,
AmericasNLP, and IWSLT, we have decided to continue to push the needle
for Quechua to Spanish translations another year. We ask that you kindly
participate in the 2025 edition of the QUE-SPA speech translation shared
task being held at ACL 2025. This low-resource task will help increase
language preservation for low-resource languages. We invite advanced
research and approaches of all types so bring your rule-based,
statistical, neural, and more!
IMPORTANT LINKS
Dialectal and Low-resource webpage:
https://iwslt.org/2025/low-resource
Data webpage:
https://github.com/Llamacha/IWSLT2025_Quechua_data
Google Group:
https://groups.google.com/g/iwslt-evaluation-campaign
IWSLT conference webpage:
https://iwslt.org/2025
HOW TO PARTICIPATE
Please join the IWSLT Evaluation Campaign Google Group and access the
registration using the following link:
https://groups.google.com/g/iwslt-evaluation-campaign
The QUE-SPA data set can be downloaded here:
https://github.com/Llamacha/IWSLT2025_Quechua_data
Task submissions can be uploaded to GitHub or emailed directly; please
email the organizers below for more details.
IMPORTANT DATES
Apr 21, 2025 System description paper submission deadline
May 15, 2025 Notification of acceptance
June 1, 2025 Camera ready deadline
July 31-Aug 1, 2025 IWSLT conference
ORGANIZING COMMITTEE
John E. Ortega (Northeastern University) j.ortega(a)northeastern.edu
William Chen (Carnegie Mellon University) wc4(a)andrew.cmu.edu
Rodolfo Zevallos (Universitat Pompeu Fabra) rodolfojoel.zevallos(a)upf.edu
Second International Workshop on Construction Grammars and NLP (CxGs+NLP 2025)
Call for Papers
Please join the workshop’s Google Group for the latest updates and to post any questions you might have: https://groups.google.com/g/cxgsnlp-workshop
Overview
Constructionist approaches to language posit that all linguistic knowledge needed for language comprehension and production can be captured as a network of form-meaning mappings, called constructions. Construction Grammars (CxGs) do not distinguish between words and grammar rules, but allow for mappings between forms and meanings of arbitrary complexity and degree of abstraction. CxGs are thereby able to uniformly capture the compositional and non-compositional aspects of language use, making the theory particularly attractive to researchers in the field of Natural Language Processing (NLP). CxG theories, for example, can serve as a valuable ‘lens’ to assess and investigate the abilities of today’s large language models, which lack explicit, theoretically grounded linguistic insights. At the same time, techniques from the field of NLP are often employed for the further development and scaling of CxG theories and applications.
This workshop aims to bring together researchers across theory and practice from the two complementary perspectives of Construction Grammar and NLP to explore how CxG approaches can both inform and benefit from NLP methods, with an emphasis on LLMs. Therefore, we invite original research papers from a broad spectrum of topics, including but not limited to:
Contributions to Construction Grammar theory
Construction Grammar Formalisms
Computational Construction Grammar Implementations
Natural Language Understanding (NLU)
Opinion pieces on the interplay between Construction Grammar and NLP
Constructions and Language Models (Mechanistic interpretability, probing (e.g., BERTology), and evaluation of LLMs)
Resources: Constructicons and corpora annotated for Construction Grammar
Construction Grammar learning and adaptation
Applications at the intersection of Construction Grammar and NLP
Invited Speakers
Adele Goldberg, Professor of Psychology, Princeton University
Thomas Hoffmann, Professor of English Language and Linguistics, Catholic University of Eichstätt-Ingolstadt
Laura Michaelis, Professor of Linguistics, University of Colorado Boulder
Venue
The 2nd CxGs+NLP workshop will be co-located with the 16th International Conference on Computational Semantics (IWCS), organized by the Heinrich Heine University (HHU) in Düsseldorf, Germany. The workshop will be held on 24 September 2025.
We are expecting the workshop to be in-person only, but are awaiting details on the possibility of a hybrid presentation option.
Important Dates
Jun 06: submission deadline
Aug 01: notification of acceptance, registration opens
Aug 22: camera-ready papers due
Sep 22-23: IWCS main conference
Sep 24: workshop
Submission information
Two types of submission are solicited: long papers and short papers. Long papers should describe original research and must not exceed 8 pages. Short papers (typically system or project descriptions, or ongoing research) must not exceed 4 pages. Acknowledgments, references, a limitations section (optional), an ethics statement (optional), and a technical appendix (optional, not subject to reviewing) do not count towards the page limit.
Accepted papers get an extra page in the camera-ready version and will be published in the conference proceedings in the ACL Anthology. Additionally, non-archival publications will be considered for acceptance into the workshop as in-person poster presentations only.
CxGs+NLP 2 papers should be formatted following the common two-column structure as used by IWCS 2021 (borrowed from ACL 2021). Please use these specific style-files or the Overleaf template.
Style files: https://iwcs2021.github.io/download/iwcs2021-templates.zip
Overleaf template: https://www.overleaf.com/latex/templates/instructions-for-iwcs-2021-proceed…
Double submission policy: We will accept submissions that have been submitted elsewhere, but require that the authors notify us, including information on where else they are submitting and let us know if the work is accepted for publication elsewhere.
Submission site TBA.
Instructions for Double-Blind Review
As reviewing will be double blind, papers must not include authors’ names and affiliations. Furthermore, self-references or links (such as GitHub) that reveal the author’s identity, e.g., “We previously showed (Smith, 1991) …” must be avoided. Instead, use citations such as “Smith previously showed (Smith, 1991) …” Papers that do not conform to these requirements will be rejected without review. Papers should not refer, for further detail, to documents that are not available to the reviewers. For example, do not omit or redact important citation information to preserve anonymity. Instead, use third person or named reference to this work, as described above (“Smith showed” rather than “we showed”). If important citations are not available to reviewers (e.g., awaiting publication), these papers should be anonymised and included in the appendix. They can then be referenced from the submission without compromising anonymity. Papers may be accompanied by a resource (software and/or data) described in the paper, but these resources should also be anonymized.
Workshop Chairs
Claire Bonial (U.S. Army Research Lab)
Harish Tayyar Madabushi (The University of Bath)
Workshop Organizing Committee
Melissa Torgbi (The University of Bath)
Leonie Weissweiler (University of Texas at Austin)
Austin Blodgett (U.S. Army Research Lab)
Katrien Beuls (University of Namur, Belgium)
Paul Van Eecke (Vrije Universiteit Brussel, Belgium)
Contact: Please join the workshop’s Google Group for the latest updates and to post any questions you might have: https://groups.google.com/g/cxgsnlp-workshop
In this newsletter:
LDC data and commercial technology development
New publications:
2015 NIST Language Recognition Evaluation Test Set<https://catalog.ldc.upenn.edu/LDC2025S02>
The Xi'an Multi-Language Learner Corpus<https://catalog.ldc.upenn.edu/LDC2025T03>
________________________________
LDC data and commercial technology development
For-profit organizations are reminded that an LDC membership is a pre-requisite for obtaining a commercial license to almost all LDC databases. Non-member organizations, including non-member for-profit organizations, cannot use LDC data to develop or test products for commercialization, nor can they use LDC data in any commercial product or for any commercial purpose. LDC data users should consult corpus-specific license agreements for limitations on the use of certain corpora. Visit the Licensing<https://www.ldc.upenn.edu/data-management/using/licensing> page for further information.
________________________________
New publications:
2015 NIST Language Recognition Evaluation Test Set<https://catalog.ldc.upenn.edu/LDC2025S02> was developed by LDC and NIST. It contains the evaluation test set for the 2015 NIST Language Recognition Evaluation (LRE), approximately 867 hours of conversational telephone speech (CTS) and broadcast narrowband speech (BNBS) collected by LDC in 20 languages over 6 clusters of related languages: Arabic (Egyptian, Iraqi, Levantine, Maghrebi, Modern Standard Arabic); Spanish (Caribbean, European, Latin American, Brazilian Portuguese); English (British, Indian, General American English); Chinese (Cantonese, Mandarin, Min Nan, Wu); Slavic (Polish, Russian); and French (West African, Haitian Creole).
The CTS data includes calls between individuals in the same social networks lasting 8-15 minutes and telephone speech from the IARPA Babel series collected in 2012-2013 from speakers using a range of phone types in diverse settings with varying noise conditions. The BNBS data was collected by LDC from streaming and satellite radio programming, focusing on programs that included narrowband speech (e.g., call-ins to a talk show).
The goal of NIST's LRE evaluations is to establish the baseline of current performance capability for CTS language recognition and to lay the groundwork for further research efforts. LRE15 expanded the range of test segment durations and added a test condition that allowed systems to make use of unrestricted training data when developing models.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
*
The Xi'an Multi-Language Learner Corpus<https://catalog.ldc.upenn.edu/LDC2025T03> was developed by Xi'an International Studies University (XISU)<https://en.xisu.edu.cn/> and is comprised of 526 argumentative essays in 15 languages by Chinese L1 university students studying second languages, along with student metadata and writing prompts. It was developed to support second language learner research and to provide a database for cross-linguistic comparison of second languages.
Data was collected in 2023 and 2024 from students at XISU and Yunnan Minzu University (YMU) who were linguistics majors or studying one of the foreign languages available at XISU and YMU. Off-topic essays and incomplete texts were excluded.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
We’re Hiring! Assistant Professor (Tenure Track) in Natural Language
Processing
TU Wien Informatics invites applications for a full-time, tenure-track
Assistant Professor in Natural Language Processing. This position is
affiliated with both the Data Science Research Unit and the Complexity
Science Hub.
Application deadline: May 22, 2025
Location: TU Wien, Vienna, Austria
Start date: January 2026
Join us in shaping the future of NLP in a vibrant research community!
Find out more: https://informatics.tuwien.ac.at/news/2859
Apply now: https://jobs.tuwien.ac.at/Job/248962
More information:
TU Wien Informatics: https://informatics.tuwien.ac.at/
TU Wien Data Science: https://informatics.tuwien.ac.at/orgs/e194-04
Complexity Science Hub: https://csh.ac.at/
--
Allan Hanbury
Professor of Data Intelligence
Head of the Data Science Research Unit, Institute of Information Systems Engineering
Faculty Representative for Financial Affairs and Internationalization, Faculty of Informatics
TU Wien (Vienna University of Technology)
Favoritenstrasse 9-11/194-04
1040 Vienna, Austria
+43 1 58801 188310
🚀 *Join Us in Advancing AI & Data Science at the University of Chile!* 🌎💡
The *Faculty of Physical and Mathematical Sciences (FCFM)* at the *University
of Chile* is seeking two outstanding academics to join the *Department of
Computer Science (DCC)* and the *Institute for Data and Artificial
Intelligence (IDIA)*!
If you are passionate about *cutting-edge research* in *AI, data science,
and computer science*, this is your chance to work in one of Latin
America's leading research hubs. 🌍✨
📌 *Key Highlights:*
🔹 *Two full-time faculty positions* (Assistant Professor level)
🔹 Focus on *AI, data science, and interdisciplinary research*
🔹 Engage in *teaching, research, and industry collaboration*
🔹 Competitive salary with opportunities for additional funding
🔹 Join a vibrant AI & data science ecosystem with leading research centers
🎯 *We are looking for experts in:*
✅ Data Engineering & Data Mining
✅ Machine Learning & Deep Learning
✅ Natural Language Processing & Multimodal Data
✅ Autonomous Agents & AI-Driven Software Engineering
✅ Responsible AI & Ethics in AI
📅 *Application Deadline: June 7, 2025*
📍 *Location: Santiago, Chile* 🇨🇱
🔗 *Apply now:* http://www.uchile.cl/concursoAcademico/
🔍 *More details:* https://comunicaciones.dcc.uchile.cl/news/966-faculty-positions-in-computer…
🏛 *Learn more about us:*
🔹 *IDIA:* idia.uchile.cl
🔹 *DCC:* dcc.uchile.cl
🔹 *FCFM:* ingenieria.uchile.cl