Apologies for cross-postings
********************************************************************************
Final Call for Papers
21st Workshop on Multiword Expressions (MWE 2025)
Organized, sponsored and endorsed by SIGLEX, the Special Interest Group on
the Lexicon of the ACL
Full-day workshop collocated with NAACL 2025, Albuquerque, New Mexico,
U.S.A., May 3 or 4, 2025
Hybrid (on-site & on-line)
Submission deadline: February 13, 2025
MWE 2025 website: https://multiword.org/mwe2025/
********************************************************************************
Multiword expressions (MWEs), i.e., word combinations that exhibit lexical,
syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (Baldwin
and Kim, 2010), such as “by and large”, “hot dog”, “make a decision” and
“break one's leg”, are still a pain in the neck for Natural Language
Processing (NLP). The notion encompasses closely related phenomena: idioms,
compounds, light-verb constructions, phrasal verbs, rhetorical figures,
collocations, institutionalized phrases, etc. Given their irregular nature,
MWEs often pose complex problems in linguistic modeling (e.g. annotation),
NLP tasks (e.g. parsing), and end-user applications (e.g. natural language
understanding and Machine Translation), hence still representing an open
issue for computational linguistics (Constant et al., 2017).
For more than two decades, modelling and processing MWEs for NLP has been
the topic of the MWE workshop organised by the MWE section
<https://multiword.org/> of ACL-SIGLEX <http://www.siglex.org/> in
conjunction with major NLP conferences since 2003. Impressive progress has
been made in the field, but our understanding of MWEs still requires much
research considering their need and usefulness in NLP applications. This is
also relevant to domain-specific NLP pipelines that need to tackle
terminologies most often realised as MWEs. Following previous years, for
this 21st edition of the workshop, we identified the following topics on
which contributions are particularly encouraged:
- MWE processing to enhance end-user applications: MWEs have gained particular
  attention in end-user applications, including Machine Translation (MT)
  (Zaninello and Birch, 2020), simplification (Kochmar et al., 2020),
  language learning and assessment (Paquot et al., 2020), social media mining
  (Pelosi et al., 2017), and abusive language detection (Zampieri et al.,
  2020). We believe that it is crucial to extend and deepen these first
  attempts to integrate and evaluate MWE technology in these and further
  end-user applications.
- MWE processing and identification in the general language, as well as in
  specialized languages and domains: Multiword terminology extraction from
  domain-specific corpora (Lossio-Ventura et al., 2014) is of particular
  importance to various applications, such as MT (Semmar and Laib, 2017), or
  for the identification and monitoring of neologisms and technical jargon
  (Chatzitheodorou and Kappatos, 2021).
- MWE processing in low-resource languages: The PARSEME shared tasks (2017
  <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_05_MWE_2…>,
  2018
  <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_04_LAW-M…>,
  2020
  <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_02_MWE-L…>),
  among others, have fostered significant progress in MWE identification,
  providing datasets that include low-resource languages, evaluation
  measures, and tools that now allow fully integrating MWE identification
  into end-user applications. There are continuous efforts in this direction
  (Diaz Hernandez, 2024) and a few of them have also explored methods for the
  automatic interpretation of MWEs (Bhatia et al., 2018), and their
  processing in low-resource languages (Eder et al., 2021). Resource creation
  and sharing should be pursued in parallel with the development of
  multilingual benchmarks for MWE identification (Savary et al., 2023).
- MWE identification and interpretation in LLMs: Most current MWE
  processing is limited to their identification and detection using
  pre-trained language models, but we still lack understanding about how MWEs
  are represented and dealt with therein (Garcia et al., 2021) and how to
  better model the compositionality of MWEs from semantics (Phelps et al.,
  2024). Now that NLP has shifted towards end-to-end neural models like BERT,
  capable of solving complex tasks with little or no intermediary linguistic
  symbols, questions arise about the extent to which MWEs should be
  implicitly or explicitly modelled (Shwartz and Dagan, 2019).
- New and enhanced representation of MWEs in language resources and
  computational models of compositionality as gold standards for formative
  intrinsic evaluation.
Through this workshop, we will bring together and encourage researchers in
various NLP subfields to submit their MWE-related research. We also intend
to consolidate the converging results of previous joint workshops LAW-MWE-CxG
2018 <http://multiword.sourceforge.net/lawmwecxg2018/>, MWE-WN 2019
<http://multiword.sourceforge.net/mwewn2019/> and MWE-LEX 2020
<http://multiword.sourceforge.net/mwelex2020/>, the joint MWE-WOAH panel in
2021 <https://multiword.org/mwe2021/#program>, the MWE-SIGUL 2022 joint
session <https://multiword.org/mwe2022/>, and the MWE-UD 2024
<https://multiword.org/mweud2024/>, extending our scope to MWEs in
e-lexicons and WordNets, MWE annotation, as well as grammatical
constructions. Correspondingly, we call for papers on research related (but
not limited) to MWEs and constructions in:
- Computationally-applicable theoretical work in psycholinguistics and
  corpus linguistics;
- Annotation (expert, crowdsourcing, automatic) and representation in
  resources such as corpora, treebanks, e-lexicons, WordNets, constructions
  (also for low-resource languages);
- Processing in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG,
  LFG, TAG, UD, etc.);
- Discovery and identification methods, including for specialized
  languages and domains such as clinical or biomedical NLP;
- Interpretation of MWEs and understanding of text containing them;
- Language acquisition, language learning, and non-standard language (e.g.
  tweets, speech);
- Evaluation of annotation and processing techniques;
- Retrospective comparative analyses from the PARSEME shared tasks;
- Processing for end-user applications (e.g. MT, NLU, summarisation,
  language learning, etc.);
- Implicit and explicit representation in pre-trained language models and
  end-user applications;
- Evaluation and probing of pre-trained language models;
- Resources and tools (e.g. lexicons, identifiers) and their integration
  into end-user applications;
- Multiword terminology extraction;
- Adaptation and transfer of annotations and related resources to new
  languages and domains including low-resource ones.
Submission formats:
The workshop invites two types of submissions:
- archival submissions that present substantially original research in
  either long paper format (8 pages + references) or short paper format (4
  pages + references).
- non-archival submissions of abstracts describing relevant research
  presented/published elsewhere, which will not be included in the MWE
  proceedings.
Paper submission and templates
Papers should be submitted via the workshop's submission page
<https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/MWE>. Please
choose the appropriate submission format (archival/non-archival). Archival
papers with existing reviews will also be accepted through the ACL Rolling
Review. Submissions must follow the ACL stylesheet
<https://github.com/acl-org/acl-style-files>.
Important Dates
Paper Submission Deadline: February 13, 2025
Notification of acceptance: March 8, 2025
Camera-ready papers due: March 17, 2025
Workshop: May 3 or 4, 2025
All deadlines are at 23:59 UTC-12 (Anywhere on Earth).
Organizing Committee
Verginica Barbu Mititelu, Voula Giouli, Grazina Korvel, A. Seza Doğruöz,
Alexandre Rademaker, Atul Kr. Ojha, Mathieu Constant
Anti-harassment policy
The workshop follows the ACL anti-harassment policy
<https://www.aclweb.org/adminwiki/index.php?title=Anti-Harassment_Policy>.
Contact
For any inquiries regarding the workshop, please send an email to the
Organizing Committee at mwe2025workshop(a)gmail.com.
Apologies for cross-posting.
---------------------------------------------------------------------------
*The Eighth Workshop on Technologies for Machine Translation of
Low-Resource Languages (LoResMT 2025)*
*https://www.loresmt.org/*
*@ NAACL 2025 (May 3–4, 2025)*
*Albuquerque, New Mexico, U.S.A.*
*SUBMISSION*
*https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT*
*TIMELINE*
*Paper submission due:* *February 13, 2025* (Anywhere on Earth)
*Pre-reviewed (ARR) submission deadline:* *February 27, 2025*
*Notification of acceptance:* March 8, 2025
*Camera-ready papers due:* March 17, 2025 (Anywhere on Earth)
*Pre-recorded video due (hard deadline):* April 8, 2025
*Workshop dates at NAACL 2025:* May 3–4, 2025
*SCOPE*
Based on the success of past low-resource machine translation (MT)
workshops at AMTA 2018, MT Summit 2019, AACL-IJCNLP 2020, AMTA 2021, COLING
2022, EACL 2023, and ACL 2024, we introduce the LoResMT 2025 workshop at NAACL
2025. The workshop provides a discussion panel for researchers working on
MT systems/methods for low-resource and under-represented languages in
general. We would like to help review/overview the state of MT for
low-resource languages and define the most important directions. We also
solicit papers dedicated to supplementary NLP tools that are used in any
language and especially in low-resource languages. Overview papers of these
NLP tools are very welcome. It will be beneficial if the evaluations of
these tools in research papers include their impact on the quality of MT
output.
*TOPICS*
We are highly interested in (1) original research papers, (2)
review/opinion papers, and (3) online systems on the topics below; however,
we welcome all novel ideas that cover research on low-resource languages.
- Neural machine translation (NMT) for low-resource languages
- Use of LLMs (large language models) for low-resource MT systems
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation
*SUBMISSION INFORMATION*
We are soliciting two types of submissions: (1) research, review, and
position papers and (2) system demonstration papers. For research, review
and position papers, the length of each paper should be at least four (4)
and not exceed eight (8) pages, plus unlimited pages for references. For
system demonstration papers, the limit is four (4) pages. Submissions
should be formatted according to the official ACL style templates
(Overleaf). Please refer to the NAACL submission guideline for further
information <https://2025.naacl.org/calls/papers/#paper-submission-details>.
Accepted papers will be published in the ACL Anthology as part of NAACL 2025
and will be presented at the conference.
Submissions must be anonymized and should be done using the provided
submission system. Scientific papers that have been or will be submitted to
other venues must be declared as such and must be withdrawn from the other
venues if accepted and published at LoResMT. The review will be
double-blind. Authors of an accepted paper should present their paper in
person at NAACL 2025. Papers should be submitted in PDF to the LoResMT
OpenReview page
<https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT>.
We would like to encourage authors to cite papers written in ANY language
that are related to the topics, as long as both original bibliographic
items and their corresponding English translations are provided.
Registration is handled by the main conference (https://2025.naacl.org/).
*ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)*
Atul Kr. Ojha, University of Galway
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Flammie Pirinen, UiT The Arctic University of Norway, Tromsø
Xiaobing Zhao, Minzu University of China
*PROGRAM COMMITTEE (LISTED ALPHABETICALLY)*
Abigail Walsh, ADAPT Centre, Dublin City University, Ireland
Alberto Poncelas, Rakuten, Singapore
Ali Hatami, University of Galway
Alina Karakanta, Fondazione Bruno Kessler (FBK), University of Trento
Anna Currey, AWS AI Labs
Aswarth Abhilash Dara, Walmart Global Technology
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Bogdan Babych, Heidelberg University
Chao-Hong Liu, Potamu Research Ltd
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Dana Moukheiber, Massachusetts Institute of Technology
Ekaterina Vylomova, University of Melbourne, Australia
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Flammie Pirinen, UiT Norgga árktalaš universitehta
Gaurav Negi, University of Galway
Jinliang Lu, Institute of Automation, Chinese Academy of Sciences
John Philip McCrae, University of Galway
Jonathan Washington, Swarthmore College
Koel Dutta Chowdhury, Saarland University
Majid Latifi, UPC University
Maria Art Antonette Clariño, University of the Philippines Los Baños
Milind Agarwal, George Mason University
Mathias Müller, University of Zurich
Nathaniel Oco, De La Salle University
Pavel Rychlý, Masaryk University and Lexical Computing
Pengwei Li, Meta
Rashid Ahmad, International Institute of Information Technology, Hyderabad
Rico Sennrich, University of Zurich
Santanu Pal, Wipro
Sangjee Dondrub, Qinghai Normal University
Sardana Ivanova, University of Helsinki
Sourabrata Mukherjee, Charles University
Thepchai Supnithi, National Electronics and Computer Technology Center
Timothee Mickus, University of Helsinki
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Wen Lai, LMU Munich
Xuebo Liu, Harbin Institute of Technology, Shenzhen
Yalemisew Abgaz, Dublin City University
Yasmin Moslem, Bering Lab
Zhanibek Kozhirbayev, National Laboratory Astana, Nazarbayev University
*CONTACT*
Please email loresmt(a)googlegroups.com if you have any
questions/comments/suggestions.
Apologies for cross-postings
********************************************************************************
Second Call for Papers
21st Workshop on Multiword Expressions (MWE 2025)
Organized, sponsored and endorsed by SIGLEX, the Special Interest Group on
the Lexicon of the ACL
Full-day workshop collocated with NAACL 2025, Albuquerque, New Mexico,
U.S.A., May 3 or 4, 2025
Hybrid (on-site & on-line)
Submission deadline: January 30, 2025
MWE 2025 website: https://multiword.org/mwe2025/
********************************************************************************
Important Dates
Paper Submission Deadline: January 30, 2025
Notification of acceptance: March 1, 2025
Camera-ready papers due: March 10, 2025
Workshop: May 3 or 4, 2025
All deadlines are at 23:59 UTC-12 (Anywhere on Earth).
Organizing Committee
Verginica Barbu Mititelu, Voula Giouli, Grazina Korvel, A. Seza Doğruöz,
Alexandre Rademaker, Atul Kr. Ojha, Mathieu Constant
Anti-harassment policy
The workshop follows the ACL anti-harassment policy
<https://www.aclweb.org/adminwiki/index.php?title=Anti-Harassment_Policy>.
Contact
For any inquiries regarding the workshop, please send an email to the
Organizing Committee at mwe2025workshop(a)gmail.com.
We offer several fully funded four-year PhD positions at the Language Faculty at Uppsala University.
One position is in Computational Linguistics, with a specialization in Nordic Languages. This position requires knowledge of a Scandinavian language and will be carried out as part of the research project "Language change and non-fictional texts – a large-scale investigation of Late Modern Swedish (1800–1950)", led by Sara Stymne and David Håkansson.
One PhD position in computational linguistics with a specialization in Scandinavian languages at the Department of Linguistics and Philology, UFV-PA 2024/4415 <https://uu.varbi.com/en/what:job/jobID:781989/>
Several positions are focused on projects related to linguistic diversity and are open to students in Computational Linguistics, Linguistics, as well as other language subjects.
Five PhD positions on the theme of linguistic diversity at the Department of Linguistics and Philology, UFV-PA 2024/4412 <https://uu.varbi.com/en/what:job/jobID:781937/>
One PhD position on the theme of linguistic diversity within any research environment at the faculty, UFV-PA 2025/18 <https://uu.varbi.com/en/what:job/jobID:785710/>
There are also positions in several other language subjects.
https://www.uu.se/en/about-uu/join-us/jobs-and-vacancies/job-details?query=…
Application for all positions closes on March 3.
Best,
Sara
Dear colleagues,
FYI: although on-site attendance is limited to 400 because of the size of
the conference room, registration for on-site attendance is still open.
Contact: lt4all2025-contact(a)ml.naist.ac.jp
Further details can be found at https://www.lt4all2025.eu/
LT4All 2.0: Advancing Humanism through Language Technologies
24-26 February 2025, UNESCO Headquarters, Paris, France
❖ Language Technologies (LT), nurtured in research laboratories for half a
century, are now spreading widely across numerous applications. However,
the situation varies significantly among the more than 7,500 languages
spoken worldwide.
❖ The first LT4All (LT4All 1.0) conference in 2019 highlighted the
critical role of multilingualism in cutting-edge technology. This spurred
significant initiatives by various research institutions and major
technological companies toward developing language technologies for a wider
range of languages.
❖ Despite significant progress, many communities are still being left
behind. The critical issue lies not just in creating language technologies
for numerous languages but also in collaborating with communities to
develop the solutions they need.
Organized within the framework of the International Decade of Indigenous
Languages (IDIL 2022-2032) and to commemorate the Silver Jubilee of
International Mother Language Day 2025, the second edition of LT4All
(LT4All 2.0) aims to further the agenda of language technologies with a
focus on community empowerment. The goal is to harness technology not only
to advance itself but also to support and enhance individuals' capabilities.
The conference is organized by the international Language Resources
Association (ELRA) and its Special Interest Group on Under-resourced
languages (SIGUL), a joint SIG of ELRA and of the International Speech
Communication Association (ISCA), in partnership with UNESCO.
For the LT4ALL organization committee
Claudia Soria
Claudia Soria
CNR, ISTITUTO DI LINGUISTICA COMPUTAZIONALE "ANTONIO ZAMPOLLI"
claudia.soria(a)ilc.cnr.it
Tel. 0503153166
Via Giuseppe Moruzzi, 1, 56124 – Pisa
www.ilc.cnr.it
www.cnr.it
Dear list members,
On behalf of the organizing committee, I would like to invite you and your
colleagues to submit papers or poster abstracts to the 29th International
Conference on Asian Language Processing (IALP).
IALP 2025 will be jointly organized by the Faculty of Computer Science and
Information Technology, Universiti Malaysia Sarawak (UNIMAS), and the
Chinese and Oriental Languages Information Processing Society (COLIPS). The
conference will take place from 4–6 August 2025 at the Borneo Cultures
Museum, Kuching, Sarawak, Malaysia.
The International Conference on Asian Language Processing (IALP) is the
flagship event of COLIPS, uniquely focused on advancing research in Asian
language processing. As a recurring conference series, IALP brings together
researchers from diverse linguistic disciplines to foster the development
of science and technology in all areas of Asian language processing. By
providing a collaborative platform, the conference facilitates knowledge
exchange and the exploration of the latest innovations in the field.
All accepted papers will be submitted for potential inclusion in IEEE
proceedings, subject to fulfilling IEEE’s quality standards.
We welcome research papers on techniques, methodologies, and approaches
including, but not limited to, the following:
A. Speech:
• Spoken language processing
• Spoken language understanding
• Spoken language generation
• Spoken language translation
• Speech recognition and synthesis
• Rich transcription and spoken information retrieval
• Multimodal representations and processing
• Speaker diarization and speech enhancement
• Speaker recognition and anti-spoofing
• Trustworthy speech technology
B. Natural Language Processing (NLP):
• Dialogue and interactive systems
• Evaluation methods and user studies
• Information extraction, retrieval, and text mining
• Interpretability and analysis of models for NLP
• Language modeling and statistical methods for NLP
• Machine learning for Natural Language Processing
• Machine translation and multilingual processing
• NLP in vertical domains, such as biomedical, chemical, and legal text
• NLP on noisy unstructured text, such as email, blogs, and SMS
• Natural language applications, tools, and resources
• Question answering
• Sentiment analysis, stylistic analysis, and argument mining
• Tagging, chunking, and parsing
• Text entailment, paraphrasing, generation
• Large language models
• Text and speech resource development
C. Linguistics:
• Asian language input, output, coding, etc.
• Computational linguistics and mathematical linguistics
• Discourse and pragmatics
• Language learning, teaching, and computer-aided language learning
• Lexical semantics, sentence-level semantics, and textual inference
• Linguistic theories, cognitive modeling, and psycholinguistics
• Phonology, morphology, and word segmentation
• Special hardware and software for Asian language computing
To submit your paper, please use the following link:
https://cmt3.research.microsoft.com/User/Login?ReturnUrl=%2FIALP2025. You
can also visit the conference website at www.ialp2025.org for detailed
submission instructions. The deadline for full paper and poster abstract
submissions is *31 March 2025*.
We are excited to receive your submissions and welcome you to IALP 2025.
Should you have any questions or concerns, please do not hesitate to
contact us at ialp(a)unimas.my.
Thank you, and we hope to see you at the Borneo Cultures Museum!
*Sarah Samson Juan*
Senior Lecturer/Deputy Dean of Industry and Community Engagement
Faculty of Computer Science and Information Technology
Universiti Malaysia Sarawak
Kota Samarahan 94300
Sarawak, MALAYSIA
Email: sjsflora(a)unimas.my / sarah.f.juan(a)gmail.com
Website: https://www.fcsit.unimas.my/
Expert: https://expert.unimas.my/profile/1319
++ 2nd reminder to participate in our web survey on data annotation bottlenecks and active learning; apologies for cross-posting ++
Dear list members,
We invite you to participate in our web survey exploring how recent advancements in NLP, such as LLMs, have changed the need for labeled data in Supervised Machine Learning.
Survey details:
* Topic: Web survey on Data Annotation and Active Learning
* Target group: Researchers and practitioners alike in the fields of NLP, Supervised Machine Learning, and Active Learning in particular (knowledge of Active Learning is not required)
* Duration: 5-15 minutes
* Deadline for participation: January 26, 2025
* Survey link: https://bildungsportal.sachsen.de/umfragen/limesurvey/index.php/538271
Why should I invest my time in this survey?
* Make an impact: Participate in a community effort and help build a better understanding of the current state of, and open issues in, methods used to overcome a lack of labeled data.
* Gain insights: Receive a report with key findings to incorporate these insights into research and development of new methods and technologies.
Thank you for considering participating in our survey!
If you have any questions or require additional information, please don't hesitate to contact us directly at activelearningsurvey2024(a)gmail.com.
If you know colleagues or peers who might be interested, we'd be grateful if you could forward this survey to them as well.
Best regards,
Julia Romberg (GESIS - Leibniz Institute for the Social Sciences, Germany)
Christopher Schröder (Institut für Angewandte Informatik e. V., Germany)
Julius Gonsior (TUD Dresden University of Technology)
------------------------------------------------------------------------
Leibniz Institute for the Social Sciences
Julia Romberg
Computational Social Science, Team Data Science Methods
+49(221)47694-742
Neural language models have revolutionised natural language processing (NLP) and have provided state-of-the-art results for many tasks. However, their effectiveness is largely dependent on the pre-training resources. Therefore, language models (LMs) often struggle with low-resource languages in both training and evaluation. Recently, there has been a growing trend in developing and adopting LMs for low-resource languages. LoResLM aims to provide a forum for researchers to share and discuss their ongoing work on LMs for low-resource languages.
LoResLM 2025 will be a physical workshop co-located with COLING 2025, Abu Dhabi on 20th January 2025.
We are pleased to share the programme of LoResLM 2025 with you. Please visit https://loreslm.github.io/program for the full programme.
To register for the workshop, please visit https://coling2025.org/registration/
We are looking forward to welcoming you at LoResLM 2025 in Abu Dhabi.
The workshop is supported in part by CLARIN-UK, funded by the Arts and Humanities Research Council as part of the Infrastructure for Digital Arts and Humanities programme.
>> Keynote Speaker
Jose Camacho-Collados, Cardiff University.
Title - "Multilinguality and Cultural Awareness in Language Models"
>> Organising Committee
Hansi Hettiarachchi, Lancaster University, UK
Tharindu Ranasinghe, Lancaster University, UK
Paul Rayson, Lancaster University, UK
Ruslan Mitkov, Lancaster University, UK
Mohamed Gaber, Birmingham City University, UK
Damith Premasiri, Lancaster University, UK
Fiona Anting Tan, National University of Singapore, Singapore
Lasitha Uyangodage, University of Münster, Germany
>> Programme Committee
Gábor Bella - IMT Atlantique, France
Samuel Cahyawijaya - The Hong Kong University of Science and Technology, Hong Kong
Burcu Can - University of Stirling, UK
Çağrı Çöltekin - University of Tübingen, Germany
Raj Dabre - National Institute of Information and Communications Technology, Japan
Vera Danilova - Uppsala University, Sweden
Debashish Das - Birmingham City University, UK
Ona de Gibert - University of Helsinki, Finland
Alphaeus Dmonte - George Mason University, USA
Bonaventure F. P. Dossou - McGill University, Canada
Daan van Esch - Google
Ignatius Ezeani - Lancaster University, UK
Anna Furtado - University of Galway, Ireland
Amal Htait - Aston University, UK
Ali Hürriyetoğlu - Wageningen University & Research, Netherlands
Danka Jokic - University of Belgrade, Serbia
Diptesh Kanojia - University of Surrey, UK
Daisy Lal - Lancaster University, UK
Colin Leong - University of Dayton, USA
Veronika Lipp - Hungarian Research Centre for Linguistics, Hungary
Muhidin Mohamed - Aston University, UK
Farhad Nooralahzadeh - University of Zurich, Switzerland
Rrubaa Panchendrarajan - Queen Mary University of London, UK
Nadeesha Pathirana - Aston University, UK
Alistair Plum - University of Luxembourg, Luxembourg
Nishat Raihan - George Mason University, USA
Omid Rohanian - University of Oxford, UK
Sandaru Seneviratne - Australian National University, Australia
Ravi Shekhar - University of Essex, UK
Archchana Sindhujan - University of Surrey, UK
Claytone Sikasote - University of Cape Town, South Africa
Marjana Prifti Skenduli - University of New York Tirana, Albania
Uthayasanker Thayasivam - University of Moratuwa, Sri Lanka
Taro Watanabe - Nara Institute of Science and Technology, Japan
Edlira Vakaj - Birmingham City University, UK
John Vidler - Lancaster University, UK
Phil Weber - Aston University, UK
Bryan Wilie - Hong Kong University of Science & Technology, Hong Kong
Artūrs Znotiņš - University of Latvia, Latvia
URL - https://loreslm.github.io/
Twitter - https://x.com/LoResLM2025
LinkedIn - https://www.linkedin.com/company/loreslm/
Apologies for cross-posting.
----------------------------------------
*The International Conference on Spoken Language Translation*
*ACL – 22nd IWSLT 2025 – Second Call for Participation*
*31 July-1 August 2025 - Vienna, Austria*
http://iwslt.org
The International Conference on Spoken Language Translation (IWSLT)
<https://iwslt.org/> is the premier annual conference for all aspects of
Spoken Language Translation. Every year, the conference organises and
sponsors open evaluation campaigns around key challenges in simultaneous
and consecutive translation, under real-time/low latency or offline
conditions and under low-resource or multilingual constraints. System
descriptions and results from participants’ systems and scientific papers
related to key algorithmic advances and best practices are presented.
IWSLT is the venue of SIGSLT <https://iwslt.org/sigslt/>, the Special
Interest Group on Spoken Language Translation
of ACL <https://www.aclweb.org/portal/>, ISCA <https://www.isca-speech.org/>
and ELRA <https://www.elra.info/>. With a track record of 21 years, IWSLT
benchmarks and proceedings serve as a reference for all researchers and
practitioners working on speech translation and related fields.
The 22nd edition of IWSLT will be run as a hybrid ELRA
<https://www.elra.info/>/ACL <https://www.aclweb.org/portal/> event,
co-located with ACL 2025 <https://2025.aclweb.org/> from 31 July to 1
August 2025.
*Important Dates*
*January 1, 2025*: Release of shared task training and dev data
*March 15, 2025*: Scientific paper submission deadline
*Apr 1-15, 2025*: Evaluation period
*April 21, 2025*: System description paper submission deadline
*May 15, 2025*: Notification of acceptance
*June 1, 2025*: Camera-ready deadline (all papers)
*July 31-Aug 1, 2025*: IWSLT conference
Evaluation
IWSLT 2025 features shared tasks <https://iwslt.org/2025/#shared-tasks>
that address the following focus areas:
- High-resource ST: Offline track, Simultaneous track, Subtitling track
- Low-resource ST: Low-resource and Indic (multilingual) tracks
- Instruction-following Speech Processing track: Technical domain ST, ASR,
Summarization, and QA
Training and development data for each shared task will be prepared and
released by the respective organisers (for further information on this
initiative, please refer to the IWSLT website <https://iwslt.org/2025/>).
Participants will receive instructions about how to submit their runs. In
addition, participants have the opportunity to present their work through
a system paper that will be published in the ACL Proceedings.
Conference
IWSLT also invites submissions of scientific papers to be published in the
ACL Proceedings and presented either in oral or poster format. The
conference selects high-quality, original contributions on theoretical and
practical issues of spoken language translation research, technologies and
applications. Submissions will be accepted directly through the IWSLT
submission site (to be announced on the website <https://iwslt.org/2025/>).
We will also accept commitments of submissions with reviews from the ACL
Rolling Review.
Additionally, to foster cross-pollination of ideas, the conference also
invites the presentation of papers on speech translation recently published
elsewhere. Please note that this is for non-archival presentation of papers
relevant to speech translation already published in other venues (e.g.,
Findings for the *ACL, speech, NLP or MT conferences). Submissions for this
category will be accepted through a dedicated form (to be announced on the
website <https://iwslt.org/2025/>). Papers will be checked for relevance to
IWSLT, and assigned either oral or poster presentation slots if selected.
Contact
Please email iwslt-evaluation-campaign(a)googlegroups.com if you have any
questions related to the shared tasks.
Thanks,
Marine, Marcello, Alex, Jan, Sebastian, Elizabeth, Atul
(IWSLT organisers)
Apologies for cross-posting.
---------------------------------------------------------------------------
*The Eighth Workshop on Technologies for Machine Translation of
Low-Resource Languages (LoResMT 2025)*
*https://www.loresmt.org/*
*@ NAACL 2025 (May 3–4, 2025)*
*Albuquerque, New Mexico, U.S.A.*
*SUBMISSION*
*https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT*
*TIMELINE*
*Paper submission due:* January 30, 2025 (Anywhere on Earth)
*Pre-reviewed (ARR) submission deadline:* February 20, 2025
*Notification of acceptance:* March 1, 2025
*Camera-ready papers due:* March 10, 2025 (Anywhere on Earth)
*Pre-recorded video due (hard deadline):* April 8, 2025
*Workshop dates at NAACL 2025:* May 3–4, 2025
*SCOPE*
Building on the success of past low-resource machine translation (MT)
workshops at AMTA 2018, MT Summit 2019, AACL-IJCNLP 2020, AMTA 2021, COLING
2022, EACL 2023, and ACL 2024, we introduce the LoResMT 2025 workshop at NAACL
2025. The workshop provides a discussion panel for researchers working on
MT systems/methods for low-resource and under-represented languages in
general. We would like to help review/overview the state of MT for
low-resource languages and define the most important directions. We also
solicit papers dedicated to supplementary NLP tools that are used in any
language and especially in low-resource languages. Overview papers of these
NLP tools are very welcome. It will be beneficial if the evaluations of
these tools in research papers include their impact on the quality of MT
output.
*TOPICS*
We are highly interested in (1) original research papers, (2)
review/opinion papers, and (3) online systems on the topics below; however,
we welcome all novel ideas that cover research on low-resource languages.
- Neural machine translation (NMT) for low-resource languages
- Use of LLMs (large language models) for low-resource MT systems
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation
*SUBMISSION INFORMATION*
We are soliciting two types of submissions: (1) research, review, and
position papers and (2) system demonstration papers. For research, review
and position papers, the length of each paper should be at least four (4)
and not exceed eight (8) pages, plus unlimited pages for references. For
system demonstration papers, the limit is four (4) pages. Submissions
should be formatted according to the official ACL style templates
(Overleaf). Please refer to the NAACL submission guidelines for further
information <https://2025.naacl.org/calls/papers/#paper-submission-details>.
Accepted papers will be published in the ACL Anthology under NAACL 2025 and
will be presented at the conference.
Submissions must be anonymized and should be done using the provided
submission system. Scientific papers that have been or will be submitted to
other venues must be declared as such and must be withdrawn from the other
venues if accepted and published at LoResMT. The review will be
double-blind. Authors of an accepted paper should present their paper in
person at NAACL 2025. Papers should be submitted in PDF to the LoResMT
OpenReview site
<https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT>.
We would like to encourage authors to cite papers written in ANY language
that are related to the topics, as long as both original bibliographic
items and their corresponding English translations are provided.
Registration is handled by the main conference (https://2025.naacl.org/).
*ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)*
Atul Kr. Ojha, University of Galway
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jade Abbott, Retro Rabbit
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Varvara Logacheva, Skolkovo Institute of Science and Technology
Xiaobing Zhao, Minzu University of China
*PROGRAM COMMITTEE (LISTED ALPHABETICALLY)*
Abigail Walsh, ADAPT Centre, Dublin City University, Ireland
Alberto Poncelas, Rakuten, Singapore
Ali Hatami, University of Galway
Alina Karakanta, Fondazione Bruno Kessler (FBK), University of Trento
Anna Currey, AWS AI Labs
Aswarth Abhilash Dara, Walmart Global Technology
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Bogdan Babych, Heidelberg University
Chao-Hong Liu, Potamu Research Ltd
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Dana Moukheiber, Massachusetts Institute of Technology
Ekaterina Vylomova, University of Melbourne, Australia
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Flammie Pirinen, UiT Norgga árktalaš universitehta
Gaurav Negi, University of Galway
Jinliang Lu, Institute of Automation, Chinese Academy of Sciences
John Philip McCrae, University of Galway
Jonathan Washington, Swarthmore College
Koel Dutta Chowdhury, Saarland University
Majid Latifi, UPC University
Maria Art Antonette Clariño, University of the Philippines Los Baños
Milind Agarwal, George Mason University
Mathias Müller, University of Zurich
Nathaniel Oco, De La Salle University
Pavel Rychlý, Masaryk University and Lexical Computing
Pengwei Li, Meta
Rashid Ahmad, International Institute of Information Technology, Hyderabad
Rico Sennrich, University of Zurich
Santanu Pal, Wipro
Sangjee Dondrub, Qinghai Normal University
Sardana Ivanova, University of Helsinki
Sourabrata Mukherjee, Charles University
Thepchai Supnithi, National Electronics and Computer Technology Center
Timothee Mickus, University of Helsinki
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Wen Lai, LMU Munich
Xuebo Liu, Harbin Institute of Technology, Shenzhen
Yalemisew Abgaz, Dublin City University
Yasmin Moslem, Bering Lab
Zhanibek Kozhirbayev, National Laboratory Astana, Nazarbayev University
*CONTACT*
Please email loresmt(a)googlegroups.com if you have any
questions/comments/suggestions.