Apologies for cross-posting.
---------------------------------------------------------------------------
*The Eighth Workshop on Technologies for Machine Translation of
Low-Resource Languages (LoResMT 2025)*
*https://www.loresmt.org/*
*@ NAACL 2025 (May 3–4, 2025)*
*Albuquerque, New Mexico, U.S.A.*
*SUBMISSION*
*https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT*
*TIMELINE*
*Paper submission due:* January 30, 2025 (Anywhere on Earth)
*Pre-reviewed (ARR) submission deadline:* February 20, 2025
*Notification of acceptance:* March 1, 2025
*Camera-ready papers due:* March 10, 2025 (Anywhere on Earth)
*Pre-recorded video due (hard deadline):* April 8, 2025
*Workshop dates at NAACL 2025:* May 3–4, 2025
*SCOPE*
Based on the success of past low-resource machine translation (MT)
workshops at AMTA 2018, MT Summit 2019, AACL-IJCNLP 2020, AMTA 2021, COLING
2022, EACL 2023, and ACL 2024, we introduce the LoResMT 2025 workshop at NAACL
2025. The workshop provides a discussion panel for researchers working on
MT systems/methods for low-resource and under-represented languages in
general. We would like to help review/overview the state of MT for
low-resource languages and define the most important directions. We also
solicit papers dedicated to supplementary NLP tools that are used in any
language and especially in low-resource languages. Overview papers of these
NLP tools are very welcome. It will be beneficial if the evaluations of
these tools in research papers include their impact on the quality of MT
output.
*TOPICS*
We are highly interested in (1) original research papers, (2)
review/opinion papers, and (3) online systems on the topics below; however,
we welcome all novel ideas that cover research on low-resource languages.
- Neural machine translation (NMT) for low-resource languages
- Use of LLMs (large language models) for low-resource MT systems
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation
*SUBMISSION INFORMATION*
We are soliciting two types of submissions: (1) research, review, and
position papers and (2) system demonstration papers. For research, review
and position papers, the length of each paper should be at least four (4)
and not exceed eight (8) pages, plus unlimited pages for references. For
system demonstration papers, the limit is four (4) pages. Submissions
should be formatted according to the official ACL style templates
(Overleaf). Please refer to the NAACL submission guideline for further
information <https://2025.naacl.org/calls/papers/#paper-submission-details>.
Accepted papers will be published in the ACL Anthology under NAACL 2025 and
will be presented at the conference.
Submissions must be anonymized and should be done using the provided
submission system. Scientific papers that have been or will be submitted to
other venues must be declared as such and must be withdrawn from the other
venues if accepted and published at LoResMT. The review will be
double-blind. Authors of an accepted paper should present their paper in
person at NAACL 2025. Papers should be submitted in PDF to the LoResMT Open
Review
<https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT>.
We would like to encourage authors to cite papers written in ANY language
that are related to the topics, as long as both original bibliographic
items and their corresponding English translations are provided.
Registration is handled by the main conference (https://2025.naacl.org/).
*ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)*
Atul Kr. Ojha, University of Galway
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jade Abbott, Retro Rabbit
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Varvara Logacheva, Skolkovo Institute of Science and Technology
Xiaobing Zhao, Minzu University of China
*PROGRAM COMMITTEE (LISTED ALPHABETICALLY)*
Abigail Walsh, ADAPT Centre, Dublin City University, Ireland
Alberto Poncelas, Rakuten, Singapore
Ali Hatami, University of Galway
Alina Karakanta, Fondazione Bruno Kessler (FBK), University of Trento
Anna Currey, AWS AI Labs
Aswarth Abhilash Dara, Walmart Global Technology
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Bogdan Babych, Heidelberg University
Chao-Hong Liu, Potamu Research Ltd
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Dana Moukheiber, Massachusetts Institute of Technology
Ekaterina Vylomova, University of Melbourne, Australia
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Flammie Pirinen, UiT Norgga árktalaš universitehta
Gaurav Negi, University of Galway
Jinliang Lu, Institute of Automation, Chinese Academy of Sciences
John Philip McCrae, University of Galway
Jonathan Washington, Swarthmore College
Koel Dutta Chowdhury, Saarland University
Majid Latifi, UPC University
Maria Art Antonette Clariño, University of the Philippines Los Baños
Milind Agarwal, George Mason University
Mathias Müller, University of Zurich
Nathaniel Oco, De La Salle University
Pavel Rychlý, Masaryk University and Lexical Computing
Pengwei Li, Meta
Rashid Ahmad, International Institute of Information Technology, Hyderabad
Rico Sennrich, University of Zurich
Santanu Pal, Wipro
Sangjee Dondrub, Qinghai Normal University
Sardana Ivanova, University of Helsinki
Sourabrata Mukherjee, Charles University
Thepchai Supnithi, National Electronics and Computer Technology Center
Timothee Mickus, University of Helsinki
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Wen Lai, LMU Munich
Xuebo Liu, Harbin Institute of Technology, Shenzhen
Yalemisew Abgaz, Dublin City University
Yasmin Moslem, Bering Lab
Zhanibek Kozhirbayev, National Laboratory Astana, Nazarbayev University
*CONTACT*
Please email loresmt(a)googlegroups.com if you have any
questions/comments/suggestions.
++ 1st reminder to participate in our web survey on data annotation bottlenecks and active learning; apologies for cross-posting ++
Dear list members,
We invite you to participate in our web survey exploring how recent advancements in NLP, such as LLMs, have changed the need for labeled data in Supervised Machine Learning.
Survey details:
* Topic: Web survey on Data Annotation and Active Learning
* Target group: Researchers and practitioners alike in the fields of NLP, Supervised Machine Learning, and Active Learning in particular (knowledge of Active Learning is not required)
* Duration: 5-15 minutes
* Deadline for participation: January 12, 2025
* Survey link: https://bildungsportal.sachsen.de/umfragen/limesurvey/index.php/538271
Why should I invest my time in this survey?
* Make an impact: Participate in a community effort and help to gain a better understanding of the current state of, and open issues with, methods used to overcome a lack of labeled data.
* Gain insights: Receive a report with key findings to incorporate these insights into research and development of new methods and technologies.
Thank you for considering participating in our survey!
If you have any questions or require additional information, please don't hesitate to contact us directly at activelearningsurvey2024(a)gmail.com.
If you know colleagues or peers who might be interested, we'd be grateful if you could forward this survey to them as well.
Best regards,
Julia Romberg (GESIS - Leibniz Institute for the Social Sciences, Germany)
Christopher Schröder (Institut für Angewandte Informatik e. V., Germany)
Julius Gonsior (TUD Dresden University of Technology)
------------------------------------------------------------------------
Leibniz Institute for the Social Sciences
Julia Romberg
Computational Social Science, Team Data Science Methods
+49(221)47694-742
RECENT ADVANCES IN NATURAL LANGUAGE PROCESSING
Varna, Bulgaria
http://ranlp.org/ranlp2025/
Summer School on Deep Learning and LLMs for NLP: 3-5 September 2025 (Wednesday-Friday)
Tutorials: 6-7 September 2025 (Saturday-Sunday)
Main Conference: 8-10 September 2025 (Monday-Wednesday)
Workshops and shared tasks: 11-13 September 2025 (Thursday-Saturday)
The biennial RANLP (Recent Advances in Natural Language Processing) conference is one of the most competitive and influential NLP conferences. The event grew out of the International Summer schools "Contemporary topics in Computational Linguistics" which were organised for many years as international training events. Previous RANLP conferences (1995, 1997, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017, 2019, 2021 and 2023) featured keynote talks by leading experts in NLP as well as presentations/papers of high quality, rigorously reviewed by Programme Committee experts. Since 2009, the papers accepted at RANLP and the associated workshops have been included in the ACL Anthology. The RANLP proceedings are indexed by SCOPUS and DBLP. The proceedings have their own Scopus SJR, which was 0.299 in 2023. The conference will be preceded by a Summer School on Deep Learning and Large Language Models (LLMs) for Natural Language Processing (NLP) as well as tutorials on current topics of particular interest and cutting-edge technologies. RANLP-2025 will be followed by specialised workshops as well as shared tasks covering timely NLP topics. A Student Research Workshop will be held in parallel with the main conference. The Student Research Workshops (now the 9th edition) have become active discussion fora for young researchers.
TOPICS
We invite papers reporting recent advances in all aspects of Natural Language Processing and particularly encourage submissions related to (and the employment of) the latest NLP methods including Large Language Models/Generative AI. Contributions from a broad range of areas will be welcome including, but not limited to, the following topics: phonetics, phonology, morphology; syntax, semantics, discourse, pragmatics, dialogue, lexicon; complexity; mathematical, statistical, machine learning and deep learning models; language resources and corpora; crowdsourcing for creation of linguistic resources; electronic dictionaries, terminologies and ontologies; sublanguages and controlled languages; linked data; POS tagging; parsing; semantic role labelling; word-sense disambiguation; multiword expressions and computational phraseology; textual entailment; anaphora resolution; temporal processing; language generation; speech recognition; text-to-speech synthesis; multilingual NLP; machine translation, translation memory systems and computer-aided translation tools; text simplification and readability estimation; knowledge acquisition; information retrieval; text categorisation; information extraction; text summarisation; terminology extraction; question answering; opinion mining and sentiment analysis; fact checking and fake news; stance recognition; hate speech and aggression detection; author profiling; dialogue systems; chatbots and conversational agents; irony and sarcasm detection; negation and speculation detection; computer-aided language learning; multimodal systems; language and vision; NLP for biomedical texts; NLP for educational applications; NLP for healthcare; NLP for financial purposes; NLP for legal texts; NLP for the Semantic Web; theoretical and application-orientated papers related to NLP.
CHAIR OF THE PROGRAMME COMMITTEE
Ruslan Mitkov (University of Lancaster)
CHAIR OF THE ORGANISING COMMITTEE
Galia Angelova (Bulgarian Academy of Sciences)
The Programme Committee (PC) members are distinguished NLP experts from all over the world. The list of PC members will be announced at the conference website in due time.
Keynote speakers, tutorial presenters, and summer school lecturers and tutors will be announced in the upcoming calls for papers.
WORKSHOPS and SHARED TASKS:
The RANLP 2025 workshops and shared tasks will be held on 11-13 September 2025. Calls for workshop and shared task proposals have already been published.
SUBMISSION OF PAPERS, POSTERS, DEMOS
The submissions will be maintained by the conference management software START. For further instructions, please follow the submission information at the conference website at https://ranlp.org/ranlp2025/. The reviewing process will be anonymous. Double submission is acceptable, but authors will be asked to declare it at the time of submission. Submissions will be reviewed by at least three members of the Programme Committee. Authors of accepted papers will receive guidelines regarding how to produce camera-ready versions of their papers for inclusion in the proceedings. All RANLP papers have DOI numbers assigned. The full conference proceedings will be uploaded on the ACL Anthology.
RANLP-2025 aims to provide early notification of acceptance to authors and presenters who need a visa to enter Bulgaria. We invite early submissions of authors’ names and paper abstracts, in order to plan quick reviewing. Access to the conference management software will be available from 1 April 2025.
IMPORTANT DATES
Call for Shared Tasks proposals: September 2024
Shared Tasks selection notification: 4 November 2024
Shared Tasks sample data and task website ready: 15 November 2024
Shared Tasks training data ready: 15 December 2024
Call for workshop proposals: 24 December 2024
Deadline for submission of workshop proposals: 15 March 2025
Workshop selection: 22 March 2025
Conference abstracts submission: April 2025
Conference papers submission: early/mid May 2025 (please check exact dates on RANLP 2025 website)
Conference papers acceptance notification: 28 June 2025
Camera-ready versions of the conference papers: 31 July 2025
Workshop paper submission deadline (suggested): 30 June 2025
Workshop paper acceptance notification (suggested): 28 July 2025
Workshop paper camera-ready versions (suggested): 20 August 2025
Workshop camera-ready proceedings ready (suggested): 31 August 2025
RANLP Summer School on Deep Learning in NLP: 3-5 September 2025
RANLP tutorials: 6-7 September 2025 (Saturday-Sunday)
RANLP conference: 8-10 September 2025 (Monday-Wednesday)
RANLP workshops and Shared Tasks presentations: 11-13 September 2025 (Thursday-Saturday)
VENUE
RANLP 2025 will be held at the conference facilities of Hotel “Cherno More” (http://www.chernomorebg.com) in Varna, the largest city on the Bulgarian Black Sea Coast. The event venue is centrally located at the entrance of the Sea Garden and offers excellent conference facilities. The city is a major tourist destination with flights to/from the Varna International Airport. It is also known for its Archaeological Museum, which features the oldest gold treasure in the world (https://en.wikipedia.org/wiki/Varna_Necropolis). The conference organisers plan to organise an excursion to Provadia, the oldest salt-production and urban centre in Europe (5600 - 4350 BC, https://provadia-solnitsata.com/en/), which is located 50 km from Varna.
THE TEAM BEHIND RANLP-25
Galia Angelova, Bulgarian Academy of Sciences, Bulgaria (Chair Organising Committee)
Ruslan Mitkov, University of Lancaster, UK (Chair Programme Committee)
Nikolai Nikolov, Bulgarian Association for Computational Linguistics, Bulgaria
Tharindu Ranasinghe, Lancaster University, UK (Workshops Chair and Shared tasks Co-Chair)
Saad Ezzini, Lancaster University, UK (Sponsorship Chair and Shared tasks Co-Chair)
Maria Kunilovskaya, Saarland University, Germany (Publication Chair)
Preslav Nakov, MBZUAI, Abu Dhabi, UAE
Ivelina Nikolova, Bulgarian Academy of Sciences, Bulgaria
Kiril Simov, Bulgarian Academy of Sciences, Bulgaria (Workshops Co-Chair)
Petya Osenova, Bulgarian Academy of Sciences, Bulgaria (Workshops Co-Chair)
Call for Workshop Proposals
================================================
RANLP-2025: 15th Conference on
Recent Advances in Natural Language Processing
Summer School DLinNLP 3-5 September 2025 (Wednesday-Friday)
Tutorials 6-7 September 2025 (Saturday-Sunday)
Main conference: 8-10 September 2025 (Monday-Wednesday)
Workshops and Shared Tasks: 11-13 September 2025 (Thursday-Saturday)
Varna, Bulgaria
https://ranlp.org/ranlp2025/
================================================
Following the workshops held in conjunction with the Conferences "Recent Advances in Natural Language Processing" RANLP-2005, RANLP-2007, RANLP-2009, RANLP-2011, RANLP-2013, RANLP-2015, RANLP-2017, RANLP-2019, RANLP-2021 and RANLP-2023, we are pleased to announce a call for workshop proposals for RANLP-2025.
RANLP-2025 invites workshop proposals on any topic of interest to the Natural Language Processing (NLP) community, ranging from fundamental research issues to more applied industrial or commercial aspects. We encourage workshops related to (or discussing the employment of) the latest NLP methods including Large Language Models/Generative AI. Workshops can vary in length from half a day to one or two full days and can also feature demo sessions. The format of each workshop (face-to-face or hybrid) can be determined by its organisers, on the condition that onsite sessions are held in Varna for the whole workshop duration so that other RANLP participants can take part in the event. Accepted workshops will receive one free registration to RANLP-2025 (full registration including the summer school, tutorials, all workshops, main conference, reception, conference dinner).
VENUE
The workshops will take place in Hotel "Cherno More", Varna, the main RANLP-2025 conference venue. If more than 5 workshops are selected, the RANLP-2025 organisers will provide conference halls in some of the neighbouring hotels or universities in downtown Varna.
IMPORTANT DATES
Workshop proposals due: 15 March 2025
Workshop selection: 22 March 2025
Workshop website due: 5 April 2025
Workshop paper submission deadline (suggested): 30 June 2025, immediately after RANLP notification
Workshop paper acceptance notification (suggested): 28 July 2025
Workshop paper camera-ready versions (suggested): 20 August 2025
Workshop camera-ready proceedings ready (suggested): 31 August 2025
Workshops: 11-13 September 2025
REQUIREMENTS
Proposals should be no longer than five pages and should contain the following:
1. Title and brief technical description of the workshop, specifying the goals and the technical issues that it will focus on;
2. Brief description of the target audience, including estimates of the numbers of submissions and attendees (a tentative list of potential contributors would be useful);
3. List of related workshops/events held in the last three years or to be held in 2025;
4. Tentative workshop program committee;
5. Names and contact information (web page, email address) of the proposed organising committee;
6. Description of the experience of the proposed organisers in the workshop topics and in organising workshops or related events.
The workshop Organising Committee is responsible for the following:
* Setting up and maintaining the workshop website;
* Disseminating call for papers/participation;
* Organising paper submission, review process, authors notification, and collecting audio/visual presentation requirements;
* Verifying the camera-ready copies, providing electronic conference proceedings which are to be generated with the conference management system START;
* In case of hybrid workshops, organising an onsite workshop component and chairing the live sessions in Varna.
Workshop invited speakers: If the workshop organisers intend to host an invited talk, it is recommended that they invite somebody from the main conference keynote speakers or participants. If the workshop organisers decide to invite another speaker, it is very likely that the workshop organisers will have to secure financial support for this speaker.
The RANLP-2025 Organising Committee is responsible for the following:
* Providing a link to the workshop web page;
* Publishing the workshop proceedings with ISBN numbers, and registering DOI numbers for all accepted papers;
* Providing the workshop venue;
* Organising registration, audio/visual support, coffee breaks, registration facilities, Internet access.
WORKSHOP PROPOSAL SUBMISSION
Workshop proposals in PDF format should be e-mailed to Tharindu Ranasinghe <t.ranasinghe[at]lancaster[dot]ac[dot]uk>, Kiril Simov <kivs[at]bultreebank[dot]org>, Petya Osenova <petya[at]bultreebank[dot]org> and cc'ed to <workshops2025(a)ranlp.org>.
EVALUATION
Submitted proposals will be reviewed with respect to the following criteria:
* Relevance, importance, and timeliness of the topics;
* Completeness, clarity, and quality of the workshop proposal;
* Experience of the organisers in the proposed topics;
* Viability of the workshop.
[Apologies for cross-postings]
********************************************************************************
First Call for Papers
21st Workshop on Multiword Expressions (MWE 2025)
Organized, sponsored and endorsed by SIGLEX, the Special Interest Group on
the Lexicon of the ACL
Full-day workshop collocated with NAACL 2025, Albuquerque, New Mexico,
U.S.A., May 3 or 4, 2025
Hybrid (on-site & on-line)
Submission deadline: January 30, 2025
MWE 2025 website: https://multiword.org/mwe2025/
********************************************************************************
Multiword expressions (MWEs), i.e., word combinations that exhibit lexical,
syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (Baldwin
and Kim, 2010), such as “by and large”, “hot dog”, “make a decision” and
“break one's leg” are still a pain in the neck for Natural Language
Processing (NLP). The notion encompasses closely related phenomena: idioms,
compounds, light-verb constructions, phrasal verbs, rhetorical figures,
collocations, institutionalized phrases, etc. Given their irregular nature,
MWEs often pose complex problems in linguistic modeling (e.g. annotation),
NLP tasks (e.g. parsing), and end-user applications (e.g. natural language
understanding and Machine Translation), hence still representing an open
issue for computational linguistics (Constant et al., 2017).
For more than two decades, modelling and processing MWEs for NLP has been
the topic of the MWE workshop organised by the MWE section
<https://multiword.org/> of ACL-SIGLEX <http://www.siglex.org/> in
conjunction with major NLP conferences since 2003. Impressive progress has
been made in the field, but our understanding of MWEs still requires much
research considering their need and usefulness in NLP applications. This is
also relevant to domain-specific NLP pipelines that need to tackle
terminologies most often realised as MWEs. Following previous years, for
this 21st edition of the workshop, we identified the following topics on
which contributions are particularly encouraged:
- MWE processing to enhance end-user applications: MWEs have gained particular
  attention in end-user applications, including Machine Translation (MT)
  (Zaninello and Birch, 2020), simplification (Kochmar et al., 2020),
  language learning and assessment (Paquot et al., 2020), social media mining
  (Pelosi et al., 2017), and abusive language detection (Zampieri et al.,
  2020). We believe that it is crucial to extend and deepen these first
  attempts to integrate and evaluate MWE technology in these and further
  end-user applications.
- MWE processing and identification in the general language, as well as in
  specialized languages and domains: Multiword terminology extraction from
  domain-specific corpora (Lossio-Ventura et al., 2014) is of particular
  importance to various applications, such as MT (Semmar and Laib, 2017), or
  for the identification and monitoring of neologisms and technical jargon
  (Chatzitheodorou and Kappatos, 2021).
- MWE processing in low-resource languages: The PARSEME shared tasks (2017
  <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_05_MWE_2…>,
  2018
  <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_04_LAW-M…>,
  2020
  <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_02_MWE-L…>)
  among others, have fostered significant progress in MWE identification,
  providing datasets that include low-resource languages, evaluation
  measures, and tools that now allow fully integrating MWE identification
  into end-user applications. There are continuous efforts in this direction
  (Diaz Hernandez, 2024) and a few of them have also explored methods for the
  automatic interpretation of MWEs (Bhatia et al., 2018), and their
  processing in low-resource languages (Eder et al., 2021). Resource creation
  and sharing should be pursued in parallel with the development of
  multilingual benchmarks for MWE identification (Savary et al., 2023).
- MWE identification and interpretation in LLMs: Most current MWE
  processing is limited to their identification and detection using
  pre-trained language models, but we still lack understanding about how MWEs
  are represented and dealt with therein (Garcia et al., 2021), and how to
  better model the compositionality of MWEs from semantics (Phelps et al.,
  2024). Now that NLP has shifted towards end-to-end neural models like BERT,
  capable of solving complex tasks with little or no intermediary linguistic
  symbols, questions arise about the extent to which MWEs should be
  implicitly or explicitly modelled (Shwartz and Dagan, 2019).
- New and enhanced representation of MWEs in language resources and
  computational models of compositionality as gold standards for formative
  intrinsic evaluation.
Through this workshop, we will bring together and encourage researchers in
various NLP subfields to submit their MWE-related research. We also intend
to consolidate the converging results of previous joint workshops LAW-MWE-CxG
2018 <http://multiword.sourceforge.net/lawmwecxg2018/>, MWE-WN 2019
<http://multiword.sourceforge.net/mwewn2019/> and MWE-LEX 2020
<http://multiword.sourceforge.net/mwelex2020/>, the joint MWE-WOAH panel in
2021 <https://multiword.org/mwe2021/#program>, the MWE-SIGUL 2022 joint
session <https://multiword.org/mwe2022/>, and the MWE-UD 2024
<https://multiword.org/mweud2024/>, extending our scope to MWEs in
e-lexicons, and WordNets, MWE annotation, as well as grammatical
constructions. Correspondingly, we call for papers on research related (but
not limited) to MWEs and constructions in:
- Computationally-applicable theoretical work in psycholinguistics and
  corpus linguistics;
- Annotation (expert, crowdsourcing, automatic) and representation in
  resources such as corpora, treebanks, e-lexicons, WordNets, constructions
  (also for low-resource languages);
- Processing in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG,
  LFG, TAG, UD, etc.);
- Discovery and identification methods, including for specialized
  languages and domains such as clinical or biomedical NLP;
- Interpretation of MWEs and understanding of text containing them;
- Language acquisition, language learning, and non-standard language (e.g.
  tweets, speech);
- Evaluation of annotation and processing techniques;
- Retrospective comparative analyses from the PARSEME shared tasks;
- Processing for end-user applications (e.g. MT, NLU, summarisation,
  language learning, etc.);
- Implicit and explicit representation in pre-trained language models and
  end-user applications;
- Evaluation and probing of pre-trained language models;
- Resources and tools (e.g. lexicons, identifiers) and their integration
  into end-user applications;
- Multiword terminology extraction;
- Adaptation and transfer of annotations and related resources to new
  languages and domains including low-resource ones.
Submission formats:
The workshop invites two types of submissions:
- archival submissions that present substantially original research in
  both long paper format (8 pages + references) and short paper format (4
  pages + references).
- non-archival submissions of abstracts describing relevant research
  presented/published elsewhere which will not be included in the MWE
  proceedings.
Paper submission and templates
Papers should be submitted via the workshop's submission page
<https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/MWE>. Please
choose the appropriate submission format (archival/non-archival). Archival
papers with existing reviews will also be accepted through the ACL Rolling
Review. Submissions must follow the ACL stylesheet
<https://github.com/acl-org/acl-style-files>.
Important Dates
Paper Submission Deadline: January 30, 2025
Notification of acceptance: March 1, 2025
Camera-ready papers due: March 10, 2025
Workshop: May 3 or 4, 2025
All deadlines are at 23:59 UTC-12 (Anywhere on Earth).
Organizing Committee
Verginica Barbu Mititelu, Voula Giouli, Grazina Korvel, A. Seza Doğruöz,
Alexandre Rademaker, Atul Kr. Ojha, Mathieu Constant
Anti-harassment policy
The workshop follows the ACL anti-harassment policy
<https://www.aclweb.org/adminwiki/index.php?title=Anti-Harassment_Policy>.
Contact
For any inquiries regarding the workshop, please send an email to the
Organizing Committee at mweworkshop2023(a)googlegroups.com.
Dear colleagues, (apologies for cross-posting)
Long-form (also called daylong) recordings (LFR) are increasingly used in a
range of fields, for instance to document language input and outcomes in
under-described populations (e.g., Casillas et al., 2020) and to assess
potential effects of early childhood interventions (e.g., Weber et al.,
2017).
We are happy to announce two exciting events related to long-form
recordings (LFR) that will take place in person at PSL University/Ecole
Normale Supérieure in Paris. The first is the LFR Interdisciplinary Summit
(lfris2025.sciencesconf.org) on June 19-20, 2025, which will explore
cutting-edge innovations in long-form recordings with talks by leading
researchers. You can find more information about this event here
<https://lfris2025.sciencesconf.org/?forward-action=index&forward-controller…>.
Registration for that event will open in March and close in May.
Today, we want to especially draw your attention to the LFRAZ Summer School
(Long-form Recordings from A to Z; lfraz2025.sciencesconf.org), which will
take place June 16-19, 2025. This hands-on summer school aims to provide
attendees who are new to the method with all the tools they need to
collect and analyze LFRs. The mornings will feature lectures and
roundtables with leading experts, while afternoons will provide
opportunities for individual and group projects, as well as office hours
for tailored support. Here's what attendees can hope for:
- Comprehensive Training: From data collection to modeling you’ll gain
  practical skills to integrate long-form recordings into your research.
- Networking Opportunities: The event brings together researchers from
  diverse fields, including linguistics, anthropology, economics, and
  developmental science.
- Automatic Speech Annotations: Learn to use open-source tools and
  hardware for analyzing speech data in culturally diverse contexts.
We are offering a limited number of travel and accommodation grants for
individuals working outside North America and Europe.
To learn more about the school, visit https://lfraz2025.sciencesconf.org/.
To apply, fill out the form available here
<https://docs.google.com/forms/d/e/1FAIpQLSdbnxhRibXKazWQSnkEzjo0ICI9G_4whBB…>,
which takes roughly 15 minutes to complete. We recommend preparing your
answers in advance. The full list of questions is available here
<https://drive.google.com/file/d/17km0_R7O4-49icR7hanxGiM5q0nkIoC5/view?usp=…>.
The application deadline is the 15th of January.
If you can't make it to Paris in person, we recommend that you still apply:
we expect similar schools (Global LFRAZ) to be organized in person
and/or online, and we can keep you posted on those. If you are
interested in being part of the Global LFRAZ, more information is
available here
<https://lfraz2025.sciencesconf.org/page/global_lfraz?lang=en>.
Please share this information with interested parties!
---------------------------------------------------------------
Alex (Alejandrina) Cristia
Researcher, CNRS
Laboratoire de Sciences Cognitives et Psycholinguistique
29, rue d'Ulm, 75005, Paris, FRANCE
My site: www.acristia.org
---------------------------------------------------------------
Dear list members,
We invite you to participate in our web survey exploring how recent advancements in NLP, such as LLMs, have changed the need for labeled data in Supervised Machine Learning.
Survey details:
* Topic: Web survey on Data Annotation and Active Learning
* Target group: Researchers and practitioners in the fields of NLP and Supervised Machine Learning; experience with Active Learning is particularly welcome but not required.
* Duration: ~15 minutes
* Deadline for participation: January 12, 2025
* Survey link: https://bildungsportal.sachsen.de/umfragen/limesurvey/index.php/538271
Why should I invest my time in this survey?
* Make an impact: Participate in a community effort and help build a better understanding of the current state of, and open issues with, methods used to overcome a lack of labeled data.
* Gain insights: Receive a report with key findings to incorporate these insights into research and development of new methods and technologies.
Thank you for considering participating in our survey!
If you have any questions or require additional information, please don't hesitate to contact us directly at activelearningsurvey2024(a)gmail.com.
If you know colleagues or peers who might be interested, we'd be grateful if you could forward this survey to them as well.
Best regards,
Julia Romberg (GESIS - Leibniz Institute for the Social Sciences, Germany)
Christopher Schröder (Institut für Angewandte Informatik e. V., Germany)
Julius Gonsior (TUD Dresden University of Technology)
------------------------------------------------------------------------
Leibniz Institute for the Social Sciences
Julia Romberg
Computational Social Science, Team Data Science Methods
+49(221)47694-742
Neural language models have revolutionised natural language processing (NLP) and have provided state-of-the-art results for many tasks. However, their effectiveness is largely dependent on the pre-training resources. Therefore, language models (LMs) often struggle with low-resource languages in both training and evaluation. Recently, there has been a growing trend in developing and adopting LMs for low-resource languages. LoResLM aims to provide a forum for researchers to share and discuss their ongoing work on LMs for low-resource languages.
LoResLM 2025 will be a physical workshop co-located with COLING 2025 in Abu Dhabi on 20th January 2025.
We are pleased to share the programme of LoResLM 2025 with you. Please visit https://loreslm.github.io/program for the full programme.
To register for the workshop, please visit https://coling2025.org/registration/
We are looking forward to welcoming you at LoResLM 2025 in Abu Dhabi.
The workshop is supported in part by CLARIN-UK, funded by the Arts and Humanities Research Council as part of the Infrastructure for Digital Arts and Humanities programme.
>> Keynote Speaker
Jose Camacho-Collados, Cardiff University.
>> Organising Committee
Hansi Hettiarachchi, Lancaster University, UK
Tharindu Ranasinghe, Lancaster University, UK
Paul Rayson, Lancaster University, UK
Ruslan Mitkov, Lancaster University, UK
Mohamed Gaber, Birmingham City University, UK
Damith Premasiri, Lancaster University, UK
Fiona Anting Tan, National University of Singapore, Singapore
Lasitha Uyangodage, University of Münster, Germany
>> Programme Committee
Gábor Bella - IMT Atlantique, France
Samuel Cahyawijaya - The Hong Kong University of Science and Technology, Hong Kong
Burcu Can - University of Stirling, UK
Çağrı Çöltekin - University of Tübingen, Germany
Raj Dabre - National Institute of Information and Communications Technology, Japan
Vera Danilova - Uppsala University, Sweden
Debashish Das - Birmingham City University, UK
Ona de Gibert - University of Helsinki, Finland
Alphaeus Dmonte - George Mason University, USA
Bonaventure F. P. Dossou - McGill University, Canada
Daan van Esch - Google
Ignatius Ezeani - Lancaster University, UK
Anna Furtado - University of Galway, Ireland
Amal Htait - Aston University, UK
Ali Hürriyetoğlu - Wageningen University & Research, Netherlands
Danka Jokic - University of Belgrade, Serbia
Diptesh Kanojia - University of Surrey, UK
Daisy Lal - Lancaster University, UK
Colin Leong - University of Dayton, USA
Veronika Lipp - Hungarian Research Centre for Linguistics, Hungary
Muhidin Mohamed - Aston University, UK
Farhad Nooralahzadeh - University of Zurich, Switzerland
Rrubaa Panchendrarajan - Queen Mary University of London, UK
Nadeesha Pathirana - Aston University, UK
Alistair Plum - University of Luxembourg, Luxembourg
Nishat Raihan - George Mason University, USA
Omid Rohanian - University of Oxford, UK
Sandaru Seneviratne - Australian National University, Australia
Ravi Shekhar - University of Essex, UK
Archchana Sindhujan - University of Surrey, UK
Claytone Sikasote - University of Cape Town, South Africa
Marjana Prifti Skenduli - University of New York Tirana, Albania
Uthayasanker Thayasivam - University of Moratuwa, Sri Lanka
Taro Watanabe - Nara Institute of Science and Technology, Japan
John Vidler - Lancaster University, UK
Phil Weber - Aston University, UK
Bryan Wilie - Hong Kong University of Science & Technology, Hong Kong
Artūrs Znotiņš - University of Latvia, Latvia
URL - https://loreslm.github.io/
Twitter - https://x.com/LoResLM2025
Dr Tharindu Ranasinghe
School of Computing and Communications | Lancaster University
www.lancaster.ac.uk
Dear colleagues,
ELAR is excited to share the news that the *Endangered Languages
Documentation Programme* is offering an online training series in Language
Documentation and Archiving from March 6 to June 12, 2025. Applications to
participate in the training series are due 30 January 2025.
Please see the call below for more information. Please help this call reach
a broader audience by sharing it with your students, colleagues, and others
who may be interested in the training.
Best wishes,
The ELAR Team
---------------------------------------------------------------------------------------------------------
Online Training Series in Language Documentation and Archiving
6 March – 12 June 2025
The Endangered Languages Documentation Programme (ELDP) is offering a
series of online trainings in Language Documentation and Archiving from *March
6 to June 12, 2025*. Training participants will meet weekly on Thursdays,
live via Zoom, for a webinar and discussion session. They will be expected
to complete readings, hands-on practice, and online assessments between
sessions. Live attendance at all sessions and the completion of all
assignments are required.
Below are the topics that will be covered in the training series:
· Linguistic diversity and language endangerment
· Language Documentation theory & methods
· Understanding archival collections
· Compiling a documentary collection
· Audio and video recording methods
· Transcription, translation, and annotation with ELAN
· Lexicography and dictionary creation with Fieldworks Language Explorer
(FLEx)
· Metadata creation and managing data
· Project planning and design
· Grant writing for language documentation projects
The online sessions will take place from 9:00 to 11:00 CET. Readings,
hands-on practice, and homework assignments will be made available via a
free course website. The language of instruction is English.
The training series has 25 spots available. Applicants planning to work
with endangered and under-documented languages (see Hammarström 2019
<https://elararchive.org/blog/2019/12/17/which-language-should-i-document-so…>),
especially Papuan languages, are strongly encouraged to apply. Applicants
should meet the criteria listed below:
· Have plans to document an endangered and under-documented language
· Be able to attend all webinar sessions and complete readings and
assignments
· Have a sufficient level of spoken and written English to be able to
complete assignments
· Have regular access to a Windows computer and a reliable internet
connection
To apply and for more information, please go here
<https://www.eldp.net/en/our+trainings/online+training+series/>. The
deadline is January 30th, 2025.
---------------------------------------------------------------------------------------------------------
--
*Interested in keeping up with ELAR? Subscribe to our new mailing list*
<https://www.listserv.dfn.de/sympa/subscribe/elar-news>*!*
*Endangered Languages Archive*
Berlin-Brandenburg Academy of Sciences and Humanities
Jägerstraße 22/23
10117 Berlin, Germany
Website: https://elararchive.org/
Facebook: https://www.facebook.com/elararchive/
Instagram: https://www.instagram.com/elararchive/
Twitter: @ELARarchive <https://www.twitter.com/elararchive/>
Blog: https://elararchive.org/blog
Vimeo: https://vimeo.com/user64477333/albums
***Apologies for possible cross-posting ***
CALL FOR PAPERS DEADLINE EXTENSION
We are pleased to announce that the submission deadline for the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL) has been extended from December 16th to December 23rd, 2024. The workshop will be held on March 2, 2025, as part of the NoDaLiDa/Baltic-HLT 2025 conference in Tallinn, Estonia.
About the Workshop
This half-day workshop focuses on the responsible evaluation and alignment of Large Language Models (LLMs) for Nordic and Baltic languages. Our goal is to bring together researchers, practitioners, and stakeholders to address the unique challenges and opportunities in this rapidly evolving field.
Topics of Interest
We welcome submissions on topics including, but not limited to:
- Ethical benchmarks for evaluating LLMs in Nordic and Baltic
languages
- Methods for creating culturally sensitive and inclusive evaluation
datasets
- Responsible techniques for generating or collecting alignment data
- Challenges and solutions in ethical LLM alignment for less-resourced
languages
- Case studies on responsible LLM evaluation or alignment projects
- Ethical considerations in LLM evaluation and alignment
- Comparative studies of LLM performance and fairness in Nordic and
Baltic languages
- Innovative approaches to leveraging limited language resources in
evaluation or alignment of language models
Important Dates
Paper Submission Deadline: December 23, 2024 (extended from December 16, 2024)
Notification of Acceptance: January 13, 2025
Camera-Ready Deadline: February 3, 2025
Workshop Date: March 2, 2025
Workshop Format
NB-REAL 2025 will be a half-day workshop held on March 2, 2025 (pre-conference). It will be a hybrid event with both on-site and online participation available.
Submission
Submissions can be long papers (8 pages) or short papers (4 pages). All submissions must follow the NoDaLiDa template, available in both LaTeX and MS Word. The templates are available on the official conference website; see https://www.nodalida-bhlt2025.eu/call-for-papers#h.v2k63awq0fpe. All submissions will undergo peer review by the program committee. To submit your paper, please visit the NB-REAL 2025 Workshop page on OpenReview <https://openreview.net/group?id=NoDaLiDa/Baltic-HLT/2025/Workshop/NB-REAL#t…>.
Organizers
Hafsteinn Einarsson, Associate Professor in Computer Science, University of Iceland (hafsteinne(a)hi.is)
Annika Simonsen, PhD Student, University of Iceland (annika(a)hi.is)
Dan Saattrup Nielsen, Senior AI Specialist, Alexandra Institute (dan.nielsen(a)alexandra.dk)
For more information, please visit our website: https://nbreal.xyz/
We look forward to your contributions and to seeing you at NB-REAL
2025!