Dear colleagues, (apologies for cross-posting)
Long-form (also called daylong) recordings (LFR) are increasingly used in a
range of fields, for example to document language input and outcomes in
under-described populations (e.g., Casillas et al., 2020) and to assess
potential effects of early childhood interventions (e.g., Weber et al.,
2017).
We are happy to announce two exciting events related to long-form
recordings (LFR) that will take place in person at PSL University/Ecole
Normale Supérieure in Paris. The LFR Interdisciplinary Summit
(lfris2025.sciencesconf.org), on June 19-20, 2025, will explore cutting-edge
innovations in long-form recordings with talks by leading researchers. You
can find more information about this event here
<https://lfris2025.sciencesconf.org/?forward-action=index&forward-controller…>.
Registration for that event will open in March and close in May.
Today, we want to especially draw your attention to the LFRAZ Summer School
(Long-form Recordings from A to Z; lfraz2025.sciencesconf.org), which will
take place June 16-19, 2025. This hands-on summer school aims to provide
attendees who are new to the method with all the tools they need to
collect and analyze LFRs. The mornings will feature lectures and
roundtables with leading experts, while afternoons will provide
opportunities for individual and group projects, as well as office hours
for tailored support. Here's what attendees can expect:
- Comprehensive Training: From data collection to modeling you’ll gain
  practical skills to integrate long-form recordings into your research.
- Networking Opportunities: The event brings together researchers from
  diverse fields, including linguistics, anthropology, economics, and
  developmental science.
- Automatic Speech Annotations: Learn to use open-source tools and
  hardware for analyzing speech data in culturally diverse contexts.
We are offering a limited number of travel and accommodation grants for
individuals working outside North America and Europe.
To learn more about the school, visit https://lfraz2025.sciencesconf.org/.
To apply, fill out the form available here
<https://docs.google.com/forms/d/e/1FAIpQLSdbnxhRibXKazWQSnkEzjo0ICI9G_4whBB…>,
which takes roughly 15 minutes to complete. We recommend preparing your
answers in advance. The full list of questions is available here
<https://drive.google.com/file/d/17km0_R7O4-49icR7hanxGiM5q0nkIoC5/view?usp=…>.
The application deadline is the 15th of January.
If you can't make it to Paris in person, we recommend that you still apply:
we expect similar schools (Global LFRAZ) to be organized in person and/or
online, and we can keep you posted on those. If you are interested in being
part of the Global LFRAZ, more information is available here
<https://lfraz2025.sciencesconf.org/page/global_lfraz?lang=en>.
Please share this information with interested parties!
---------------------------------------------------------------
Alex (Alejandrina) Cristia
Researcher, CNRS
Laboratoire de Sciences Cognitives et Psycholinguistique
29, rue d'Ulm, 75005, Paris, FRANCE
My site: www.acristia.org
---------------------------------------------------------------
If you donate, ask me about effective charities
<https://effectivealtruism.us8.list-manage.com/track/click?u=52b028e7f799cca…>.
/ Si vous faites des dons, posez-moi des questions sur le don efficace
<https://www.altruismeefficacefrance.org/donner-efficacement>.
Dear list members,
We invite you to participate in our web survey exploring how recent advancements in NLP, such as LLMs, have changed the need for labeled data in Supervised Machine Learning.
Survey details:
* Topic: Web survey on Data Annotation and Active Learning
* Target group: Researchers and practitioners in NLP and Supervised Machine Learning; experience with Active Learning in particular is welcome but not required.
* Duration: ~15 minutes
* Deadline for participation: January 12, 2025
* Survey link: https://bildungsportal.sachsen.de/umfragen/limesurvey/index.php/538271
Why should I invest my time in this survey?
* Make an impact: Participate in a community effort and help build a better understanding of the current state of, and open issues with, methods used to overcome a lack of labeled data.
* Gain insights: Receive a report with key findings that you can incorporate into research and development of new methods and technologies.
Thank you for considering participating in our survey!
If you have any questions or require additional information, please don't hesitate to contact us directly at activelearningsurvey2024(a)gmail.com.
If you know colleagues or peers who might be interested, we'd be grateful if you could forward this survey to them as well.
Best regards,
Julia Romberg (GESIS - Leibniz Institute for the Social Sciences, Germany)
Christopher Schröder (Institut für Angewandte Informatik e. V., Germany)
Julius Gonsior (TUD Dresden University of Technology)
------------------------------------------------------------------------
Leibniz Institute for the Social Sciences
Julia Romberg
Computational Social Science, Team Data Science Methods
+49(221)47694-742
Neural language models have revolutionised natural language processing (NLP) and have provided state-of-the-art results for many tasks. However, their effectiveness is largely dependent on the pre-training resources. Therefore, language models (LMs) often struggle with low-resource languages in both training and evaluation. Recently, there has been a growing trend in developing and adopting LMs for low-resource languages. LoResLM aims to provide a forum for researchers to share and discuss their ongoing work on LMs for low-resource languages.
LoResLM 2025 will be an in-person workshop co-located with COLING 2025 in Abu Dhabi on 20th January 2025.
We are pleased to share the programme of LoResLM 2025 with you. Please visit https://loreslm.github.io/program for the full programme.
To register for the workshop, please visit https://coling2025.org/registration/
We are looking forward to welcoming you at LoResLM 2025 in Abu Dhabi.
The workshop is supported in part by CLARIN-UK, funded by the Arts and Humanities Research Council as part of the Infrastructure for Digital Arts and Humanities programme.
>> Keynote Speaker
Jose Camacho-Collados, Cardiff University.
>> Organising Committee
Hansi Hettiarachchi, Lancaster University, UK
Tharindu Ranasinghe, Lancaster University, UK
Paul Rayson, Lancaster University, UK
Ruslan Mitkov, Lancaster University, UK
Mohamed Gaber, Birmingham City University, UK
Damith Premasiri, Lancaster University, UK
Fiona Anting Tan, National University of Singapore, Singapore
Lasitha Uyangodage, University of Münster, Germany
>> Programme Committee
Gábor Bella - IMT Atlantique, France
Samuel Cahyawijaya - The Hong Kong University of Science and Technology, Hong Kong
Burcu Can - University of Stirling, UK
Çağrı Çöltekin - University of Tübingen, Germany
Raj Dabre - National Institute of Information and Communications Technology, Japan
Vera Danilova - Uppsala University, Sweden
Debashish Das - Birmingham City University, UK
Ona de Gibert - University of Helsinki, Finland
Alphaeus Dmonte - George Mason University, USA
Bonaventure F. P. Dossou - McGill University, Canada
Daan van Esch - Google
Ignatius Ezeani - Lancaster University, UK
Anna Furtado - University of Galway, Ireland
Amal Htait - Aston University, UK
Ali Hürriyetoğlu - Wageningen University & Research, Netherlands
Danka Jokic - University of Belgrade, Serbia
Diptesh Kanojia - University of Surrey, UK
Daisy Lal - Lancaster University, UK
Colin Leong - University of Dayton, USA
Veronika Lipp - Hungarian Research Centre for Linguistics, Hungary
Muhidin Mohamed - Aston University, UK
Farhad Nooralahzadeh - University of Zurich, Switzerland
Rrubaa Panchendrarajan - Queen Mary University of London, UK
Nadeesha Pathirana - Aston University, UK
Alistair Plum - University of Luxembourg, Luxembourg
Nishat Raihan - George Mason University, USA
Omid Rohanian - University of Oxford, UK
Sandaru Seneviratne - Australian National University, Australia
Ravi Shekhar - University of Essex, UK
Archchana Sindhujan - University of Surrey, UK
Claytone Sikasote - University of Cape Town, South Africa
Marjana Prifti Skenduli - University of New York Tirana, Albania
Uthayasanker Thayasivam - University of Moratuwa, Sri Lanka
Taro Watanabe - Nara Institute of Science and Technology, Japan
John Vidler - Lancaster University, UK
Phil Weber - Aston University, UK
Bryan Wilie - Hong Kong University of Science & Technology, Hong Kong
Artūrs Znotiņš - University of Latvia, Latvia
URL - https://loreslm.github.io/
Twitter - https://x.com/LoResLM2025
Dr Tharindu Ranasinghe
School of Computing and Communications | Lancaster University
Contact me on Teams <https://teams.microsoft.com/l/chat/0/0?users=t.ranasinghe@lancaster.ac.uk>
www.lancaster.ac.uk
Dear colleagues,
ELAR is excited to share the news that the *Endangered Languages
Documentation Programme* is offering an online training series in Language
Documentation and Archiving from March 6 to June 12, 2025. Applications to
participate in the training series are due 30 January 2025.
Please see the call below for more information. Please help this call reach
a broader audience by sharing it with your students, colleagues, and others
who may be interested in the training.
Best wishes,
The ELAR Team
---------------------------------------------------------------------------------------------------------
Online Training Series in Language Documentation and Archiving
6 March – 12 June 2025
The Endangered Languages Documentation Programme (ELDP) is offering a
series of online trainings in Language Documentation and Archiving from *March
6 to June 12, 2025*. Training participants will meet weekly on Thursdays,
live via Zoom, for a webinar and discussion session. They will be expected
to complete readings, hands-on practice, and online assessments between
sessions. Live attendance at all sessions and the completion of all
assignments are required.
Below are the topics that will be covered in the training series:
· Linguistic diversity and language endangerment
· Language Documentation theory & methods
· Understanding archival collections
· Compiling a documentary collection
· Audio and video recording methods
· Transcription, translation, and annotation with ELAN
· Lexicography and dictionary creation with Fieldworks Language Explorer
(FLEx)
· Metadata creation and managing data
· Project planning and design
· Grant writing for language documentation projects
The online sessions will take place from 9:00 to 11:00 CET. Readings,
hands-on practice, and homework assignments will be made available via a
free course website. The language of instruction is English.
The training series has 25 spots available. Applicants planning to work
with endangered and under-documented languages (see Hammarström 2019
<https://elararchive.org/blog/2019/12/17/which-language-should-i-document-so…>),
especially Papuan languages, are strongly encouraged to apply. Applicants
should meet the criteria listed below:
· Have plans to document an endangered and under-documented language
· Be able to attend all webinar sessions and complete readings and
assignments
· Have a sufficient level of spoken and written English to be able to
complete assignments
· Have regular access to a Windows computer and a reliable internet
connection
To apply and for more information, please go here
<https://www.eldp.net/en/our+trainings/online+training+series/>. The
deadline is January 30th, 2025.
---------------------------------------------------------------------------------------------------------
--
*Interested in keeping up with ELAR? Subscribe to our new mailing list*
<https://www.listserv.dfn.de/sympa/subscribe/elar-news>!
*Endangered Languages Archive*
Berlin-Brandenburg Academy of Sciences and Humanities
Jägerstraße 22/23
10117 Berlin, Germany
Website: https://elararchive.org/
Facebook: https://www.facebook.com/elararchive/
Instagram: https://www.instagram.com/elararchive/
Twitter: @ELARarchive <https://www.twitter.com/elararchive/>
Blog: https://elararchive.org/blog
Vimeo: https://vimeo.com/user64477333/albums
***Apologies for possible cross-posting ***
CALL FOR PAPERS DEADLINE EXTENSION
We are pleased to announce that the submission deadline for the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL) has been extended from December 16th to December 23rd, 2024. The workshop will be held on March 2, 2025, as part of the NoDaLiDa/Baltic-HLT 2025 conference in Tallinn, Estonia.
About the Workshop
This half-day workshop focuses on the responsible evaluation and alignment of Large Language Models (LLMs) for Nordic and Baltic languages. Our goal is to bring together researchers, practitioners, and stakeholders to address the unique challenges and opportunities in this rapidly evolving field.
Topics of Interest
We welcome submissions on topics including, but not limited to:
- Ethical benchmarks for evaluating LLMs in Nordic and Baltic
languages
- Methods for creating culturally sensitive and inclusive evaluation
datasets
- Responsible techniques for generating or collecting alignment data
- Challenges and solutions in ethical LLM alignment for less-resourced
languages
- Case studies on responsible LLM evaluation or alignment projects
- Ethical considerations in LLM evaluation and alignment
- Comparative studies of LLM performance and fairness in Nordic and
Baltic languages
- Innovative approaches to leveraging limited language resources in
evaluation or alignment of language models
Important Dates
Paper Submission Deadline: December 23, 2024 (extended from December 16)
Notification of Acceptance: January 13, 2025
Camera-Ready Deadline: February 3, 2025
Workshop Date: March 2, 2025
Workshop Format
NB-REAL 2025 will be a half-day workshop held on March 2, 2025 (pre-conference). It will be a hybrid event with both on-site and online participation available.
Submission
Submissions can be long papers (8 pages) or short papers (4 pages). All submissions must follow the NoDaLiDa template, available in both LaTeX and MS Word. The templates are available on the official conference website: https://www.nodalida-bhlt2025.eu/call-for-papers#h.v2k63awq0fpe. All submissions will undergo peer review by the program committee. To submit your paper, please visit the NB-REAL 2025 Workshop page on OpenReview <https://openreview.net/group?id=NoDaLiDa/Baltic-HLT/2025/Workshop/NB-REAL#t…>.
Organizers
Hafsteinn Einarsson, Associate Professor in Computer Science, University of Iceland (hafsteinne(a)hi.is)
Annika Simonsen, PhD Student, University of Iceland (annika(a)hi.is)
Dan Saattrup Nielsen, Senior AI Specialist, Alexandra Institute (dan.nielsen(a)alexandra.dk)
For more information, please visit our website: https://nbreal.xyz/
We look forward to your contributions and to seeing you at NB-REAL
2025!
Apologies for cross-posting.
----------------------------------------
*The International Conference on Spoken Language Translation*
*ACL – 22nd IWSLT 2025 – First Call for Participation*
*31 July-1 August 2025 - Vienna, Austria*
http://iwslt.org
The International Conference on Spoken Language Translation (IWSLT)
<https://iwslt.org/> is the premier annual conference for all aspects of
Spoken Language Translation. Every year, the conference organises and
sponsors open evaluation campaigns around key challenges in simultaneous
and consecutive translation, under real-time/low latency or offline
conditions and under low-resource or multilingual constraints. System
descriptions and results from participants’ systems and scientific papers
related to key algorithmic advances and best practices are presented.
IWSLT is the venue of SIGSLT <https://iwslt.org/sigslt/>, the Special
Interest Group on Spoken Language Translation
of ACL <https://www.aclweb.org/portal/>, ISCA <https://www.isca-speech.org/>
and ELRA <https://www.elra.info/>. With a track record of 21 years, IWSLT
benchmarks and proceedings serve as a reference for all researchers and
practitioners working on speech translation and related fields.
The 22nd edition of IWSLT will be run as a hybrid ELRA
<https://www.elra.info/>/ACL <https://www.aclweb.org/portal/> event,
co-located with ACL 2025 <https://2025.aclweb.org/> from 31 July to 1
August 2025.
*Important Dates*
*January 1, 2025*: Release of shared task training and dev data
*March 15, 2025*: Scientific paper submission deadline
*April 1-15, 2025*: Evaluation period
*April 21, 2025*: System description paper submission deadline
*May 15, 2025*: Notification of acceptance
*June 1, 2025*: Camera-ready deadline (all papers)
*July 31-August 1, 2025*: IWSLT conference
Evaluation
IWSLT 2025 features shared tasks <https://iwslt.org/2025/#shared-tasks>
that address the following focus areas:
- High-resource ST: Offline track, Simultaneous track, Subtitling track
- Low-resource ST: Low-resource and Indic (multilingual) tracks
- Instruction-following Speech Processing track: Technical domain ST, ASR,
Summarization, and QA
Training and development data for each shared task will be prepared and
released by the respective organisers (for further information on this
initiative, please refer to the IWSLT website <https://iwslt.org/2025/>).
Participants will receive instructions about how to submit their runs. In
addition, participants have the opportunity to present their work through a
system paper that will be published in the ACL Proceedings.
Conference
IWSLT also invites submissions of scientific papers to be published in the
ACL Proceedings and presented either in oral or poster format. The
conference selects high-quality, original contributions on theoretical and
practical issues of spoken language translation research, technologies and
applications. Submissions will be accepted directly through the IWSLT
submission site (to be announced on the website <https://iwslt.org/2025/>).
We will also accept commitments of submissions with reviews from the ACL
Rolling Review.
Additionally, to foster cross-pollination of ideas, the conference also
invites the presentation of papers on speech translation recently published
elsewhere. Please note that this is for non-archival presentation of papers
relevant to speech translation already published in other venues (e.g.,
Findings for the *ACL, speech, NLP or MT conferences). Submissions for this
category will be accepted through a dedicated form (to be announced on the
website <https://iwslt.org/2025/>). Papers will be checked for relevance to
IWSLT, and assigned either oral or poster presentation slots if selected.
Contact
Please email iwslt-evaluation-campaign(a)googlegroups.com if you have any
questions related to the shared tasks.
Thanks,
Marine, Marcello, Alex, Jan, Sebastian, Elizabeth, Atul
(IWSLT organisers)
Apologies for cross-posting.
---------------------------------------------------------------------------
*The Eighth Workshop on Technologies for Machine Translation of
Low-Resource Languages (LoResMT 2025)*
*https://www.loresmt.org/*
*@ NAACL 2025 (May 3–4, 2025)*
*Albuquerque, New Mexico, U.S.A.*
*SUBMISSION*
*https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT*
*TIMELINE*
*Paper submission due:* January 30, 2025 (Anywhere on Earth)
*Pre-reviewed (ARR) submission deadline:* February 20, 2025
*Notification of acceptance:* March 1, 2025
*Camera-ready papers due:* March 10, 2025 (Anywhere on Earth)
*Pre-recorded video due (hard deadline):* April 8, 2025
*Workshop dates at NAACL 2025:* May 3–4, 2025
*SCOPE*
Building on the success of past low-resource machine translation (MT)
workshops at AMTA 2018, MT Summit 2019, AACL-IJCNLP 2020, AMTA 2021, COLING
2022, EACL 2023, and ACL 2024, we introduce the LoResMT 2025 workshop at
NAACL 2025. The workshop provides a forum for discussion among researchers
working on MT systems/methods for low-resource and under-represented
languages in general. We would like to help review the state of MT for
low-resource languages and define the most important directions. We also
solicit papers dedicated to supplementary NLP tools that are used in any
language and especially in low-resource languages. Overview papers of these
NLP tools are very welcome. It is beneficial if evaluations of these tools
in research papers include their impact on the quality of MT output.
*TOPICS*
We are highly interested in (1) original research papers, (2)
review/opinion papers, and (3) online systems on the topics below; however,
we welcome all novel ideas that cover research on low-resource languages.
- Neural machine translation (NMT) for low-resource languages
- Use of LLMs (large language models) for low-resource MT systems
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation
*SUBMISSION INFORMATION*
We are soliciting two types of submissions: (1) research, review, and
position papers and (2) system demonstration papers. For research, review
and position papers, the length of each paper should be at least four (4)
and not exceed eight (8) pages, plus unlimited pages for references. For
system demonstration papers, the limit is four (4) pages. Submissions
should be formatted according to the official ACL style templates
(Overleaf). Please refer to the NAACL submission guidelines for further
information <https://2025.naacl.org/calls/papers/#paper-submission-details>.
Accepted papers will be published in the ACL Anthology as part of NAACL 2025
and will be presented at the conference.
Submissions must be anonymized and should be done using the provided
submission system. Scientific papers that have been or will be submitted to
other venues must be declared as such and must be withdrawn from the other
venues if accepted and published at LoResMT. The review will be
double-blind. Authors of an accepted paper should present their paper in
person at NAACL 2025. Papers should be submitted in PDF to the LoResMT
OpenReview page
<https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT>.
We would like to encourage authors to cite papers written in ANY language
that are related to the topics, as long as both original bibliographic
items and their corresponding English translations are provided.
Registration is handled by the main conference (https://2025.naacl.org/).
*ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)*
Atul Kr. Ojha, University of Galway
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jade Abbott, Retro Rabbit
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Varvara Logacheva, Skolkovo Institute of Science and Technology
Xiaobing Zhao, Minzu University of China
*PROGRAM COMMITTEE (LISTED ALPHABETICALLY)*
Abigail Walsh, ADAPT Centre, Dublin City University, Ireland
Alberto Poncelas, Rakuten, Singapore
Ali Hatami, University of Galway
Alina Karakanta, Fondazione Bruno Kessler (FBK), University of Trento
Anna Currey, AWS AI Labs
Aswarth Abhilash Dara, Walmart Global Technology
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Bogdan Babych, Heidelberg University
Chao-Hong Liu, Potamu Research Ltd
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Dana Moukheiber, Massachusetts Institute of Technology
Ekaterina Vylomova, University of Melbourne, Australia
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Flammie Pirinen, UiT Norgga árktalaš universitehta
Gaurav Negi, University of Galway
Jinliang Lu, Institute of automation, Chinese Academy of Sciences
John Philip McCrae, University of Galway
Jonathan Washington, Swarthmore College
Koel Dutta Chowdhury, Saarland University
Majid Latifi, UPC University
Maria Art Antonette Clariño, University of the Philippines Los Baños
Milind Agarwal, George Mason University
Mathias Müller, University of Zurich
Nathaniel Oco, De La Salle University
Pavel Rychlý, Masaryk University and Lexical Computing
Pengwei Li, Meta
Rashid Ahmad, International Institute of Information Technology, Hyderabad
Rico Sennrich, University of Zurich
Santanu Pal, Wipro
Sangjee Dondrub, Qinghai Normal University
Sardana Ivanova, University of Helsinki
Sourabrata Mukherjee, Charles University
Thepchai Supnithi, National Electronics and Computer Technology Center
Timothee Mickus, University of Helsinki
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Wen Lai, LMU Munich
Xuebo Liu, Harbin Institute of Technology, Shenzhen
Yalemisew Abgaz, Dublin City University
Yasmin Moslem, Bering Lab
Zhanibek Kozhirbayev, National Laboratory Astana, Nazarbayev University
*CONTACT*
Please email loresmt(a)googlegroups.com if you have any
questions/comments/suggestions.