- Corpora - ELRA lists

[CfP] First CfP for LowResNLP: Workshop on Advancing NLP for Low-Resource Languages @ RANLP 2025 (Varna, Bulgaria)
by Simon Ostermann 07 Jun '25

Dear colleagues,

We are pleased to announce the first call for papers of the *Workshop on Advancing NLP for Low-Resource Languages (LowResNLP) at RANLP 2025*.

The most important information at a glance:
🗓️ Deadline: July 6, Workshop: Sep 11-13
📍 Varna, Bulgaria
🌐 https://lrlnlp.github.io/website/

Despite rapid progress in Natural Language Processing (NLP), the benefits of recent advances - especially large language models (LLMs) - remain unevenly distributed. While high-resource languages like English, French, and Chinese have seen significant performance gains, low-resource languages continue to face substantial challenges across core NLP tasks such as machine translation, sentiment analysis, named entity recognition (NER), and part-of-speech tagging. These disparities arise from a combination of factors: the scarcity of high-quality training data, limited linguistic resources, and a lack of community involvement in data collection and model development. As a result, many languages, particularly African, Indigenous, and minority languages, remain underrepresented in both academic research and deployed NLP systems.

LowResNLP is a workshop dedicated to addressing these challenges by fostering research, collaboration, and discussion around methods, resources, and evaluation practices specifically designed for low-resource languages. LowResNLP seeks to actively contribute to the field by inviting submissions that specifically address the unique challenges and opportunities involved in working with low-resource languages.

The workshop welcomes a broad range of topics, including but not limited to:
* Language models and large language models for low-resource languages
* Corpora creation and curation technologies for low-resource languages
* Evaluation benchmarks for language models in low-resource languages
* Language models and resources for low-resource languages in Spain
* Machine/pivot translation for low-resource languages
* Fairness in resources/models for low-resource languages
* Prompting learning strategies for large language models
* Transfer learning and cross-lingual approaches for low-resource NLP
* Massively multilingual approaches to low-resource NLP

Important Dates:
Workshop paper submission deadline: 6 July 2025 (AoE)
Workshop paper acceptance notification: 31 July 2025
Workshop paper camera-ready versions: 30 August 2025
Workshop camera-ready proceedings ready: 8 September 2025
Workshops: 11-13 September 2025

Submission formats:
We invite the submission of both full papers and short papers. Full papers should not exceed 8 pages (plus an unlimited number of pages for references and ethics/broader impact statement). Short papers should not exceed 4 pages (plus an unlimited number of pages for references and ethics/broader impact statement). All submissions should be prepared using the current ACL templates (see https://ranlp.org/ranlp2025/index.php/submissions/).

Papers should be submitted through SoftConf: https://softconf.com/ranlp25/LowResNLP2025

Organizers:
For any questions, please drop a mail to lowresnlp-2025-organizers(a)googlegroups.com

Ernesto Luis Estevanell-Valladares (University of Alicante, Spain; University of Havana, Cuba)
Alicia Picazo-Izquierdo (University of Alicante, Spain)
Tharindu Ranasinghe (Lancaster University, UK)
Besik Mikaberidze (Georgian Technical University, Georgia)
Simon Ostermann (German Research Center for Artificial Intelligence, Germany)
Daniil Gurgurov (German Research Center for Artificial Intelligence, Germany)
Philipp Müller (German Research Center for Artificial Intelligence, Germany)
Kurt Micallef (University of Malta, Malta)
Claudia Borg (University of Malta, Malta)
Michal Gregor (KINIT, Slovakia)
Marián Šimko (KINIT, Slovakia)

Second call for participation: MT Marathon in Helsinki in August 2025
by Tiedemann, Jörg 06 Jun '25

This is the second call for participation in the 18th MT Marathon, which will take place in Helsinki on August 25-29, 2025.

The eighteenth edition of the MT Marathon will be organized by the Language Technology Research group at the University of Helsinki, Finland, with sponsorship from EAMT <https://eamt.org/> and HPLT <https://hplt-project.org/>.

Each Machine Translation Marathon is a week-long gathering of machine translation researchers, developers, students, and users featuring:
- MT Lectures and Labs covering the basics and tutorials.
- Keynote Talks from experienced researchers and practitioners.
- Presentations of research and open-source tools related to MT.
- Hacking Projects to advance tools or research in one week or start new collaborations.

Details can be found on the event page: https://blogs.helsinki.fi/language-technology/mt-marathon-2025/

** Registration **
Registration is free of charge for EAMT members. To register, use the following link: https://forms.gle/uvrZuWpeSbcmJozK7. The registration form will remain open until the start of the event or until the available space is filled.

** Call for Abstract Submissions **
The MT Marathon will again host an open session with poster presentations related to MT/NLP research and open-source tools. We invite students, developers and researchers to submit short abstracts (1 page) featuring previously published results, open-source tool demos, and work in progress. Abstracts are lightly reviewed for topical scope, and all relevant submissions will be accepted for presentation.
Deadline: June 15

** Call for project proposals **
As always, project topics will be finalized on the first day of the Marathon, but in the past it has proved useful to announce and refine project proposals earlier. If you have an idea of what you'd like to implement in a small team of fellow participants, or if you just want to peek at what is going to be proposed, have a look at or edit the live document linked here: https://docs.google.com/document/d/1A4Iy_iOVvYHKAwnSV2ZGIPru7t-jeMauCQd6i9G… .

** Programme **
The event will include a poster session, labs, and lessons from experts in the field, including:
* Ayodele Awokoya, McPherson University, University of Ibadan, Masakhane
* Wilker Aziz, University of Amsterdam
* Marta R. Costa-jussà, Meta AI
* Barry Haddow, University of Edinburgh
* Sara Papi, FBK Trento
* Amit Moryossef, ETH, Bar-Ilan University, sign.mt
* Juan Antonio Pérez, University of Alacant
* Gema Ramírez-Sánchez, Prompsit
* Marco Turchi, Zoom
* Jörg Tiedemann, University of Helsinki

The programme is still under construction. For up-to-date information about invited speakers and the topics that will be covered by talks and labs, have a look at the event page here: https://blogs.helsinki.fi/language-technology/mt-marathon-2025/

Final Call for Papers - NLP4Sustain 2025 - Submission deadline June 10
by Jakob Prange 06 Jun '25

Final Call for Papers: NLP for Sustainability (NLP4Sustain) Workshop 2025

This is the final call for papers; the submission deadline is Tuesday, June 10, anywhere on Earth.

Summary of important submission information:
* Submission page: https://openreview.net/group?id=KONVENS/2025/Workshop/NLP4Sustain
* Anonymity: Reviewing of papers will be double-blind. Therefore, the paper must not include the authors' names and affiliations or self-references that reveal the authors' identity.

Program:
The workshop program will consist of accepted paper presentations, as well as
* a Keynote by Dr. Mariana M. de Brito on Information extraction on climate impacts and adaptation using NLP, and
* a Shared Task: The results of the SustainEval 2025 GermEval shared task will also be presented at the workshop. The evaluation phase starts on June 10 and ends on June 27: https://sustaineval.github.io

For further details, visit our website: https://nlp4sustain.github.io/

If you have any questions, please contact: jakob.prange(a)uni-a.de and/or c.jakob(a)tu-berlin.de

--
Dr. Jakob Prange (er/he)
Akademischer Rat auf Zeit / Research Associate
Chair for Natural Language Understanding & Digital Humanities (Prof. Friedrich)
Faculty for Applied Informatics, University of Augsburg
https://jakpra.github.io/

AthNLP 2025 summer school
by A. Vlachos 06 Jun '25

Extended Call for Participation
New Deadline: Monday 15 June 2025

AthNLP 2025 - Athens Natural Language Processing Summer School <https://athnlp.github.io/2025/index.html>

We invite everyone interested in Natural Language Processing and Machine Learning to participate in the 3rd Athens Natural Language Processing Summer School, taking place in Athens, Greece, at the NCSR Demokritos Campus on 4-10 September 2025.

Application Deadline: 15 June 2025
Apply here: https://athnlp.github.io/2025/cfp.html

Preliminary schedule <https://athnlp.github.io/2025/schedule.html>
Sponsor info <https://drive.google.com/file/d/1r_JBhUFdH9svHbmNFAg5iSggJj91pPVT/view>

Important Dates
Application Deadline: 15 June 2025
Decision announcement: 20 June 2025
Registration until: 27 June 2025
Summer School: 4-10 September 2025

Following successful editions in 2019 and 2024, AthNLP 2025 returns to the campus of NCSR Demokritos in Athens. The summer school is organised by NCSR Demokritos, the Athens University of Economics and Business, RC ATHENA, and Heriot-Watt University, in close collaboration with LxMLS (Lisbon, 19–25 July 2025).

The school focuses on machine learning methods for NLP, offering:
* Morning lectures on theory
* Afternoon hands-on lab sessions
* Evening research talks, poster sessions, and demos

Participants will also have the opportunity to present their work in poster sessions throughout the week.

Target Audience:
* Students and researchers in NLP and Computational Linguistics
* Computer scientists with an interest in NLP and ML
* Industry professionals seeking a deeper understanding of these fields

No prior experience in NLP or ML is required; basic math and Python are enough.

Features of AthNLP:
* Attendance at the Social Event, daily lunch, and morning and afternoon coffee breaks are included in the application fee.
* Lecturers are leading researchers in machine learning and natural language processing.
* Students will be able to (optionally) show their current work in poster sessions during coffee breaks.
* On the demo day, students will be able to interact with technical companies and research institutions working in machine learning.

Confirmed Speakers
* Antonis Anastasopoulos, George Mason Computer Science
* Mohit Bansal, UNC Chapel Hill
* Eunsol Choi, New York University
* Marie-Catherine de Marneffe, UCLouvain
* Raquel Fernández, University of Amsterdam
* Yulan He, King's College London, UK
* Ryan McDonald
* Preslav Nakov, MBZUAI
* Vlad Niculae, University of Amsterdam
* Anna Rogers, IT University of Copenhagen
Additional speakers will be announced soon.

Fees
300 EUR for students
400 EUR for University professors or researchers at public Institutes
500 EUR for everyone else

Any questions should be directed to: athnlp2024(a)athenarc.gr

We are looking forward to your participation!
The Organising Committee of AthNLP 2025

Call for Participation: TREC 2025 Biomedical Generative Retrieval (BioGen) Track
by ddemner＠gmail.com 06 Jun '25

Dear colleagues and friends,

We would like to invite you to participate in the TREC 2025 BioGen Track, which focuses on evaluating the reliability and transparency of large language models (LLMs) in the biomedical domain.

Tasks
Building on the TREC 2024 pilot task, this year's track introduces two key challenges:
* Reference Attribution – Identify source documents that support LLM-generated responses to biomedical questions.
* Answer Grounding – Cite references for each assertion in an answer to a biomedical question to ensure factual accuracy.
These tasks aim to reduce hallucinations and promote the generation of evidence-based answers in biomedical applications.

Track Website
https://trec-biogen.github.io/docs/

Timeline
Dataset Release: June 2, 2025
Baseline and Starter Kit Release: June 2, 2025
Results Submission Deadline: August 15, 2025
Evaluation Results Release: Late September, 2025
Notebook Paper Due: Late October, 2025
TREC 2025 Conference: November 17-21

Registrations
Please follow the TREC 2025 registration guidelines from their Call for Participation.

Communication
Join our Google Group for important updates! If you have any questions, ask in our Google Group or email us.

Organizers
Deepak Gupta - National Library of Medicine, NIH
Dina Demner-Fushman - National Library of Medicine, NIH
Bill Hersh - Oregon Health & Science University
Steven Bedrick - Oregon Health & Science University
Kirk Roberts - University of Texas, Houston

Thank you

Call for Proposals: Common Voice Public API Developer Fund
by Francis Tyers 05 Jun '25

[apologies for cross-posting]

Dear all,

I have been forwarded the following call to share with you about some grants for working with Common Voice: https://common-voice.github.io/common-voice-docs/calls/call_public-api-deve…

Highlights:
- Grants in the region of $5000-20000
- Language / community specific interface creation
- Integrations with other projects

Any questions, please contact them on commonvoice(a)mozilla.com.

Best regards,
Francis M. Tyers

Free Lancaster webinar - tomorrow
by Brezina, Vaclav 05 Jun '25

Dear all,

As part of our webinar series introducing the MA and PG Cert programmes in Corpus Linguistics at Lancaster University, we are offering a talk on applications of the corpus methodology: "Corpus-based discourse analysis and the digital humanities", 2 April, 2-3pm UK time.

Register: https://forms.office.com/e/uppRBrE5AF

Best,
Vaclav

Professor Vaclav Brezina
Professor in Corpus Linguistics
Department of Linguistics and English Language
ESRC Centre for Corpus Approaches to Social Science
Faculty of Arts and Social Sciences, Lancaster University
Lancaster, LA1 4YD
Office: County South, room C05
T: +44 (0)1524 510828
@vaclavbrezina

2nd Call for Papers: 1st International Workshop on Language and Language Models (WoLaLa)
by Héja Enikő 05 Jun '25

2nd Call for Papers
1st International Workshop on Language and Language Models (WoLaLa)
Budapest, Hungary | November 20-21

The Hungarian Research Centre for Linguistics (HUN-REN) and the Programme Committee are pleased to issue the Second Call for Papers for the 1st International Workshop on Language and Language Models. As the submission deadline approaches, we encourage researchers and practitioners in the social sciences and humanities (SSH) to contribute extended abstracts and take advantage of the opportunity to hear from our distinguished keynote speakers.

Keynote speakers:
Erhard Hinrichs, University of Tübingen, Germany
Alessandro Lenci, University of Pisa, Italy

Contributions should address one or more of the following areas (but submissions on other closely related topics are also welcome):
* General language models: Critical and comparative analyses of state-of-the-art language models, including their linguistic competence, performance, and limitations.
* Cultural and linguistic perspectives: Investigations into the cultural, cognitive, and scientific aspects of language processing, including the unexplored territories of model behavior and linguistic capability.
* Applications and best practices: Case studies and best practices in applying AI to language research, highlighting the potential for cross-disciplinary innovation within SSH.
* Bridging disciplines: Contributions that examine the role of language models in reshaping traditional SSH methodologies, and proposals on integrating AI insights into linguistic inquiry.

IMPORTANT DATES
30 June 2025: Submission deadline
15 September 2025: Notification of acceptance
20 November – 21 November 2025: Workshop in Budapest
15 January 2026: Full paper submission deadline

Submissions
We expect submissions in the form of extended abstracts (length: 3 to 4 pages including references) in PDF format, in accordance with the template (https://www.overleaf.com/read/sbmczvkpxpzz#4a94e3). Please ensure your submission clearly outlines your research question, methodology, and preliminary findings. Extended abstracts must be submitted through the EasyChair submission system <https://easychair.org/conferences?conf=wolala2025> and will be reviewed by the Programme Committee.

Publication
Selected papers will be published in Acta Linguistica Academica <https://akjournals.com/view/journals/2062/2062-overview.xml>. After acceptance notifications, the author(s) of accepted submissions will be invited to submit full papers (10-12 pages) to be reviewed according to the same criteria as the abstracts.

Programme Committee
The Programme Committee for the conference consists of the following members:
Gábor Prószéky, HUN-REN Hungarian Research Centre for Linguistics & Pázmány Péter Catholic University (chair)
António Branco, University of Lisbon, Portugal
Eva Hajičová, Charles University Prague, Czech Republic
Erhard Hinrichs, University of Tübingen, Germany
András Kornai, HUN-REN Institute for Computer Science and Control, Hungary
Csaba Pléh, Central European University, Austria
Paul Rayson, Lancaster University, United Kingdom
Frédérique Segond, National Institute for Research in Digital Science and Technology, France
Frieda Steurs, Dutch Language Institute, Belgium
Marko Tadić, University of Zagreb, Croatia
Dan Tufiș, Romanian Academy, Romania
Hans Uszkoreit, German Research Center for Artificial Intelligence, Germany
Tamás Váradi, HUN-REN Hungarian Research Centre for Linguistics, Hungary
Martin Wynne, University of Oxford, United Kingdom

Venue & registration
The workshop will take place at the HUN-REN Hungarian Research Centre in Budapest, Hungary, on 20–21 November 2025. Details on registration fees, travel grants, and accommodation options will be posted on the workshop website: https://wolala.nytud.hu <https://wolala.nytud.hu/>. Early registration will open in September 2025.

LINKS
1st International Workshop on Language and Language Models website: https://wolala.nytud.hu <https://wolala.nytud.hu/>
EasyChair submission: https://easychair.org/conferences?conf=wolala2025
Template for submissions:
ZIP-archive: https://wolala.nytud.hu/templates/WoLaLa2025.zip
Overleaf template: https://www.overleaf.com/read/sbmczvkpxpzz#4a94e3 <https://www.overleaf.com/read/xsvjrhvjyfmj#f3362f>

Contact for any questions regarding the conference: info(a)wolala.nytud.hu

ReLDI conference: Language science and technology, Belgrade 25-26 September 2025
by Tanja Samardzic 04 Jun '25

ReLDI Centre Belgrade invites paper submissions to its first conference!

Paper submission deadline: *30 June 2025*
Conference dates: *25-26 September 2025*
Conference web site: https://reldi.rs/en/conference/
Venue: Palace of Science in Belgrade

The goal of the conference is to provide a broadly accessible overview of current scientific insights, technological achievements and upcoming trends in the field of scientific and computational language modelling. We aim at scientifically sound papers with a contribution that is clear although not necessarily highly ambitious.

We expect *full papers* of a *minimum length of 4 pages* (no maximum length) describing original, unpublished and completed work. The minimum length includes text, tables, and graphs, but not references.

Submission types:
* Natural language processing (NLP), systematic model evaluation or performance improvement on one or more of the following data sets:
  o ReLDI <https://reldi.rs/en/data-sets/>
  o CLASSLA <https://www.clarin.si/info/k-centre/>
  o JeRTeh <https://jerteh.rs/index.php/en/> / TESLA <https://tesla.rgf.bg.ac.rs/index.php/en/>
  o Other publicly available data of similar scope and quality
* Language structure from the point of view of general linguistics: empirical testing of hypotheses
* Theoretical study of natural language relevant to NLP
* Theoretical machine learning relevant to NLP
* Language services and computational technology
* Surveys on any of these topics

Best regards,
Tanja Samardzic, Programme chair

CfP DGfS 2025 Computational Linguistics Poster Session
by Annette Hautli-Janisz 04 Jun '25

CALL FOR ABSTRACTS
DGfS 2025 Computational Linguistics Poster Session

We invite the submission of abstracts for the Computational Linguistics poster session of the 48th annual meeting of the German Linguistic Society (DGfS), hosted by the University of Trier.

We invite submissions from all areas of computational linguistics, ranging from models of language across all linguistic areas to corpus linguistics, multimodal approaches and studies on LLM capability assessment. We especially encourage students and junior researchers to participate.

The poster session is organized by the Special Interest Group on Computational Linguistics of the DGfS (dgfs.de/cl).

Conference webpage: https://www.uni-trier.de/universitaet/fachbereiche-faecher/fachbereich-ii/f…

DATES
- Abstract submission due: September 15, 2025
- Notification of acceptance: October 1, 2025
- Short abstract (for conference website/brochure) due: October 15, 2025
- Conference dates: February 24-27, 2026

SUBMISSION
Anonymous one-page abstract (A4) in PDF format (12pt). Submissions can be in German or English. Please submit your abstract via OpenReview: https://openreview.net/group?id=DGfS.de/2026/Conference_Poster_Session
