- Corpora - ELRA lists

Call for Participation: Terminology Shared Task @WMT2026
by Kirill Semenov 03 Jul '26

[Apologies for cross-posting]

Terminology Translation Task @WMT2026 – Call for Participation

We are excited to announce the 4th Terminology Translation Shared Task, which will be held as part of the 11th Conference on Machine Translation (WMT 2026), co-located with EMNLP 2026 in Budapest, Hungary.

TL;DR
* Two tracks on terminology-aware document-level machine translation.
* Language pairs: Spanish–Basque, English–Polish, and Traditional Chinese–English across engineering, medicine, and finance domains.
* Evaluation metrics: both general MT quality and terminology-oriented metrics.
* Task started: 30 June 2026
* System submission deadline: 31 July 2026 (AoE)
* Shared task website: https://www2.statmt.org/wmt26/terminology.html
* Pre-registration: https://forms.gle/nCHmek9UkdJaraqu9

OVERVIEW
Despite remarkable progress in machine translation, accurately translating specialized terminology while maintaining overall translation quality remains a challenging problem. Previous editions of the Terminology Translation Shared Task demonstrated that document-level terminology translation is still substantially more difficult than sentence-level translation, particularly when consistent terminology usage is required throughout a document.

This year's edition introduces two major developments. First, we expand the shared task to lower-resource, morphologically rich languages: Basque and Polish. Second, alongside our established terminology-dictionary setting, we introduce a new track in which systems must infer terminology from seed bitexts, providing a more realistic and challenging evaluation scenario.

We welcome submissions from both academia and industry and impose no restrictions on models or training resources.

IMPORTANT DATES (AoE)
* 1 Jul 2026: Test data and submission form released; shared task begins
* 31 Jul 2026: System submission deadline
* 7 Aug 2026: System description paper submission (according to the WMT 2026 schedule)
* 28-29 October 2026: WMT 2026 conference

SUBMISSION TRACKS
Track 1: Document-Level Translation with Explicit Dictionary
Participants receive document-level source texts together with terminology dictionaries and evaluate their systems under different terminology conditions.

Track 2: Document-Level Translation with Seed Bitexts
Instead of explicit terminology dictionaries, participants receive terminology-rich parallel examples ("seed bitexts") and must leverage them to translate related documents.

LANGUAGES AND DOMAINS
* Spanish–Basque (Engineering & Technology)
* English–Polish (Engineering & Technology, Medicine)
* Traditional Chinese–English (Finance)

EVALUATION
Following the previous editions of the shared task, systems will be evaluated under multiple terminology modes to assess the causal effect of the explicit terminology contents:
* No terminology
* Proper terminology
* Random terminology

Submissions will be assessed using two complementary groups of metrics:
* Overall translation quality, including automatic MT metrics (e.g., BLEU) together with evaluation of grammatical term usage.
* Terminology-oriented metrics, including terminology success rate and terminology consistency.

SUBMISSION GUIDELINES
Before submitting, we kindly ask participants to:
1. (Recommended) Pre-register using our Google Form to receive reminders and help us estimate participation.
2. Validate system outputs using the official validation script.
3. Submit translations through the official submission form together with a brief system description.

Participants are also encouraged to submit a WMT system description paper following the main conference guidelines.
* Participants may submit systems for one track and/or only a subset of the language pairs (however, we encourage you to test your strengths across all language pairs).
* Detailed task descriptions, data formats, input/output specifications, and submission instructions are available on the shared task website.

ORGANIZERS
Nathaniel Berger, Adrian Charkiewicz, Pinzhen Chen, Thierry Etchegoyhen, Harritxu Gete Ugarte, Kamil Guttmann, Xu Huang, David Ponce, Artur Nowakowski, Frédéric Odermatt, Arturo Oncevay, Kirill Semenov, Dawei Zhu, and Vilém Zouhar.

Contact: Kirill Semenov, firstname.lastname [at] uzh [dot] ch

WEBSITE
https://www2.statmt.org/wmt26/terminology.html

We look forward to your participation!

[Final CfP] LM Playschool Workshop at EMNLP 2026 — Challenge Deadline in 2 Days!
by Sabrina McCallum 03 Jul '26

Improving Language Models through Learning from Dialogue Interaction
Co-located with EMNLP 2026, 28 October 2026, Budapest

Website: https://lm-playschool.github.io/
Starter Kit: https://github.com/lm-playpen/playpen

The submission deadlines for the LM Playschool Workshop are almost here! The challenge submission deadline is this Sunday, July 5, 2026 (only 2 days away!), followed shortly by the direct paper submission deadline on July 12, 2026 (9 days away!).

📝 SUBMISSION TRACKS
We welcome either long or short submissions for the following tracks:
1. Challenge track: Technical reports for the LM-Playschool challenge (archival).
2. Paper-only track: Work-in-progress (archival or non-archival) or recently published papers (non-archival).

🏆 CHALLENGE TRACK (SHARED TASK)
The shared task focuses on post-training LLMs to master communicative skills in unseen dialogue games while retaining original language capabilities. Participants are free to choose any base model; evaluation is based on improvement relative to that base model on an unseen test set.

Not signed up yet? Register here: https://forms.gle/fhpXPH5kZk4psPXp9

🛠️ STARTER KIT
Our sandbox environment Playpen is available on GitHub and ready to use. Key features:
* Comprehensive Evaluation: A single command to get your "clemscore" (interactive competence) and "statscore" (classic benchmarks).
* Training Recipes: Example scripts for SFT and GRPO to help models learn from game-state success.
* Resource Friendly: We encourage using the Qwen3.5 family (0.8B to 27B) to ensure participation is possible with modest compute.

🎯 PAPER-ONLY TRACK: TOPICS OF INTEREST
We welcome original research and work-in-progress on:
* Architectures and training regimes for interactive agents.
* Intrinsic rewards and learning signals (RL from game-state success).
* Benchmarking via dialogue games.
* Data efficiency and social interaction.
* Social cognition and Theory of Mind in interactive systems.
* Human-agent collaboration and coordination.
* Embodied interactive agents.
* Communicative and perceptual grounding.

📅 IMPORTANT DATES
* Challenge submission deadline: July 5, 2026 (only 2 days away!)
* Direct paper submission deadline: July 12, 2026 (9 days away!)
* Pre-reviewed ARR commitment deadline: August 2, 2026
* Notification of acceptance: August 8, 2026
* Camera ready due: August 23, 2026
* Challenge winners announced: Early October 2026
* Workshop at EMNLP 2026: October 28, 2026, Budapest

For more information, visit our website: https://lm-playschool.github.io/

We look forward to your submissions!

The LMP 2026 Organizing Committee:
Raffaella Bernardi, Raquel Fernández, Mario Giulianelli, Sherzod Hakimov, Alexander Koller, Dieu-Thu Le, Oliver Lemon, Davide Mazzaccara, Sabrina McCallum, David Schlangen, Alessandro Suglia

5 Positions for Doctoral & Postdoctoral Researchers at the Text & Language Lab, FAU
by Michaela Mahlberg 03 Jul '26

Dear all,

Please see the information on job opportunities below. I’d be grateful if you could share with potential candidates.

Where:
The Text & Language Lab in the Department of Digital Humanities and Social Studies (DHSS) at Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) is recruiting up to five outstanding postdoctoral and doctoral researchers. Join a highly collaborative team pursuing an ambitious research agenda on various aspects of language in today’s digital world. Led by Alexander von Humboldt-Professor Michaela Mahlberg, the lab offers an inspiring environment where curious researchers work across disciplines and produce results with clear, real-world relevance.

What:
Depending on your qualifications and experience, there are different options of appointment, which include opportunities to contribute to the teaching of the Department.
• PhD positions are for up to 4 years (75% or 100%), with an initial contract of 1 year and an extension after successful completion of the first year
• Postdoctoral positions are for up to 4.5 years (100%), with an initial contract of 2 years; after successful completion of that period there is the opportunity to extend the positions until 31.03.2031

The positions:

1) Postdoctoral researcher with focus on web application development and CLiC (clic-fiction.com)
Specific responsibilities will include:
• Corpus compilation
• Modular software development
• Leading on the organisation of training workshops and dissemination activities
• Developing training materials for workshops
• Producing documentation
• Collaboration with external software developers
• Working as part of a team, with the responsibility to manage specific work packages and lead clear sub-projects (including publications and grant application development)
• Managing project tasks such as supervising student assistants and providing progress reports

2) Postdoctoral researcher with focus on experimental methods
Specific responsibilities will include:
• Designing and running experiments that focus on the engagement with texts and tools (reader response, surveys, interviews, eye-tracking, Human-AI interaction)
• Analysing research data
• Leading on the organisation of training workshops and dissemination activities
• Working as part of a team, with the responsibility to manage specific work packages and lead clear sub-projects (including publications and grant application development)
• Managing project tasks such as supervising student assistants and providing progress reports

3) Doctoral researcher - PhD Project: Fictional people and the climate crisis
The project will be situated within a research context that deals with textual patterns of characterization with a focus on the relationship between fictional people and their environment. This environment will cover both the natural and the built environment. The textual patterns will be studied on the basis of data from a variety of texts (including climate and science fiction). Methods of analysis will draw on computational textual analysis. The specific research questions of the project will be developed together with the successful candidate.
Specific responsibilities will include:
• Text and data analysis
• Corpus design and compilation
• Working as part of a team, with the responsibility to manage specific work packages and lead clear sub-projects (including publications and grant application development)
• Managing project tasks such as supervising student assistants and providing progress reports

More detail on required qualifications and how to apply: https://www.fau-jobs.de/jobposting/621d437fb9ed15d519420c8046d50aa9431c66ec0

Deadline: 24 July 2026

If you have any questions, don’t hesitate to contact me.

Michaela

--
Professor Michaela Mahlberg
https://michaelamahlberg.com/<https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fmichaelamahlberg.com%2f…>
Alexander-von-Humboldt Professor & Professor of Digital Humanities<https://www.youtube.com/watch?v=bgkKhcwvWgI>
Head of Department of Digital Humanities and Social Studies (DHSS)<https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fwww.dhss.phil.fau.de%2f…>
FAU - Friedrich-Alexander-Universität Erlangen-Nürnberg<https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fwww.fau.de%2f&c=E,1,fU0…>
Honorary Professor of Corpus Linguistics University of Birmingham, UK<https://www.birmingham.ac.uk/>
Editor of the International Journal of Corpus Linguistics <https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fbenjamins.com%2fcatalog…>
President of the Dickens Society<https://dickenssociety.org/>
Host of the Life and Language Podcast <https://podcasters.spotify.com/pod/show/michaela-mahlberg/>
https://www.linkedin.com/in/michaela-mahlberg/

2nd CfP for the HeMAI workshop at ICMI 2026 - Deadline Extended
by Stefan Hillmann 02 Jul '26

=========================================================
2nd Call for Papers - Deadline EXTENDED: HeMAI 2026
Multimodal Interaction with Generative AI Health Applications
Workshop at ICMI 2026, Naples, Italy, 9 October 2026 (in person)
https://qulab.github.io/HeMAI2026/
Submission via PCS until July 13, 2026
=========================================================

Generative AI is changing how people interact with health technology. Large language models, multimodal foundation models, and conversational agents make it possible to combine text, speech, vision, and physiological signals in ways that weren't practical a few years ago, and they raise difficult questions about usability, safety, explainability, trust, and evaluation when the stakes involve someone's health. HeMAI 2026 brings HCI, AI, and health researchers together to work through those questions.

We welcome contributions on multimodal interaction techniques for health systems, conversational and embodied agents in healthcare, integration of text/speech/vision/biosignals, explainability and transparency, human-AI collaboration, safety and trust, ethical and regulatory issues, evaluation methodologies, and real-world deployments.

Submission types:
- Full research papers (up to 8 pages, excluding references)
- Short / work-in-progress / position / demo papers (up to 4 pages, excluding references)

Submissions follow ICMI 2026 author guidelines: https://icmi.acm.org/2026/guidelines/
Papers rejected from the main ICMI track are welcome if they fit the scope.

Important dates (AoE):
- Paper submission: 6 July 2026 (via PCS)
- Notification: 29 July 2026
- Camera-ready: 1 August 2026
- Workshop: 9 October 2026

Accepted papers will appear in the workshop proceedings associated with ICMI 2026.

Organizers:
- Stefan Hillmann (TU Berlin)
- Sebastian Möller (TU Berlin & DFKI Berlin)
- Catherine Pelachaud (CNRS-ISIR, Sorbonne University)
- Lisa Raithel (TU Berlin & BIFOLD & DFKI Berlin & Charité-IKIM)
- Roland Roller (DFKI Berlin)

Workshop website: https://qulab.github.io/HeMAI2026/

We are looking forward to your submissions!

Job Openings: Doctoral & Postdoctoral Researchers in Foundation Health Models
by Shaoxiong Ji 02 Jul '26

We are recruiting a Doctoral Researcher and a Postdoctoral Researcher! Both roles focus on training health foundation models and medical reasoning using Finland's nationwide health databases.

💻 Access to world-class computing resources, including local clusters (Roihu and Triton) and the LUMI Supercomputer (one of the world's fastest and greenest).
🤝 Direct integration into the ELLIS Institute Finland and the University of Turku research ecosystems.
🌲 Excellent work-life balance and competitive salary in Finland.
📅 Application Deadline: August 17, 2026 at 16:00 (Helsinki time)

Find full job details and apply here: https://www.olaresearch.org/news/job-openings-foundation-health-models.html

CfP – Special Issue on "Big Data Analytics and Mining for Information Retrieval" [BDCC Journal]
by Ida Mele 02 Jul '26

Dear Colleagues,

We are pleased to invite submissions to the Special Issue "Big Data Analytics and Mining for Information Retrieval" in the "Big Data and Cognitive Computing" Journal (IF 5.3, CiteScore 11.4).

This Special Issue seeks advanced analytics and mining methods that integrate big-data technologies with intelligent retrieval models. We welcome original research articles and review papers on research areas including, but not limited to:
- web search
- social media analytics
- recommendation systems
- enterprise search
- digital libraries
- biomedical informatics

For details and submission instructions, visit the Special Issue webpage: https://www.mdpi.com/journal/BDCC/special_issues/4643ZF82B5

If you are interested in submitting a manuscript, kindly register a tentative title and abstract via the "Submit Abstract to Special Issue" form on the Special Issue webpage at your earliest convenience.

Please feel free to share this invitation with colleagues and collaborators. We look forward to receiving your contributions.

Kind regards,
Guest Editor
Dr. Ida Mele
IASI-CNR, Rome, Italy

3rd and last Call for Papers: FEL XXX 2026, Paris, France, 3-5 November 2026 - Submission deadline 10 July 2026
by Steven Krauwer 02 Jul '26

CALL FOR PAPERS

The 30th Annual Conference of the Foundation for Endangered Languages, FEL XXX (2026)
"Endangered Languages and Innovative Technologies: Documentation, Processing and Revitalisation"

Organised by the Foundation for Endangered Languages (FEL <https://ogmios.org/>), the Centre d'Études en Sciences Sociales sur les Mondes Africains Américains et Asiatiques (CESSMA <https://www.inalco.fr/en/cessma>) and the Institut National des Langues et Civilisations Orientales (INALCO <https://www.inalco.fr/en>)

Paris, France, 3-5 November 2026

The conference aims to create a space for dialogue between researchers, technologists, and, crucially, language communities themselves, concerning the opportunities and challenges presented by innovative technologies in efforts to prevent language loss and promote the maintenance and revitalisation of endangered languages.

We strongly encourage submissions from community members, educators, activists, and practitioners, as well as presentations of collaborative work between academic and non-academic partners.

More about the theme: https://www.ogmios.org/conferences/2026/theme.php

Important dates:
# 10 July 2026: Deadline for submission of abstracts
  NB: Due to the tight schedule, this deadline cannot be extended!
# 13 July 2026: Registration opens
# 01 August 2026: Selected applicants informed
# 15 September 2026: Deadline for submission of extended versions of accepted abstracts
# 03-05 November 2026: Conference dates
# 06 November 2026: Excursion (to be confirmed)

Conference website: https://ogmios.org/conferences/2026
Email: fel2026(a)ogmios.org

--
Steven Krauwer, CLARIN/FEL/ELSNET/TLC, Drift 10, 3512 BS Utrecht, NL

Call for participation: the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026) @ ACL 2026
by Ali Hurriyetoglu 02 Jul '26

Dear Corpora-list community,

We are excited to welcome you to the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026), taking place July 3, 2026, as part of ACL 2026!

This year's workshop highlights how our field continues to evolve from traditional event extraction toward large language models, multimodal event understanding, symbolic reasoning, agent-based systems, and socially impactful AI applications.

Some of the key themes of EEUCA 2026 include:
* Event extraction and understanding with LLMs
* Multimodal analysis of real-world events
* Low-resource and multilingual approaches
* Benchmark creation and evaluation
* Geopolitical event analysis
* Shared tasks on multimodal vaccine discourse and toxicity detection in online gaming communities

If you are attending ACL 2026, we warmly invite you to join us on July 3. Whether your interests are in information extraction, NLP, computational social science, multimodal AI, or generative models, we would love to see you there!

Proceedings: https://aclanthology.org/events/acl-2026/#2026eeuca-1
🗓️ Workshop schedule and additional information: https://research.ku.edu.tr/research-infrastructure/research-centers-and-uni…

We sincerely thank all authors, shared task participants, reviewers, program committee members, keynote speakers, and organizers whose hard work made this year's workshop possible.

We look forward to welcoming the community and discussing the next generation of event-centered AI research.

Ali Hürriyetoğlu, Hristo Tanev, Surendrabikram Thapa

Last CfP: 9th Workshop on Natural Language for Artificial Intelligence (NL4AI)
by Elisa Leonardelli 01 Jul '26

[Apologies for Cross Posting]

═══════════════════════════════════════════════════════════
NL4AI 2026 – 9th Workshop on Natural Language for Artificial Intelligence,
at the 24th International Conference of the Italian Association for Artificial Intelligence (AIxIA 2026 <https://aixia2026.unipg.it/>)
6–9 October 2026 | Perugia, Italy
═══════════════════════════════════════════════════════════

Website: http://sag.art.uniroma2.it/NL4AI/
Contact Email: nl4ai2026(a)gmail.com

── IMPORTANT DATES ──────────────────
[EXTENDED] Paper submission deadline: 6 July 2026, 23:59 AoE 🚨 (previously 29 June 2026)
Notification to authors: 31 July 2026
Camera-ready due: 26 August 2026
Workshop dates: 6–9 October 2026
────────────────────────────────────

We are pleased to invite submissions to NL4AI 2026, the Ninth Workshop on Natural Language for Artificial Intelligence, to be held in Perugia from the 6th to the 9th of October 2026, within the 24th International Conference of the Italian Association for Artificial Intelligence (AIxIA 2026), and supported by AILC (http://www.ai-lc.it/).

The goal of NL4AI is to explore the role of Computational Linguistics and Natural Language Processing in Artificial Intelligence applications. We believe that new technological challenges and opportunities arise at the boundary between NLP and AI. On the one hand, AI applications benefit from a deeper understanding of problems related to Natural Language, and thus from the integration of advanced NLP techniques. On the other hand, NLP benefits greatly from being used in wider areas of AI where problems and methodologies related to NL can be evaluated in new contexts.

TOPICS OF INTEREST
Topics include but are not limited to:
- NLP and AI Applications (health, legal domain, social media and journalism, etc.)
- Natural Language Interfaces for Human Robot Interaction
- Resources, Benchmarks, and Evaluation
- Discourse and Pragmatics
- Semantics
- Natural Language Generation
- Creativity, Style, and Narrative Generation
- Summarization
- Information Extraction in AI Applications
- Machine Learning for NLP
- LLMs, Foundation Models and Applications
- Interpretability, Explainability and Analysis of Models for NLP
- Natural Language Inference
- Question Answering and Reading Comprehension
- Sentiment Analysis, Opinion Mining, and Argumentation
- Abusive Language Detection and Analysis
- NLP for Fact Checking, Fake News Detection and Analysis
- Conversational Agents in Human-Computer Interaction
- Speech and Spoken Language Processing
- Language and other Multimodality
- Multimodal (text-image) data sources
- Machine Translation and Multilinguality
- Low-Resource NLP and Linguistic Diversity
- Cognitive Modeling and Psycholinguistics
- Computational Historical Linguistics, Social Science, and Cultural Analytics
- Ethics, Fairness, and Societal Impacts of NLP
- NLP and Industrial Challenges

Accepted papers will be published in the workshop proceedings via CEUR Workshop Proceedings. Depending on the number and quality of papers received, we will consider proposing a special issue in relevant journals. The Program Committee will select the Best Workshop Paper from the accepted papers.

SUBMISSIONS
We encourage original submissions that describe new theoretical models, applied techniques, and research in progress. Substantial extensions to works already published or presented in other locations are welcome as well.

We invite two kinds of submissions:
- Short/Demo Papers. Maximum length of 6 pages + up to 2 pages of Acknowledgements/Declaration on Generative AI/References
- Regular Papers. Maximum length of 12 pages + up to 2 pages of Acknowledgements/Declaration on Generative AI/References

Please note that papers with fewer than 25000 characters will be considered short papers in the CEUR proceedings.

Submissions Evaluation. Submissions will be peer-reviewed (single-blind) by the program committee members. Evaluation criteria will include novelty, significance for theory/practice, technical soundness, and quality of presentation. Note that reviewers are not required to evaluate appendices when reviewing papers. Appendices are intended for details supporting reproducibility and/or additional results.

How to Submit. Proceedings shall be submitted to CEUR-WS.org for online publication and all papers must follow the 2022 CEUR-ART - 1 Column paper style. The LaTeX template can be downloaded as a source file from the NL4AI website <http://sag.art.uniroma2.it/NL4AI/wp-content/uploads/2026/05/CEURART-NL4AI-2…> or accessed as a Template in Overleaf <https://it.overleaf.com/read/cfvwgnqpvpys#a7a014>. All papers should be submitted via EasyChair: https://easychair.org/conferences/?conf=nl4ai2026

Note: All submissions must be compatible with CEUR (https://ceur-ws.org/) and include the CEUR Declaration on Generative AI section (https://ceur-ws.org/GenAI/Policy.html). Papers missing this section will be desk rejected.

WORKSHOP ORGANIZERS
Alessandro Bondielli (University of Pisa)
Giovanni Bonetta (Fondazione Bruno Kessler)
Elisa Leonardelli (Fondazione Bruno Kessler)
Irene Siragusa (University of Palermo)

We look forward to seeing you in Perugia!
The NL4AI 2026 Workshop Organizers
