INLG 2026: First Call for Workshop & Tutorial Proposals
by Emiel van Miltenburg 18 May '26

Dear Corpora list members,

See below for the INLG call for workshops/tutorials. Please note that the deadline is quite soon (25 May), but proposals are short. For any questions, please contact Jian Yang (Beihang University (BUAA); jiayang(a)buaa.edu.cn).

Note also that the first CFP has been sent out, but the announcement still needs to be released by the list moderators. For details, see: https://2026.inlgmeeting.org/

Best wishes,
Emiel

----

INLG 2026: First Call for Workshop & Tutorial Proposals
19th International Conference on Natural Language Generation (INLG 2026)
Utrecht, Netherlands, October 17–21, 2026
=======================================

The 19th International Conference on Natural Language Generation (INLG 2026) will be held in Utrecht, Netherlands, from October 17 to 21, 2026. Building on the success of previous conferences, we aim to include a diverse range of independently organized research workshops and tutorials, which will take place immediately before or after the main conference.

The INLG organizers, in collaboration with SIGGEN, warmly invite proposals for one-day or half-day workshops. Hosting a workshop at INLG 2026 provides a unique opportunity to connect with leading researchers and practitioners in Natural Language Generation. Workshops offer a platform for networking, fostering collaborations, and exchanging innovative ideas. They allow organizers and participants to gain visibility for emerging topics and help shape future directions in the NLG community.

Workshop & Tutorial Topics

We welcome proposals on any topic relevant to the NLG community. Submissions focusing on emerging areas or on topics that foster collaboration between the NLG community and other research fields are strongly encouraged. Proposals that continue an existing workshop series are also welcome. Workshops and tutorials should provide an engaging and informal setting for participants to discuss technical topics and exchange ideas.
Proposals should outline the planned format; interactive formats (e.g., talks, posters, panels, invited speakers) are encouraged. Workshops are expected to be half-day or full-day events.

Proposals can be sent directly to the INLG 2026 Workshop Chair, Jian Yang (email: jiayang(a)buaa.edu.cn). Should you have any questions, feel free to contact the Workshop Chair.

Proposals should be approximately 2 pages and include:
- Workshop or tutorial title
- Names and affiliations of the organizers
- Duration: half-day or full-day
- Objectives of the workshop/tutorial
- Planned activities and format
- Target research communities

Once accepted, organizers are encouraged to develop a dedicated website for their workshop. Links will be provided on the INLG 2026 main conference website.

***Important Dates***
- Deadline for receipt of workshop proposals: 25 May 2026
- Notification of acceptance: 1 June 2026

***Suggested Timeline for Workshop Organization***
- Call for workshop papers or abstracts: Immediately after acceptance notification
- Submissions due: 1 August 2026
- Notification of acceptance: 1 September 2026
- Camera-ready papers due: 16 September 2026

All deadlines are Anywhere on Earth (AoE).
=======================================

Submission Contact

Proposals should be sent to the INLG 2026 Workshop Chair: Jian Yang (Beihang University (BUAA))

Website & Social Media

Website: https://2026.inlgmeeting.org/
Bluesky: https://bsky.app/profile/siggen.bsky.social
LinkedIn: https://www.linkedin.com/company/siggen/
Mastodon: https://fediscience.org/@siggen_acl
X (formerly Twitter): twitter.com/inlgmeeting

37th IEEE International Symposium on Software Reliability Engineering (ISSRE 2026): Last Call for Fast Abstracts and Project Highlights
by Announce 18 May '26

*** Last Call for Fast Abstracts and Project Highlights ***

37th IEEE International Symposium on Software Reliability Engineering (ISSRE 2026)
October 20-23, 2026, 5* St. Raphael Resort and Marina, Limassol, Cyprus
https://cyprusconferences.org/issre2026/

A Fast Abstract (FA) or Project Highlights (PH) paper is a two-page, lightly reviewed technical article. The FA/PH track at ISSRE 2026 aims to bring together researchers and practitioners working in Software Reliability Engineering (SRE) to:
• Introduce early original ideas.
• Discuss relevant work-in-progress and ongoing experiences.
• Challenge the SRE status quo on key topics.
• Present critical analyses of prior work.
• Share lessons learned from real-world SRE applications.
• Propose new problems from industrial or academic experience.
• Describe approaches to problems of significance that may not yet have complete results.

In addition to traditional Fast Abstracts, the track welcomes Project Highlights (PH) papers. PH papers are expected to disseminate results, visions, methodologies, tools, and ongoing activities from national and international research projects (e.g., European or multi-institutional initiatives). Project Highlights may include, but are not limited to:
• Overviews of funded research projects and their objectives.
• Project methodologies, architectures, and experimental frameworks.
• Early or intermediate results, including lessons learned and preliminary insights.
• Datasets, benchmarks, tools, platforms, and other project outcomes released or in progress.
• Collaboration experiences, challenges, and emerging research directions from national or international projects.

Project Highlights that can stimulate discussion and collaboration within the ISSRE community are welcome. Ongoing projects and projects completed no earlier than October 2025 are eligible.

Accepted contributions will be published in the Supplemental Proceedings of ISSRE 2026 and made available via IEEE Xplore.
Topics of Interest

Topics of interest include, but are not limited to:
• Reliability, safety, maintainability, security, survivability, resilience, robustness, and other dependability attributes.
• Faults (defects, bugs, etc.), errors, failures, and other dependability threats.
• Reliability of all systems, applications, networks, and software, including problems, solutions, and discussions.
• Metrics, measurement, assessment, monitoring, modeling, estimation, and prediction regarding reliability.
• Reliability of AI-powered software systems, including large language models (LLMs), autonomous agents, and AI-enabled applications.
• Other contents about software reliability, such as normative/regulatory/ethical spaces, societal aspects, etc.

Presentations

Presentations may take the form of a short talk in a Fast Abstracts/Project Highlights session or a poster. Further details about presentations and posters will be shared with authors upon notification.

Submission Guidelines

Manuscripts must be:
• submitted via EasyChair as a single Portable Document Format (PDF) file with all fonts embedded;
• written in English and formatted according to the IEEE Computer Society Format Guidelines.

Papers are submitted via EasyChair: https://easychair.org/conferences?conf=issre2026

Manuscripts must adhere to IEEE Conference Publishing Policies. In particular, they must NOT have been previously published or be under submission elsewhere. All submissions will be screened for plagiarized material through the IEEE Cross Check portal.

Contacts

Please contact the Fast Abstract/Project Highlights Co-chairs (issre2026-fast-abstracts(a)easychair.org) for any questions or further clarifications.

Important Dates (AoE)
• Submission deadline: June 15, 2026
• Notification to authors: August 5, 2026
• Camera ready papers: August 19, 2026

Organisation

General Chairs
• Leonardo Mariani, University of Milano - Bicocca, Italy
• George A. Papadopoulos, University of Cyprus, Cyprus

Program Coordinator
• Roberto Natella, GSSI, Italy

Research Program Committee Chairs
• Domenico Cotroneo, UNC Charlotte, USA
• Jie M. Zhang, King's College London, UK

Industry Program Chairs
• Jinyang Liu, Bytedance, USA
• Sigrid Eldh, Ericsson AB, Sweden

Workshop Chairs
• Georgia Kapitsaki, University of Cyprus, Cyprus
• August Shi, The University of Texas at Austin, USA

Doctoral Symposium Chairs
• Stefan Winter, LMU Munich, Germany
• Lili Wei, McGill University, Canada

Fast Abstract Chairs
• Luigi Lavazza, University of Insubria, Italy
• Yintong Huo, SMU, Singapore

JIC2 Chair
• Helene Waeselynck, LAAS-CNRS, France

Publicity Chairs
• Allison K. Sullivan, The University of Texas at Arlington, USA
• Jose D'Abruzzo Pereira, University of Coimbra, Portugal

Publication Chairs
• Sherlock Licorish, Otago Business School, New Zealand
• Maria Teresa Rossi, GSSI, Italy

Artifact Evaluation Chairs
• Naghmeh Ivaki, University of Coimbra, Portugal
• Fumio Machida, University of Tsukuba, Japan

Diversity and Inclusion Chair
• Eleni Constantinou, University of Cyprus, Cyprus

Financial Chair
• Costas Pattichis, University of Cyprus, Cyprus

Web Chairs
• Michalis Ioannides, Easy Conferences LTD
• Elena Masserini, University of Milano - Bicocca, Italy

Registration Chair
• Easy Conferences LTD

ACM ICMI 2026 Call for Demonstrations and Exhibits
by Gualtiero Volpe 17 May '26

ICMI 2026 CALL FOR DEMONSTRATIONS & EXHIBITS
===============================================
5-9 October 2026, Napoli - Italy
https://icmi.acm.org/2026/
===============================================

We invite submissions for Demonstrations and Exhibits at the 28th ACM International Conference on Multimodal Interaction (ICMI 2026), taking place October 5–9, 2026, in Napoli, Italy.

This track is your chance to showcase cutting-edge multimodal systems, interactive technologies, and innovative applications, from early-stage prototypes to mature products.

Two submission types:
* Demonstrations: 2–3 page paper (published in ACM proceedings) + video
* Exhibits: Short proposal (no proceedings paper) + video

All submissions require a video (<=200MB) to illustrate your system.

Accepted presenters will be provided with:
* Demo table & poster board
* Power access
* Shared wireless internet

Important Dates
* Submission deadline: June 21, 2026
* Notification: July 15, 2026
* Final papers (demos): August 2, 2026

Submission guidelines: https://icmi.acm.org/2026/guidelines/

At least one author must register and attend the conference.

Questions? Contact the Demo & Exhibits Chairs: Micol Spitale & Josh Andres (icmi2026-demo-exhibits-chairs(a)acm.org)

Bring your ideas to life and engage the ICMI community in Napoli!

CFP: ‘LAnguage TEchnologies for Low-resource Languages’ (LaTeLL ’2026)
by Amal Haddad 17 May '26

International Conference 'LAnguage TEchnologies for Low-resource Languages' (LaTeLL '2026)
Fes, Morocco
28, 29 and 30 September 2026
www.latell.org/2026/ [1]

Call for Papers

***Extended submission deadline 15 June 2026***
*** Note the slightly revised conference dates: 28, 29 and 30 September 2026 ***

The conference

Natural Language Processing (NLP) has witnessed remarkable progress in recent years, largely driven by the emergence of deep learning architectures and, more recently, large language models (LLMs). Nevertheless, these advances have disproportionately benefited high-resource languages that possess abundant data for model training. By contrast, low-resource languages, which account for at least 85% of the world's linguistic diversity and are often spoken by smaller or marginalised communities, have not yet reaped the full benefits of contemporary NLP technologies. This imbalance can be attributed to several interrelated factors, including the scarcity of high-quality training data, limited computational and financial resources, and insufficient community engagement in data collection and model development.

Developing NLP applications for low-resource languages poses major challenges, particularly the need for large, well-annotated datasets, standardised tools, and robust linguistic resources. Although several workshops have previously addressed NLP for low-resource languages, LaTeLL is the first international conference dedicated specifically to the automatic processing of such languages. The event aims to provide a forum for researchers to present and discuss their latest work in NLP in general, and particularly in the development and evaluation of language models for low-resource languages.
Conference topics

We invite submissions on a broad range of themes concerning linguistic and computational studies focusing on low-resource languages, including but not limited to the following topics:

Language resources for low-resource languages
* Dataset creation and annotation
* Evaluation methodologies and benchmarks for low-resource settings
* Lexical resources, corpora, and linguistic databases
* Crowdsourcing and community-driven data collection
* Tools and frameworks for low-resource language processing

Core language technologies for low-resource languages
* Language modelling and pre-training for low-resource languages
* Speech recognition, text-to-speech, and spoken language understanding
* Phonology, morphology, word segmentation, and tokenisation
* Syntax: tagging, chunking, and parsing
* Semantics: lexical and sentence-level representation

NLP Applications for low-resource languages
* Information extraction and named entity recognition
* Question answering systems
* Dialogue and interactive systems
* Summarisation
* Machine translation
* Sentiment analysis, stylistic analysis, and argument mining
* Content moderation
* Information retrieval and text mining

Multimodality and Grounding for low-resource languages
* Vision and language for low-resource contexts
* Speech and text multimodal systems
* Low-resource sign language processing

Ethics, Equity, and Social Impact for low-resource languages
* Bias and fairness in low-resource language technologies
* Sociolinguistic considerations in technology development
* Cultural appropriateness and sensitivity

Human-Centred Approaches in low-resource languages
* Usability and accessibility of low-resource language technologies
* Educational applications and language learning
* Community needs assessment and technology adoption
* User experience research in low-resource contexts

Multilinguality and Cross-Lingual Methods for low-resource languages
* Multilingual language models and their adaptation
* Code-switching and code-mixing
* Cross-lingual transfer learning in low-resource languages

Special Theme Track 1 -- Building Applications Based on Large Language Models for Low-Resource Languages

_LaTeLL'2026_ will feature a Special Theme Track dedicated to the development of applications based on Large Language Models (LLMs) for low-resource languages. This track aims to explore innovative methodologies, architectures, and tools that leverage the power of LLMs to enhance linguistic processing, accessibility, and inclusivity for underrepresented languages. Contributions are encouraged on topics such as model adaptation and fine-tuning, multilingual and cross-lingual transfer, ethical and fairness considerations, and the creation of datasets and benchmarks that facilitate the integration of LLM-based solutions in low-resource settings.

Special Theme Track 2 -- Modern Standard Arabic (MSA) and Arabic Dialects

This special track addresses the unique challenges and opportunities in processing Modern Standard Arabic (MSA) and the rich landscape of Arabic dialects. The diglossic nature of Arabic, where the formal MSA coexists with numerous, widely used spoken dialects, presents a significant hurdle for NLP. While MSA is relatively well-resourced, Arabic dialects are quintessential examples of low-resource languages, often lacking standardised orthographies, annotated corpora, and dedicated processing tools. This track invites submissions on novel research and resources aimed at bridging this gap and advancing the state of the art in Arabic language technology.
Topics of interest include, but are not limited to:
* Dialect identification and classification
* Creation of corpora and lexical resources for Arabic dialects
* Machine translation between MSA and dialects, and across different dialects
* Speech recognition and synthesis for dialectal Arabic
* Computational modelling of morphology, syntax, and semantics for dialects
* NLP applications (e.g., sentiment analysis, NER) for dialectal user-generated content
* Code-switching between Arabic dialects, MSA, and other languages

Submissions and Publication

_LaTeLL'2026_ welcomes high-quality submissions in English, which may take one of the following two forms:
* Regular papers: Up to eight (8) pages in length, presenting substantial, original, completed, and unpublished research.
* Short/poster papers: Up to four (4) pages in length, suitable for concise or focused contributions, ongoing research, negative results, system demonstrations, and similar work. Short papers will be presented during a dedicated poster session.

The conference will not consider submissions consisting of abstracts only.

All accepted papers (both long and short) will be published as electronic proceedings (with ISBN) and made available on the conference website at the time of the event. The organisers will submit the proceedings for inclusion in the ACL Anthology.
To prepare your submission, please make sure to use the LaTeLL'2026 style files available here:
LaTeX: https://drive.google.com/file/d/1RceWyUqjFLEbv_oNto-x2Quop7qT4-wf/view?usp=…
Word: https://docs.google.com/document/d/1m6VeC9jtMpe-Ku2QREgrPlE2-NTDvJvZ/edit?u… [2]
Overleaf: https://www.overleaf.com/read/ttzzfcnjrgvw#e82bef [3]

Papers should be submitted through Softconf/START using the following link: https://softconf.com/p/latell2026

Authors of papers receiving exceptionally positive reviews will be invited to prepare extended and substantially revised versions for submission to a leading journal in the field of Natural Language Processing (NLP).

The conference will also feature a Student Workshop, and awards will be presented to the authors of outstanding papers.

Important dates

Due to multiple requests, the submission deadline has been extended to 15 June 2026. In addition, note the revised conference dates below.
* Submissions due: 15 June 2026
* Notification of acceptance: 21 July 2026
* Camera-ready due: 31 July 2026
* Conference: 28, 29 and 30 September 2026

Keynote speaker

Nizar Habash (New York University Abu Dhabi)

Organisation

Conference Chair
Ruslan Mitkov (Lancaster University and University of Alicante)

Programme Committee Chairs
Saad Ezzini (King Fahd University of Petroleum & Minerals)
Salima Lamsiyah (University of Luxembourg)
Tharindu Ranasinghe (Lancaster University)

Organising Committee
Maram Alharbi (Lancaster University)
Salmane Chafik (Mohammed VI Polytechnic University)
Ernesto Estevanell (University of Alicante)
Milica Ikonić Nešić (University of Belgrade)
Shahin Yousefi (Institute for Advanced Studies in Basic Sciences, Zanjan)

Further information and contact details

The follow-up calls will provide more details on the conference venue and registration. The conference website is www.latell.org/2026/ [1] and will be updated on a regular basis.
For further information, please email 2026(a)latell.org

Conference registration is now open -- please visit the conference website for further details.

--
Amal Haddad Haddad (She/her)
Facultad de Traducción e Interpretación
Universidad de Granada | https://www.ugr.es/personal/amal-haddad-haddad
Lexicon Research Group | http://lexicon.ugr.es/haddad
Co-Convenor, BAAL SIG 'Humans, Machines, Language' | https://r.jyu.fi/humala
Event Coordinator, BAAL SIG 'Language, Learning and Teaching'

Links:
------
[1] http://www.latell.org/2026/
[2] https://docs.google.com/document/d/1m6VeC9jtMpe-Ku2QREgrPlE2-NTDvJvZ/edit?u…
[3] https://www.overleaf.com/latex/templates/latell-26-template/kfcvbgxmccvb

International Conference on Software and Systems Reuse, Product Lines, and Configuration (VARIABILITY 2026): Last Call for Project Showcases
by Announce 16 May '26

*** Last Call for Project Showcases ***

International Conference on Software and Systems Reuse, Product Lines, and Configuration (VARIABILITY 2026)
29 September - 2 October 2026, 5* St. Raphael Resort and Marina, Limassol, Cyprus
https://conf.researchr.org/home/variability-2026

The VARIABILITY conference series brings together the communities previously served by ICSR, SPLC, and VaMoS, forming a unified venue for research on variability, configuration, customization, and related disciplines in software and systems engineering.

As part of this mission, VARIABILITY 2026 invites submissions to its Project Showcase Track, a forum dedicated to presenting ongoing or recently completed research projects. The track offers a stage for research teams to share their vision, goals, early outcomes, intermediate results, final achievements, and lessons learned from funded projects of all scales, including collaborative research centers, EU projects, and nationally or regionally funded initiatives. The goal is to encourage interaction, foster collaboration opportunities, and help disseminate project insights to the broader community.

Objectives and Scope

We welcome submissions on research projects that address reuse, product lines, and variable/configurable software systems. A list of research topics relevant to this track is available in the call for papers for the VARIABILITY 2026 Research Track, at:
https://conf.researchr.org/track/variability-2026/variability-2026-papers#C…

Submissions are expected to describe ongoing or recently completed research projects within this scope. This track is not intended for publishing mature research results. Instead, it focuses on project summaries and overviews, highlighting goals, structure, challenges, insights, and project-level impact.
Examples of suitable submissions include:
• Ongoing projects focusing on goals, challenges, methodology, or early findings
• Recently completed projects summarizing outcomes, evidence, and impact
• Large-scale, collaborative, or multi-partner efforts, where visibility and networking are beneficial
• Smaller or emerging projects that would benefit from early feedback and exposure

PhD thesis projects are not in scope for this track. We warmly encourage PhD candidates to submit their work to the VARIABILITY 2026 Doctoral Symposium.

Submission Format
• Length: 7 to 10 pages, excluding references
• Format: LNCS (Springer), single-blind submissions

Each submission will receive feedback from three reviewers. All submissions must adhere to the LNCS (Springer) format. Please refer to the official LNCS template at:
https://www.springer.com/gp/computer-science/lncs/conference-proceedings-gu…

Submissions must be in PDF format and submitted via EasyChair: https://easychair.org/conferences/?conf=variability2026 (Select “Projects Showcase Track”).

Presentation and Publication

Accepted papers will appear in the VARIABILITY 2026 Companion Proceedings published by Springer in the LNCS series. Accepted submissions will receive a presentation slot. At least one author of each accepted paper must:
• Register for the full conference, and
• Present the contribution at the event

Evaluation Criteria

Submissions will be evaluated on:
• Relevance to the conference scope
• Clarity of project goals, context, and contributions
• Potential for impact, collaboration, reuse, or technology transfer
• Value for discussion and interaction at the conference

The focus is on clarity, relevance, and value to the community rather than scientific novelty.

Important Dates (AoE)
• Submission of Papers: 1 June 2026
• Notification of Acceptance: 21 June 2026
• Camera-Ready Submission: 15 July 2026
• Author Registration: 15 July 2026

Organisation

General Chairs
• George A. Papadopoulos, University of Cyprus, Cyprus
• Gilles Perrouin, FNRS & University of Namur, Belgium

Research Track Chairs
• Thorsten Berger, Ruhr University Bochum, Germany
• Ina Schaefer, KIT, Germany

Industry Track Chairs
• Shaukat Ali, Simula Research Lab and Oslo Metropolitan University, Norway
• Martin Becker, Fraunhofer IESE, Germany

Journal First Track Chairs
• Mathieu Acher, University Rennes, Inria, CNRS, IRISA, France
• Xhevahire Tërnava, LTCI, Télécom Paris, Institut Polytechnique de Paris, France

Doctoral Symposium Track Chairs
• Rick Rabiser, LIT CPS, Johannes Kepler University Linz, Austria
• Iris Reinhartz-Berger, University of Haifa, Israel

Demos and Tools Track Chairs
• Sandra Greiner, University of Southern Denmark, Denmark
• Leopoldo Teixeira, Federal University of Pernambuco

Projects Showcase Chairs
• Daniel Struber, Chalmers, University of Gothenburg, Radboud University, Sweden
• Dalila Tamzalit, Nantes Université, France

Hall of Fame Chairs
• Martin Becker, Fraunhofer IESE, Germany
• Goetz Botterweck, Lero - The Irish Software Research Centre and University of Limerick, Ireland
• Natsuko Noda, Shibaura Institute of Technology, Japan

Workshops Chairs
• Lidia Fuentes, Universidad de Malaga, Spain
• Malte Lochau, University of Siegen, Germany

Tutorials Chairs
• Loek Cleophas, Eindhoven University of Technology and Stellenbosch University, The Netherlands
• Mahsa Varshosaz, IT University of Copenhagen, Denmark

Proceedings Chair
• Sophie Fortz, King's College London, UK

Publicity Chairs
• Wesley Assunção, North Carolina State University, USA
• Kentaro Yoshimura, Hitachi Ltd, Japan

Local Organiser and Finance Chair
• George A. Papadopoulos, University of Cyprus, Cyprus

Universal Dependencies, release 2.18
by Dan Zeman 16 May '26

We are very happy to announce the twenty-fourth release of annotated treebanks in Universal Dependencies, v2.18, available at https://universaldependencies.org/. Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective (de Marneffe et al., 2021; Nivre et al., 2020). The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008). The general philosophy is to provide a universal inventory of categories and guidelines to facilitate consistent annotation of similar constructions across languages, while allowing language-specific extensions when necessary. The *353* treebanks in v2.18 are annotated according to version 2 of the UD guidelines and represent the following *193* languages: Abaza, Abkhaz, Afrikaans, Akkadian, Akuntsu, Albanian, Alemannic, Amharic, Ancient Greek, Ancient Hebrew, Apurina, Arabic, Armenian, Assamese, Assyrian, Azerbaijani, Bambara, Basque, Bavarian, Beja, Belarusian, Bengali, Bhojpuri, Bokota, Bororo, Brahui, Breton, Bulgarian, Buryat, Cantonese, Cappadocian, Catalan, Cebuano, Central Kurdish, Chinese, Chintang, Chukchi, Classical Armenian, Classical Chinese, Coptic, Croatian, Czech, Danish, Dutch, Egyptian, English, Erzya, Esperanto, Estonian, Faroese, Finnish, French, Frisian Dutch, Galician, Georgian, German, Gheg, Gorontalo, Gothic, Greek, Guajajara, Guarani, Gujarati, Gwichin, Haitian Creole, Hausa, Hebrew, Highland Puebla Nahuatl, Hindi, Hittite, Hungarian, Icelandic, Ika, Indonesian, Irish, Italian, Japanese, Javanese, Kaapor, Kadiweu, Kangri, Karelian, Karo, Kazakh, Khoekhoe, Kiche, Komi Permyak, Komi Zyrian, Korean, Kyrgyz, Latgalian, Latin, Latvian, 
Ligurian, Lithuanian, Livvi, Low Saxon, Luxembourgish, Macedonian, Madi, Maghrebi Arabic French, Makurap, Malayalam, Maltese, Manx, Marathi, Mbya Guarani, Middle Armenian, Middle French, Moksha, Munduruku, Naga, Naija, Neapolitan, Nenets, Nepali, Nheengatu, North Sami, Northern Kurdish, Northwest Gbaya, Norwegian, Occitan, Odia, Old Church Slavonic, Old East Slavic, Old English, Old French, Old Georgian, Old Irish, Old Occitan, Old Turkish, Ottoman Turkish, Pashto, Paumari, Persian, Pesh, Phrygian, Polish, Pomak, Portuguese, Punjabi, Romanian, Russian, Ruuli, Sanskrit, Scottish Gaelic, Serbian, Shanghainese, Sicilian, Sindhi, Sinhala, Skolt Sami, Slovak, Slovenian, South Levantine Arabic, Southern Kurdish, Spanish, Spanish Sign Language, Swedish, Swedish Sign Language, Tagalog, Tamil, Tatar, Teko, Telugu, Telugu English, Thai, Tswana, Tupinamba, Turkish, Turkish English, Turkish German, Ukrainian, Umbrian, Upper Sorbian, Urdu, Uyghur, Uzbek, Veps, Vietnamese, Warlpiri, Welsh, Western Armenian, Western Sierra Puebla Nahuatl, Wolof, Xavante, Xibe, Yakut, Yiddish, Yoruba, Yupik, Zaar and Zazaki. The 193 languages belong to *36* families: Afro-Asiatic, Arawakan, Arawan, Austro-Asiatic, Austronesian, Basque, Bororoan, Chibchan, Chukotko-Kamchatkan, Code switching, Constructed, Creole, Dravidian, Eskimo-Aleut, Guaicuruan, Indo-European, Japanese, Kartvelian, Khoe-Kwadi, Korean, Macro-Je, Mande, Mayan, Mongolic, Na-Dene, Niger-Congo, Northwest Caucasian, Pama-Nyungan, Sign Language, Sino-Tibetan, Tai-Kadai, Tungusic, Tupian, Turkic, Uralic and Uto-Aztecan. Depending on the language, the treebanks range in size from less than 1,000 tokens to over 3 million tokens. We expect the next release to be available in November 2026. 
The size of the following 41 treebanks changed significantly since the last release:

Abkhaz AbNC : 10585 → 13054
Assamese AiW : 0 → 871
Brahui Kholum : 0 → 819
Cappadocian AMGiC : 451 → 820
Egyptian PC : 0 → 34234
Egyptian UJaen : 24375 → 0
Gorontalo BungoLoLombi : 0 → 205
Greek GLCII : 0 → 9868
Greek Lesbian : 5926 → 6840
Hausa EasternAutogramm : 0 → 9485
Hausa NorthernAutogramm : 4158 → 15424
Hebrew PostRab : 0 → 8029
Italian KIParlaForest : 9348 → 18680
Kadiweu Unicamp : 0 → 318
Khunsari AHA : 74 → 0
Korean KSL : 137122 → 155104
Latin CIRCSE : 24899 → 29055
Marathi CMUPAN : 0 → 118198
Mbya Guarani Dooley : 11771 → 0
Middle Armenian ArmTDP : 0 → 1093
Middle French PROFITEROLE : 89809 → 119001
Nayini AHA : 78 → 0
Nepali BK : 0 → 801
Odia ODTB : 1029 → 5818
Old Georgian GLC : 0 → 6884
Old Occitan CorAG : 45389 → 52539
Ottoman Turkish DUDU : 17125 → 22027
Ottoman Turkish TueCL : 0 → 929
Pashto Prince : 0 → 1180
Pashto Sikaram : 4067 → 5467
Phrygian KUL : 1687 → 1921
Punjabi CS : 0 → 1903
Punjabi Rang : 0 → 1907
Ruuli RDT : 0 → 6301
Sinhala Appuwa : 0 → 669
Soi AHA : 55 → 0
Swedish SweLL : 8644 → 10895
Turkish English BUTR : 393 → 441
Western Sierra Puebla Nahuatl ITML : 9461 → 0
Western Sierra Puebla Nahuatl MesoTree : 0 → 19535
Zazaki ZSD : 0 → 1441

In total, the new release contains *2,337,062* sentences, 37,424,698 surface tokens and *38,181,322* syntactic words.
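
The size-change entries above follow a regular "name : old → new" pattern, so they can be tallied mechanically. The following is only a minimal sketch in Python, assuming the entries have been copied into a string with one entry per line. The helper name `parse_changes` and the labels "added", "removed" and "resized" are illustrative and not part of any UD tooling, and reading a size of 0 as "treebank absent from that release" is an interpretation rather than something the announcement states.

```python
import re

# A few sample entries copied from the announcement
# (format: "<treebank> : <old size> → <new size>").
SAMPLE = """\
Abkhaz AbNC : 10585 → 13054
Assamese AiW : 0 → 871
Egyptian UJaen : 24375 → 0
Western Sierra Puebla Nahuatl MesoTree: 0 → 19535
"""

# Hypothetical pattern for one entry; whitespace around ":" and "→" is optional.
ENTRY = re.compile(r"^(?P<name>.+?)\s*:\s*(?P<old>\d+)\s*→\s*(?P<new>\d+)$")


def parse_changes(text):
    """Return (name, old, new, delta, kind) tuples for each well-formed line."""
    rows = []
    for line in text.splitlines():
        m = ENTRY.match(line.strip())
        if not m:
            continue  # skip blank or malformed lines
        old, new = int(m["old"]), int(m["new"])
        # Assumption: a size of 0 means the treebank is absent from that release.
        if old == 0:
            kind = "added"
        elif new == 0:
            kind = "removed"
        else:
            kind = "resized"
        rows.append((m["name"], old, new, new - old, kind))
    return rows


for name, old, new, delta, kind in parse_changes(SAMPLE):
    print(f"{name:45} {kind:8} {delta:+d}")
```

Pasting the full list into `SAMPLE` would give the same breakdown for all 41 entries.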
Daniel Zeman, Joakim Nivre, Rimsha Abid, Mitchell Abrams, Elia Ackermann, Jephtey Adolphe, Noëmi Aepli, Muhammad Afzal, Željko Agić, Lars Ahrenberg, Chika Kennedy Ajede, Arofat Akhundjanova, Furkan Akkurt, Orly Albek, Gabrielė Aleksandravičiūtė, Dominick Maia Alexandre, Ika Alfina, Avner Algom, Khalid Alnajjar, Chiara Alzetta, Antonios Anastasopoulos, Erik Andersen, Kirk Andrews, Matthew Andrews, Lene Antonsen, Tatsuya Aoyama, Katya Aplonova, Angelina Aquino, Carolina Aragon, Glyd Aranes, Maria Jesus Aranzabe, Bilge Nas Arıcan, Þórunn Arnardóttir, Wirote Aroonmanakun, Gashaw Arutie, Jessica Naraiswari Arwidarasti, Hiwa Asadpour, Masayuki Asahara, Katla Ásgeirsdóttir, Deniz Baran Aslan, Cengiz Asmazoğlu, Luma Ateyah, Furkan Atmaca, Mohammed Attia, Aitziber Atutxa, Liesbeth Augustinus, Mariana Avelãs, Salwan Aziz, Elena Badmaeva, Jana Bajorat, Keerthana Balasubramani, Miguel Ballesteros, Esha Banerjee, Sebastian Bank, Bryan Khelven da Silva Barbosa, Verginica Barbu Mititelu, Starkaður Barkarson, Rodolfo Basile, Victoria Basmov, Colin Batchelor, John Bauer, Seyyit Talha Bedir, Zarina Begum, Shabnam Behzad, Nathanaël Beiner, Juan Belieni, Alevtina Bémová, Kepa Bengoetxea, İbrahim Benli, Yifat Ben Moshe, Marie Benzerrak, Aleksandrs Berdicevskis, Ansu Berg, Gözde Berk, Delphine Bernhard, Astrid Berntsson Ingelstam, Riyaz Ahmad Bhat, Erica Biagetti, Eckhard Bick, Agnė Bielinskienė, Esma Fatıma Bilgin Taşdemir, Helin Binici, Kristín Bjarnadóttir, Samuel BK, Verena Blaschke, Rogier Blokland, Nina Böbel, Victoria Bobicev, Loïc Boizou, Stavros Bompolas, Morgane Bona, Johnatan Bonilla, Emanuel Borges Völker, Lars Borin, Carl Börstell, Cristina Bosco, Gosse Bouma, Sam Bowman, Adriane Boyd, Anouck Braggaar, António Branco, Myriam Bras, Elisheva Brauner, Théo Brillet, Kristina Brokaitė, Lanni Bu, Eva Buráňová, Aljoscha Burchardt, Carmen Cabeza, Natalia Cáceres Arandia, Olesea Caftanatov, Marisa Campos, Marie Candito, Caterina Maria Cappello, Bernard Caron, Gauthier Caron, 
Catarina Carvalheiro, Rita Carvalho, Lauren Cassidy, Maria Clara Castro, Sérgio Castro, Tatiana Cavalcanti, Gülşen Cebiroğlu Eryiğit, Flavio Massimiliano Cecchini, Giuseppe G. A. Celano, Anila Çepani, Slavomír Čéplö, Neslihan Cesur, Savas Cetin, Özlem Çetinoğlu, Fabricio Chalub, Liyanage Chamila, Claudine Chamoreau, Aditi Chaudhary, Shweta Chauhan, Yifei Chen, Ethan Chi, Taishi Chika, Yongseok Cho, Jinho Choi, Bermet Chontaeva, Jayeol Chun, Juyeon Chung, Alessandra T. Cignarella, Silvie Cinková, Esther Cocco, Aurélie Collomb, Çağrı Çöltekin, Miriam Connor, Claudia Corbetta, Daniela Corbetta, Francisco Costa, Marine Courtin, Benoît Crabbé, Mihaela Cristescu, Vladimir Cvetkoski, Netanel Dahan, Ingerid Løyning Dale, Sabrina D'Alì, Philemon Daniel, Anna S. Danielyan, Khensa Daoudi, Bijayalaxmi Dash, Satya Ranjan Dash, Elizabeth Davidson, Leonel Figueiredo de Alencar, Mathieu Dehouck, Martina de Laurentiis, Marie-Catherine de Marneffe, Ahmet Demir, Valeria de Paiva, Mehmet Oguz Derin, Elvis de Souza, Arantza Diaz de Ilarraza, Roberto Antonio Díaz Hernández, Carly Dickerson, Ariani Di Felippo, Arawinda Dinakaramani, Elisa Di Nuovo, Bamba Dione, Peter Dirix, Hoa Do, Kaja Dobrovoljc, Mahîr Dogan, Caroline Döhmer, Adrian Doyle, Timothy Dozat, Kira Droganova, Magali Sanches Duran, Puneet Dwivedi, Andrew Dyer, Andrew Thomas Dyer, Christian Ebert, Hanne Eckhoff, Masaki Eguchi, Sandra Eiche, Roald Eiselen, Marhaba Eli, Ali Elkahky, Binyam Ephrem, Olga Erina, Tomaž Erjavec, Louise Esher, Soudabeh Eslami, Farah Essaidi, Aline Etienne, Wograine Evelyn, Sidney Facundes, Richárd Farkas, Ján Faryad, Federica Favero, Jannatul Ferdaousi, Marília Fernanda, Hector Fernandez Alcalde, Amal Fethi, Jennifer Foster, Barbara Francioni, Theodorus Fransen, Cláudia Freitas, Shlomit Fuchs, Kazunori Fujita, Katarína Gajdošová, Daniel Galbraith, Charlotte Chambelland Galves, Edith Galy, Federica Gamba, Marcos Garcia, José María García-Miguel, Moa Gärdenfors, Tanja Gaustad, Efe Eren Genç, Fabrício 
Ferraz Gerardi, Kim Gerdes, Luke Gessler, Filip Ginter, Gustavo Godoy, Iakes Goenaga, Koldo Gojenola, Memduh Gökırmak, Yoav Goldberg, Gili Goldin, Xavier Gómez Guinovart, Berta González Saavedra, Mathieu Goux, Caroline Grand-Clement, Bernadeta Griciūtė, Matias Grioni, Loïc Grobol, Normunds Grūzītis, Mario Guglielmetti, Bruno Guillaume, Kirian Guiller, Céline Guillot-Barbance, Tunga Güngör, Vladimir Gurevich, Nizar Habash, Hinrik Hafsteinsson, Michael Hahn, Jan Hajič, Jan Hajič jr., Eva Hajičová, Mika Hämäläinen, Linh Hà Mỹ, Na-Rae Han, Muhammad Yudistira Hanifmuti, Takahiro Harada, Sam Hardwick, Kim Harris, Naïma Hassert, Dag Haug, Jiří Havelka, Johannes Heinecke, Oliver Hellwig, Felix Hennig, Barbora Hladká, Jaroslava Hlaváčová, Florinel Hociung, Diana Hoefels, Barbara Hoff, Petter Hohle, Nick Howell, Yidi Huang, Marivel Huerta Mendez, Jena Hwang, Takumi Ikeda, Inessa Iliadou, Anton Karl Ingason, Radu Ion, Elena Irimia, Ọlájídé Ishola, Artan Islamaj, Kaoru Ito, Federica Iurescia, Jessica K. 
Ivani, Sandra Jagodzińska, Siratun Jannat, Tomáš Jelínek, Apoorva Jha, Ratanon Jiamsundutsadee, Katharine Jiang, Sylvanus Job, Mayank Jobanputra, Anders Johannsen, Hildur Jónsdóttir, Fredrik Jørgensen, Zhuoxuan Ju, María Ximena Juarez Huerta, Markus Juutinen, Hüner Kaşıkara, Nadezhda Kabaeva, Sylvain Kahane, Hiroshi Kanayama, Jenna Kanerva, Neslihan Kara, Ritván Karahóǧa, Jiří Kárník, Andre Kåsen, Tolga Kayadelen, Sarveswaran Kengatharaiyer, Václava Kettnerová, Ali Haider Khan, Lilit Kharatyan, Jesse Kirchner, Elena Klementieva, Christina Klironomou, Elena Klyachko, Petr Kocharov, Arne Köhn, Abdullatif Köksal, Veronika Kolářová, Kamil Kopacewicz, Timo Korkiakangas, Mehmet Köse, Alexey Koshevoy, Nelda Kote, Natalia Kotsyba, Barbara Kovačić, Jolanta Kovalevskaitė, Emmanuelle Kowner, Simon Krek, Parameswari Krishnamurthy, Sandra Kübler, Lucie Kučová, Adrian Kuqi, Elmurod Kuriyozov, Pranav Kushare, Oğuzhan Kuyrukçu, Aslı Kuzgun, Sookyoung Kwak, Kris Kyle, Käbi Laan, Veronika Laippala, Lorenzo Lambertino, Israel Landau, Tatiana Lando, Septina Dian Larasati, Pierre Larrivée, Kusum Lata, Alexei Lavrentiev, John Lee, Phương Lê Hồng, Alessandro Lenci, Wei Qi Leong, Saran Lertpradit, Herman Leung, Lori Levin, Maria Levina, Lauren Levine, Cheuk Ying Li, Josie Li, Keying Li, Yixuan Li, Yuan Li, KyungTae Lim, Bruna Lima Padovani, Yi-Ju Jessica Lin, Krister Lindén, Yitzchak Lindenbaum, Yang Janet Liu, Zoey Liu, Nikola Ljubešić, Irina Lobzhanidze, Olga Loginova, Markéta Lopatková, Lucelene Lopes, Edita Luftiu, Arsenii Lukashevskyi, Stefano Lusito, Anne-Marie Lutgen, Andry Luthfi, Mikko Luukko, Olga Lyashevskaya, Teresa Lynn, Vivien Macketanz, Menel Mahamdi, Jean Maillard, Punyanuch Maitreenukul, Ilya Makarchuk, Aibek Makazhanov, Francesco Mambrini, Michael Mandl, Christopher Manning, Ruli Manurung, Büşra Marşan, Cătălina Mărănduc, David Mareček, Katrin Marheinecke, Stella Markantonatou, Ángeles Márquez Hernández, Héctor Martínez Alonso, Lorena Martín Rodríguez, André Martins, 
Cláudia Martins, Arianna Masciolini, Jan Mašek, Sanatbek Matlatipov, Hiroshi Matsuda, Yuji Matsumoto, Caterina Mauri, Alessandro Mazzei, Ryan McDonald, Sarah McGuinness, Maitrey Mehta, Ephraim Meiri, Pierre André Ménard, Gustavo Mendonça, Hilla Merhav, Tatiana Merzhevich, Paul Meurer, Niko Miekka, Marie Mikulová, Emilia Milano, Aleksandra Miletić, Aaron Miller, Junghyun Min, Yael Minerbi, Jiří Mírovský, Karina Mischenkova, Anna Missilä, Cătălin Mititelu, Maria Mitrofan, Yusuke Miyao, Biswakalpita Mohapatra, Judit Molnár, Amirsaeid Moloodi, Simonetta Montemagni, Amir More, Laura Moreno Romero, Giovanni Moretti, Shinsuke Mori, Tomohiko Morioka, Shigeki Moro, Bjartur Mortensen, Bohdan Moskalevskyi, Katerina Mouzou, Kadri Muischnek, Robert Munro, Yugo Murawaki, Nikolett Mus, Kaili Müürisep, Pinkey Nainwani, Mariam Nakhlé, Minoo Nassajian, Juan Ignacio Navarro Horñiacek, Anna Nedoluzhko, Gunta Nešpore-Bērzkalne, Manuela Nevaci, Lương Nguyễn Thị, Huyền Nguyễn Thị Minh, Yoshihiro Nikaido, Vitaly Nikolaev, Rattima Nitisaroj, Victor Norrman, Alireza Nourian, Michal Novák, Maria das Graças Volpe Nunes, Hanna Nurmi, Colleen Alena O'Brien, Stina Ojala, Atul Kr. 
Ojha, Hulda Óladóttir, Adédayọ̀ Olúòkun, Mai Omura, Emeka Onwuegbuzia, Noam Ordan, Petya Osenova, Robert Östling, Annika Ott, Lilja Øvrelid, Masanori Oya, Şaziye Betül Özateş, Merve Özçelik, Arzucan Özgür, Balkız Öztürk Başaran, Teresa Paccosi, Petr Pajas, Thomas Palakapilly, Alessio Palmero Aprosio, Jarmila Panevová, Ludovica Pannitto, Anastasia Panova, Thiago Alexandre Salgueiro Pardo, Shantipriya Parida, Hyunji Hayley Park, Niko Partanen, Elena Pascual, Thelka Pasparaki, Marco Passarotti, Agnieszka Patejuk, Guilherme Paulino-Passos, Giulia Pedonese, Oggi Peeters, Angelika Peljak-Łapińska, Siyao Peng, Siyao Logan Peng, Rita Pereira, Sílvia Pereira, Cenel-Augusto Perez, Natalia Perkova, Guy Perrier, Slav Petrov, Daria Petrova, Eva Pettersson, Andrea Peverelli, Jason Phelan, Claudel Pierre-Louis, Jussi Piitulainen, Yuval Pinter, Clara Pinto, Rodrigo Pintucci, Tommi A Pirinen, Emily Pitler, Magdalena Plamada, Barbara Plank, Alistair Plum, Thierry Poibeau, Charin Polpanumas, Larisa Ponomareva, Martin Popel, Clamença Poujade, Rangga Prangwedana Prangwedana, Lauma Pretkalniņa, Rigardt Pretorius, Sophie Prévost, Prokopis Prokopidis, Adam Przepiórkowski, Robert Pugh, Tiina Puolakainen, Christoph Purschke, Sampo Pyysalo, Peng Qi, Andreia Querido, Andriela Rääbis, Ella Rabinovich, Alexandre Rademaker, Mutee-u Rahman, Mizanur Rahoman, Taraka Rama, Loganathan Ramasamy, Carlos Ramisch, Joana Ramos, Fam Rashel, Mohammad Sadegh Rasooli, Vinit Ravishankar, Livy Real, Petru Rebeja, Siva Reddy, Mathilde Regnault, Georg Rehm, Arij Riabi, Ivan Riabov, Michael Rießler, Erika Rimkutė, Larissa Rinaldi, Laura Rituma, Putri Rizqiyah, Luisa Rocha, Eiríkur Rögnvaldsson, Ivan Roksandic, Norton Trevisan Roman, Mykhailo Romanenko, Natalia Romanova, Rudolf Rosa, Valentin Roșca, Paulette Roulon, Davide Rovati, Ben Rozonoyer, Olga Rudina, Jack Rueter, Paolo Ruffolo, Kristján Rúnarsson, Rozana Rushiti, Attapol T. 
Rutherford, Shoval Sadde, Pegah Safari, Aleksi Sahala, Kalyanamalini Sahoo, Saraswati Sahoo, Shadi Saleh, Alessio Salomoni, Tanja Samardžić, Warangana Sammani, Konstantinos Sampanis, Stephanie Samson, Xulia Sánchez-Rodríguez, Filomena Spatti Sandalo, Manuela Sanguinetti, Ezgi Sanıyar, Dage Särg, Marta Sartor, Albina Sarymsakova, Mitsuya Sasaki, Baiba Saulīte, Agata Savary, Yanin Sawanakunanon, Shefali Saxena, Kevin Scannell, Salvatore Scarlata, Emmanuel Schang, Robert Schikowski, Nathan Schneider, Sebastian Schuster, Lane Schwartz, Djamé Seddah, Wolfgang Seeker, Sven Sellmer, Kaushik Sengupta, Mojgan Seraji, Magda Ševčíková, Petr Sgall, Syeda Shahzadi, Mo Shen, Atsuko Shimada, Gyu-Ho Shin, Hiroyuki Shirasu, Yana Shishkina, Avi Shmidman, Muh Shohibussirri, Maria Shvedova, Jean Sibille, Janine Siewert, Einar Freyr Sigurðsson, João Silva, Aline Silveira, Natalia Silveira, Sara Silveira, Maria Simi, Radu Simionescu, Katalin Simkó, Mária Šimková, Haukur Barri Símonarson, Kiril Simov, Dmitri Sitchinava, Ted Sither, Aaron Smith, Isabela Soares-Bastos, Per Erik Solberg, Dolores Sollberger, Barbara Sonnenhauser, Shafi Sourov, Nina Speransky, Rachele Sprugnoli, Panyut Sriwirote, Vivian Stamou, Steinþór Steingrímsson, Antonio Stella, Jan Štěpánek, Barbora Štěpánková, Abishek Stephen, Milan Straka, Omer Strass, Emmett Strickland, Jana Strnadová, Sara Stymne, Alane Suhr, Yogi Lesmana Sulestio, Umut Sulubacak, Jack Sun, Hakyung Sung, Shingo Suzuki, Daniel Swanson, Zsolt Szántó, Maria Irena Szawerna, Chihiro Taguchi, Dima Taji, Rachel Tal, Luigi Talamo, Fabio Tamburini, Mary Ann C. 
Tan, Takaaki Tanaka, Dipta Tanaya, Mirko Tavoni, Nursena Teker, Samson Tella, Isabelle Tellier, Marinella Testori, Santhawat Thanyawong, Guillaume Thomas, Tarık Emre Tıraş, William Chandra Tjhi, Thea Tollersrud, Kamil Tomaszek, Sara Tonelli, Liisi Torga, Lucas Toribio, Marsida Toska, Trond Trosterud, Anna Trukhina, Reut Tsarfaty, Kira Tulchynska, Utku Türk, Francis Tyers, Sveinbjörn Þórðarson, Vilhjálmur Þorsteinsson, Sumire Uematsu, Roman Untilov, Zdeňka Urešová, Larraitz Uria, Hans Uszkoreit, Andrius Utka, Elena Vagnoni, Sowmya Vajjala, Socrates Vak, Socrates Vakirtzian, Rob van der Goot, Martine Vanhove, Daniel van Niekerk, Gertjan van Noord, Viktor Varga, Helena Vaz, Uliana Vedenina, Giulia Venturi, Marianne Vergez-Couret, Annemarie Verkerk, Luiz Veronesi, Anna Veselovsky, Barbora Vidová Hladká, Eric Villemonte de la Clergerie, Veronika Vincze, Anishka Vissamsetty, Natalia Vlasova, Eleni Vligouridou, Alan Vogel, Aya Wakasa, Joel C. Wallenberg, Lars Wallin, Abigail Walsh, John Wang, Jonathan North Washington, Leonie Weissweiler, Maximilan Wendt, Paul Widmer, Aleksandra Wieczorek, Shira Wigderson, Sri Hartati Wijono, Vanessa Berwanger Wille, Seyi Williams, Miriam Winkler, Shuly Wintner, Mats Wirén, Christian Wittern, Alena Witzlack-Makarevich, Tsegay Woldemariam, Tak-sum Wong, Alina Wróblewska, Qishen Wu, Mary Yako, Kayo Yamashita, Naoki Yamazaki, Chunxiao Yan, Qizhen Yang, Xiulin Yang, Koichi Yasuoka, Marat M. Yavrumyan, Arife Betül Yenice, Enes Yılandiloğlu, Olcay Taner Yıldız, Zhuoran Yu, Arlisa Yuliawati, Zdeněk Žabokrtský, Shorouq Zahra, Amir Zeldes, Annie Zhang, Larry Zhang, He Zhou, Hanzhi Zhu, Yilun Zhu, Anna Zhuravleva, Rayan Ziane, Artūrs Znotiņš, Eleonora Zucchini References Marie-Catherine de Marneffe, Christopher Manning, Joakim Nivre, Daniel Zeman. 2021. Universal Dependencies. In Computational Linguistics 47:2, pp. 255–308. Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. 
Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, Daniel Zeman. 2020. Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection. In Proceedings of LREC. -------------------------------------------------------------------------------- Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D. Manning. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of LREC. Marie-Catherine de Marneffe and Christopher D. Manning. 2008. The Stanford typed dependencies representation. In COLING Workshop on Cross-framework and Cross-domain Parser Evaluation. Marie-Catherine de Marneffe, Timothy Dozat, Natalia Silveira, Katri Haverinen, Filip Ginter, Joakim Nivre, and Christopher Manning. 2014. Universal Stanford Dependencies: A cross-linguistic typology. In Proceedings of LREC. Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Christopher D. Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, Daniel Zeman. 2016. Universal Dependencies v1: A Multilingual Treebank Collection. In Proceedings of LREC. Slav Petrov, Dipanjan Das, and Ryan McDonald. 2012. A universal part-of-speech tagset. In Proceedings of LREC. Daniel Zeman. 2008. Reusable Tagset Conversion Using Tagset Drivers. In Proceedings of LREC.

Undergraduate summer position (minimum wage, Ireland)
by Carl Vogel 15 May '26

Hello.

A position suitable for a qualified undergraduate student with the right to work in Ireland has become available. Questions may be directed to fintan.obyrne(a)adaptcentre.ie with subject line: Research Assistant UX/Interaction Designer.

For details, see: https://www.adaptcentre.ie/careers/research-assistant-ux-interaction-design…

Colleagues are seeking a research assistant UX/Interaction Designer to support the team on a temporary basis for the following workloads:

● AI Dataset pre-processing tasks – Review large dataset of CC0 images and carry out batch operations on audited subsets to normalise the data for model training.
● Support the maintenance of a documented design system that will be utilised for the generation of screens for various workflows
● Mocking screenflows using AI assisted design tools and processes
● Visual testing of screens and logging of issues
● Manual content population of art catalogues for pilot projects

Qualifications and skills

● This position is ideally suited to an undergraduate studying UX/Interaction Design
● Experience with Photoshop, Figma essential
● Experience with AI assisted design tools like Claude Design essential
● Knowledge and familiarity with the history of art a definite advantage
● Prior experience working with large visual datasets a definite advantage

This is a hybrid 3-month specific purpose research assistant position. Candidates must be available to work on site in the ADAPT Centre in Trinity College Dublin as required.

Please direct any questions to fintan.obyrne(a)adaptcentre.ie with subject line: Research Assistant UX/Interaction Designer.

All my best,
Carl Vogel

May 2026 Newsletter - LDC
by Penn LDC 15 May '26

In this newsletter:

New publications:
MADCAT Phases 1-3 Composite Evaluation Set <https://catalog.ldc.upenn.edu/LDC2026T05>
CALLHOME German Second Edition <https://catalog.ldc.upenn.edu/LDC2026S06>
CALLHOME German Lexicon Second Edition <https://catalog.ldc.upenn.edu/LDC2026L04>

________________________________

New publications:

MADCAT Phases 1-3 Composite Evaluation Set <https://catalog.ldc.upenn.edu/LDC2026T05> contains the evaluation data created by LDC for Phases 1-3 of the DARPA MADCAT program and the NIST OpenHaRT <https://www.nist.gov/itl/iad/mig/openhart> 2010 and 2013 evaluations. It consists of handwritten Arabic documents scanned at high resolution and annotated for the physical coordinates of each line and token, digital transcripts, and English translations with content and annotation layers integrated in a single MADCAT XML output. This release includes 1,643 images and corresponding annotation files.

Source documents were web text and newswire collected by LDC. Arabic-speaking scribes copied documents by hand, following specific instructions as to the writing style, writing implement, and paper. Each page was scanned and the images annotated.

The goal of the MADCAT program was to automatically convert foreign language text images into English transcripts for use by humans and downstream processes, including summarization and information extraction. The core evaluation task in MADCAT was the translation of handwritten Arabic documents.

2026 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.

*

CALLHOME German Second Edition <https://catalog.ldc.upenn.edu/LDC2026S06> was developed by LDC and contains 48 hours of speech from 100 unscripted telephone conversations between native German speakers.

This publication is a re-release of the original CALLHOME German collection, combining CALLHOME German Speech (LDC97S43) <https://catalog.ldc.upenn.edu/LDC97S43> and CALLHOME German Transcripts (LDC97T15) <https://catalog.ldc.upenn.edu/LDC97T15>, with additional transcription and updated directory structure, file formats, and documentation. This release contains the 100 telephone conversations published in CALLHOME German Speech which represented training data (80 calls) and development data (20 calls).

Participants spoke on topics of their choice in a single telephone call lasting up to 30 minutes. Calls were manually audited for language, recording quality, channel characteristics, dialect, and region. For this second edition, all audio was converted from SPHERE files to FLAC format, and the original training/development partitioning was removed. This release also features revised transcripts conforming to updated LDC transcription guidelines that addressed normalization of annotation formats, standardization of speaker-produced and background noises, application of foreign-language marking, whitespace cleanup, and corrections and consistency fixes.

The CALLHOME series consists of telephone conversations and transcripts developed by LDC and Rutgers, The State University of New Jersey, in support of research in speaker identification, language identification, and related technologies. Languages in the series include American English, Egyptian Arabic, German, Japanese, Mandarin Chinese, and Spanish.

2026 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.

*

CALLHOME German Lexicon Second Edition <https://catalog.ldc.upenn.edu/LDC2026L04> was developed by LDC and contains 318,809 German words with morphological, phonological, stress, and frequency information. This second edition updates file formats, directory structure, and documentation.

The first edition is available as CALLHOME German Lexicon (LDC97L18) <https://catalog.ldc.upenn.edu/LDC97L18>. The words in the lexicon were derived from the CELEX German lexicon (CELEX2 (LDC96L14) <https://catalog.ldc.upenn.edu/LDC96L14>) and from 100 training and development transcripts representing unscripted telephone conversations between native German speakers contained in CALLHOME German Second Edition, LDC2026S06.

The lexicon has seven tab-separated information fields:

(1) headword: orthographic form
(2) morph: morphological analysis of the headword
(3) pron: pronunciation of the headword
(4) stress: primary stress information of the word
(5) celex: whether the headword appears in the CELEX German lexicon
(6) train_freq: frequency of the headword in the CALLHOME training transcripts
(7) dev_freq: frequency of the headword in the CALLHOME development transcripts

This release also includes a pronunciation dictionary derived from the lexicon in CMUdict <https://stdlib.io/docs/api/latest/@stdlib/datasets/cmudict> format.

2026 members can access this corpus through their LDC accounts provided they have submitted a completed copy of the special license agreement. Non-members may license this data for a fee.

To unsubscribe from this newsletter, log in to your LDC account <https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.

Membership Coordinator
Linguistic Data Consortium <ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu <mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810 Philadelphia, PA 19104
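
For readers planning to work with the CALLHOME German Lexicon Second Edition announced in this newsletter, the seven tab-separated fields could be read along the following lines. This is only a minimal sketch and not LDC tooling: the names parse_lexicon and LexiconEntry are hypothetical, the sample line in the usage note is invented, and the code assumes one entry per line with integer frequency columns. The actual file name, character encoding, and the value scheme of the celex field should be taken from the corpus documentation.

```python
import csv
import io
from dataclasses import dataclass

# Field order as listed in the newsletter.
FIELDS = ("headword", "morph", "pron", "stress", "celex", "train_freq", "dev_freq")


@dataclass
class LexiconEntry:
    headword: str
    morph: str
    pron: str
    stress: str
    celex: str       # kept as text: the value encoding is not stated in the newsletter
    train_freq: int  # assumption: frequencies are plain integers
    dev_freq: int


def parse_lexicon(stream):
    """Yield one LexiconEntry per well-formed line of a tab-separated stream."""
    # QUOTE_NONE keeps quote characters in the data from being interpreted.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(FIELDS):
            continue  # skip blank or malformed lines
        *text_fields, train, dev = row
        yield LexiconEntry(*text_fields, int(train), int(dev))
```

A call such as `list(parse_lexicon(io.StringIO("Haus\tHaus\thaUs\t1\tY\t12\t3\n")))` would return a single entry for the invented sample line. Lines with the wrong number of fields are skipped silently, while a non-integer frequency raises ValueError, which seems a reasonable default for a first look at the data.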

[EXTENDED DEADLINE] CFP: CLEF 2026 – Submission Deadline Extended to 25 May 2026 (AoE)
by Philipp Schaer 14 May '26

Dear colleagues,

The submission deadline for CLEF 2026 has been extended to 25 May 2026 (AoE). We invite submissions to the 17th Conference and Labs of the Evaluation Forum (CLEF 2026), to be held in Jena, Germany, from 21–24 September 2026.

CLEF 2026
Conference and Labs of the Evaluation Forum
Information Access Evaluation meets Multilinguality, Multimodality, and Visualization
https://clef2026.clef-initiative.eu/calls/papers/

Important Dates (AoE)

* 25 May 2026: Extended full paper submission deadline (Long; Short; Past, Present, Future)
* 29 May 2026: Best of 2025 Labs paper submission
* 26 June 2026: Notification of acceptance
* 17 July 2026: Camera-ready version due
* 21–24 September 2026: Conference in Jena, Germany

Aim and Scope

The CLEF Conference addresses all aspects of Information Access in any modality and language. CLEF consists of the presentation of research papers and a series of workshops presenting the results of lab-based comparative evaluation benchmarks. CLEF 2026 continues the CLEF campaigns running since 2000, contributing to the systematic evaluation of information access systems through experimentation on shared tasks.

The conference focuses on experimental Information Access as carried out within evaluation forums such as CLEF Labs, TREC, NTCIR, FIRE, MediaEval, RomIP, SemEval, and TAC, with particular attention to multimodality, multilinguality, and interactive search. CLEF welcomes submissions describing rigorous hypothesis testing regardless of whether results are positive or negative. Reproducibility and clear research design are strongly encouraged, as are links to code and data repositories.

Topics of Interest

* Information retrieval, question answering, recommender systems, image retrieval, search interfaces, and infrastructures
* Interactive and conversational search evaluation, including RAG systems
* Analytics for information access
* Reproducibility and replicability studies
* Fairness, accountability, transparency, ethics, and explainability (FATE)
* Low-resource and multilingual information access
* Collaborative and social data models
* User studies and crowdsourcing
* Evaluation methodologies, metrics, and statistical tools
* Technology transfer and deployment
* Domain-specific applications (health, legal, cultural heritage, social media, etc.)
* New data collections
* Reflections on past achievements and future research directions

Paper Categories

* Long research papers: 12 pages + references
* Short research papers: 6 pages + references
* Past, Present, Future papers: up to 12 pages

For details on submission and formatting, please refer to https://clef2026.clef-initiative.eu/calls/papers/

Best Paper Award

One outstanding paper will receive the CLEF 2026 Best Paper Award, sponsored by Springer LNCS, including a certificate and a €500 prize.

We look forward to your submissions and to welcoming you to Jena for CLEF 2026.

Best regards,
Philipp Schaer, Eva Zangerle
Program Committee Chairs CLEF 2026

ACM ICMI 2026 Call for Late-Breaking Results (LBR)
by Gualtiero Volpe 13 May '26

ICMI 2026 CALL FOR LATE-BREAKING RESULTS (LBR)
===============================================
5-9 October 2026, Napoli - Italy
https://icmi.acm.org/2026/
===============================================

Dear colleagues,

Please find below the Call for Papers for the Late-Breaking Results (LBR) track of the 28th ACM International Conference on Multimodal Interaction (ICMI 2026).

Based on the success of the Late-Breaking Results (LBR) track, ICMI 2026 will continue soliciting submissions for this special venue. The goal of this venue is to provide a way for researchers to share emerging results at the conference. Accepted submissions will be presented in a poster session at the conference, and the extended abstract will be published in the Adjunct Proceedings (Companion Volume) of the main ICMI Proceedings. Like similar venues at other conferences, the LBR venue is intended to allow sharing of ideas, getting formative feedback on early-stage work, and furthering collaborations among colleagues.

* Online Submission

https://new.precisionconference.com/submissions/icmi26a

* Highlights

- Submission deadline: June 21st, 2026
- Notifications: July 15th, 2026
- Camera-ready deadline: August 2nd, 2026
- Conference Dates: October 6–8, 2026
- Submission format: Anonymized short paper (four-page paper in a double-column format, not including references), following the submission guidelines
- Selection process: Peer-Reviewed
- Presentation format: Participation in the conference poster session
- Proceedings: Included in the Adjunct Proceedings (Companion Volume) and ACM Digital Library
- LBR Co-chairs: Daniel Riccio and Hung-Hsuan Huang

* What are Late-Breaking Results?

Late-Breaking Results (LBR) submissions represent work such as preliminary results, provoking and current topics, novel experiences or interactions that may not have been fully validated yet, cutting-edge or emerging work that is still in exploratory stages, smaller-scale studies, or, in general, work that has not yet reached a level of maturity expected for the full-length main track papers. However, LBR papers are still expected to bring a contribution to the ICMI community, commensurate with the preliminary, short, and quasi-informal nature of this track.

* Why submit to the Late-Breaking Results track at ICMI?

Accepted LBR papers will be presented as posters during the conference. This provides an opportunity for researchers to receive feedback on early-stage work, explore potential collaborations, and otherwise engage in exciting, thought-provoking discussions about their work in an informal setting that is significantly less constrained than a paper presentation. The LBR track also offers those new to the ICMI community a chance to share their preliminary research as they become familiar with this field.

Late-Breaking Results papers appear in the Adjunct Proceedings (Companion Volume) of the ICMI Proceedings. Copyright is retained by the authors, and the material from these papers can be used as the basis for future publications as long as there are significant revisions, as per the ACM and ACM SIGCHI policies.

LBR papers will be published as ACM extended abstracts in the Adjunct Proceedings. Under ACM Open, extended abstract article types are not subject to Article Processing Charges (APCs).

* Submission Guidelines

Extended Abstract

An anonymized short paper of four pages (excluding references) in the double-column ACM conference format, prepared with LaTeX or Word. Papers should follow the same guidelines as papers published in the proceedings of the ACM ICMI conference. The paper should be submitted in PDF format and through the ICMI submission system in the "Late-Breaking Results" track. Due to the tight publication timeline, it is recommended that authors submit a very nearly finalized paper that is as close to camera-ready as possible, as there will be a very short timeframe for preparing the final camera-ready version, and no deadline extensions can be granted.

Anonymization

Authors are instructed not to include author information in their submission. In order to help reviewers judge the situation of the LBR relative to prior work, authors should not remove or anonymize references to their own prior work. Instead, authors should refer to their own prior work in the third person during submission. After acceptance, such references can be changed to first person if desired.

* Review Process

LBRs will be evaluated to the extent that they are presenting work still in progress, rather than complete work, which is under-described in order to fit into the LBR format. The LBR track will undergo an external peer review process. Submissions will be evaluated on a number of factors, including (1) the relevance of the work to ICMI, (2) the quality of the submission, and (3) the degree to which it fits the LBR track, for example, in-progress results. More particularly, the quality of the submission will be evaluated based on the potential contributions of the research to the field of multimodal interfaces and its impact on the field and beyond. Authors should clearly justify how the proposed ideas can bring measurable breakthroughs compared to the state of the art.

* Attendance

Similar rules for registration and attendance will be applied for authors of LBR papers as for regular papers. Further information will be made available later on the conference website.

* Website

For updates, please visit: https://icmi.acm.org/2026/late-breaking-results/

* Contact

For further questions, contact the LBR co-chairs, Daniel Riccio and Hung-Hsuan Huang, at: icmi2026-latebreaking-chairs(a)acm.org

We would be grateful if you could circulate this call among colleagues and interested researchers.

Best regards,
ICMI 2026 LBR Chairs
Daniel Riccio and Hung-Hsuan Huang
