- Corpora - ELRA lists

March 2026 Newsletter - LDC
by Penn LDC 16 Mar '26

In this newsletter:

LDC data and commercial technology development

New publications:
* Ancient Chinese WordNet<https://catalog.ldc.upenn.edu/LDC2026L03>
* CALLHOME Spanish Second Edition<https://catalog.ldc.upenn.edu/LDC2026S04>
* CALLHOME Spanish Lexicon Second Edition<https://catalog.ldc.upenn.edu/LDC2026L02>
________________________________

LDC data and commercial technology development

For-profit organizations are reminded that an LDC membership is a pre-requisite for obtaining a commercial license to almost all LDC databases. Non-member organizations, including non-member for-profit organizations, cannot use LDC data to develop or test products for commercialization, nor can they use LDC data in any commercial product or for any commercial purpose. LDC data users should consult corpus-specific license agreements for limitations on the use of certain corpora. Visit the Licensing<https://www.ldc.upenn.edu/data-management/using/licensing> page for further information.
________________________________

New publications:

* Ancient Chinese WordNet<https://catalog.ldc.upenn.edu/LDC2026L03> was developed by Nanjing Normal University<https://www.njnu.edu.cn/> and contains lexical and semantic information for Ancient Chinese vocabulary from the Pre-Qin period (before 221 BCE). The WordNet comprises 38,781 word forms and 55,100 senses, each manually linked to a corresponding synset in Princeton WordNet 1.6<https://wordnet.princeton.edu/> and covering 22 noun categories, 15 verb categories, and additional adjective and adverb categories. The Ancient Chinese WordNet project began in 2012 with the goal of creating a structured lexical database to support linguistic research and natural language processing applications involving historical Chinese language materials. 2026 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
* CALLHOME Spanish Second Edition<https://catalog.ldc.upenn.edu/LDC2026S04> was developed by LDC and contains 38 hours of speech from 120 unscripted telephone conversations between native Spanish speakers. This publication is a re-release of the original CALLHOME Spanish collection, combining CALLHOME Spanish Speech (LDC96S35)<https://catalog.ldc.upenn.edu/LDC96S35> and CALLHOME Spanish Transcripts (LDC96T17)<https://catalog.ldc.upenn.edu/LDC96T17>, with additional transcription and updated directory structure, file formats, and documentation. This corpus contains the 120 calls from CALLHOME Spanish Speech which represented training and development data and a subset of evaluation data. Participants spoke on topics of their choice in a single telephone call lasting up to 30 minutes. Calls were manually audited for language, recording quality, channel characteristics, dialect, and region. For this second edition, all audio was converted from SPHERE files to FLAC format, and the original training/development/test partitioning was removed. This release also features revised transcripts conforming to updated LDC transcription guidelines that addressed normalization of annotation formats, standardization of speaker-produced and background noises, application of foreign-language marking, whitespace cleanup, and corrections and consistency fixes. The CALLHOME series consists of telephone conversations and transcripts developed by LDC and Rutgers, The State University of New Jersey, in support of research in speaker identification, language identification, and related technologies. Languages in the series include American English, Egyptian Arabic, German, Japanese, Mandarin Chinese, and Spanish. 2026 members can access this corpus through their LDC accounts. Non-members may license this data for a fee. 
* CALLHOME Spanish Lexicon Second Edition<https://catalog.ldc.upenn.edu/LDC2026L02> was developed by LDC and contains 45,547 Spanish words with morphological, phonological, stress, and frequency information. This second edition updates file formats, directory structure, and documentation. The first edition is available as CALLHOME Spanish Lexicon (LDC96L16)<https://catalog.ldc.upenn.edu/LDC96L16>.

The words in the lexicon were derived from 80 transcripts representing unscripted telephone conversations between native Spanish speakers contained in CALLHOME Spanish Second Edition LDC2026S04 and from various Spanish news texts. The lexicon contains nine tab-separated information fields:

(1) headword: orthographic form
(2) morph: morphological analysis of the headword
(3) pron: pronunciation of the headword
(4) stress: primary stress information of the word
(5) callh freq: frequency of the headword in CALLHOME transcripts
(6) madrid freq: frequency of the headword in Madrid Radio transcripts
(7) ap freq: frequency of the headword in Associated Press newswire
(8) reut freq: frequency of the headword in Reuters newswire
(9) norte freq: frequency of the headword in El Norte newswire

This release also includes a pronunciation dictionary derived from the lexicon in CMUdict<https://stdlib.io/docs/api/latest/@stdlib/datasets/cmudict> format and the grapheme-to-phoneme (G2P) tools used to automatically generate pronunciations for the original lexicon.

2026 members can access this corpus through their LDC accounts provided they have submitted a completed copy of the special license agreement. Non-members may license this data for a fee.

To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
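For readers planning to work with the CALLHOME Spanish Lexicon, the nine tab-separated fields described above could be read along the following lines. This is only a minimal sketch, not LDC tooling: the names `LexEntry`, `parse_lexicon`, and `total_frequency` are hypothetical, and it assumes one entry per line, no header row, and integer values in the five frequency columns. The sample row is invented and does not reproduce the corpus's actual morphological or pronunciation notation.

```python
import io
from dataclasses import dataclass

EXPECTED_FIELDS = 9  # field count stated in the announcement


@dataclass
class LexEntry:
    headword: str      # (1) orthographic form
    morph: str         # (2) morphological analysis
    pron: str          # (3) pronunciation
    stress: str        # (4) primary stress information
    callh_freq: int    # (5) CALLHOME transcripts
    madrid_freq: int   # (6) Madrid Radio transcripts
    ap_freq: int       # (7) Associated Press newswire
    reut_freq: int     # (8) Reuters newswire
    norte_freq: int    # (9) El Norte newswire


def parse_lexicon(stream):
    """Parse tab-separated lexicon lines from a text stream into LexEntry objects."""
    entries = []
    for line_no, line in enumerate(stream, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        cols = line.split("\t")
        if len(cols) != EXPECTED_FIELDS:
            raise ValueError(
                f"line {line_no}: expected {EXPECTED_FIELDS} fields, got {len(cols)}"
            )
        # Assumption: the five frequency columns hold plain integers.
        freqs = [int(c) for c in cols[4:]]
        entries.append(LexEntry(*cols[:4], *freqs))
    return entries


def total_frequency(entry):
    """Sum the five per-source frequency counts of one entry."""
    return (entry.callh_freq + entry.madrid_freq + entry.ap_freq
            + entry.reut_freq + entry.norte_freq)


# Invented sample row with placeholder values (not taken from the corpus).
SAMPLE = "casa\tMORPH\tPRON\tSTRESS\t12\t3\t5\t4\t2\n"

if __name__ == "__main__":
    for e in parse_lexicon(io.StringIO(SAMPLE)):
        print(e.headword, total_frequency(e))
```

For a real file, the stream would come from `open(path, encoding=...)`; the correct encoding and any deviations from the layout assumed here should be checked against the documentation shipped with the release.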
Membership Coordinator Linguistic Data Consortium<ldc.upenn.edu> University of Pennsylvania T: +1-215-573-1275 E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu> M: 3600 Market St. Suite 810 Philadelphia, PA 19104

CFP: 61st Linguistics Colloquium, Sept. 9 to 12, Pavia, Italy
by Reinhard Rapp 16 Mar '26

Call for Papers

61st Linguistics Colloquium
LingColl 2026
Università di Pavia, Italy, September 9 to 12, 2026
https://lingcoll26.unipv.it

Scope: all fields of Linguistics
Conference languages: English and German
Deadline for abstract submission: May 3, 2026
Special theme: Rethinking Language Comparison: Contrastive Linguistics between Corpora and AI

The 61st Linguistics Colloquium (www.lingcoll.de) will take place at the University of Pavia, Italy, from September 9 to 12, 2026. Founded in Hamburg in 1966, the Linguistics Colloquium has since been hosted in almost 20 countries. It provides a platform for the study of language and languages in all areas of linguistics and warmly welcomes researchers from diverse theoretical backgrounds. The colloquium is distinguished by its cooperative and open culture of discussion: innovative ideas meet critical reflection, and the exchange of research results is actively promoted. Its aim is to create an inspiring space where new approaches, methods, and perspectives can be jointly discussed and developed.

In addition, contrastive linguistics will be a focal point at this year's colloquium. Since its beginnings, contrastive linguistics has undergone significant development, expanding both its methodological and conceptual scope. Today, language comparison is no longer limited to language pairs but can involve multiple languages. It integrates geographical and sociolinguistic dimensions, extends its focus to semantic, pragmatic, textual, and discourse-linguistic levels, and also takes into account historical stages and diachronic comparisons within a single language. Moreover, contrastive linguistics has increasingly established itself as a theoretically reflective discipline: analysing a language in the light of another allows for the identification of linguistic phenomena that might otherwise remain unnoticed or inadequately explained.

Recent advances have been particularly driven by the use of large corpora and digital methods. AI-supported analytical methods are expected to provide further developments in the near future. The planned conference will focus on current theoretical, methodological, and applied approaches in contrastive linguistics, with a particular emphasis on German in comparison with other languages. Its aim is to bring together research that empirically investigates systematic differences and similarities across languages and highlights their relevance for applied contexts.

Thematic Focus (including, but not limited to):

Contrastive Analyses in the Areas of:
- Phonetics and phonology
- Morphology and syntax
- Semantics and lexicon
- Phraseology and pragmatics
- Text and discourse

Corpus-Based, Corpus-Driven, and AI-Supported Approaches:
- Contrastive corpus linguistics
- Comparative corpus annotation
- Corpus-based analyses of phraseological patterns, collocations, and constructions
- Quantitative and qualitative methods
- Use of AI, NLP, and LLMs in contrastive research

Methodological and Theoretical Issues:
- Comparability of data and corpora
- Modelling linguistic differences at the word, phrase, and discourse levels
- Interfaces between linguistics, corpus linguistics, computational linguistics, and AI

Applied Perspectives, Including:
- German as a foreign and second language (DaF/DaZ)
- Specialized and professional language
- Phraseodidactics and discourse-oriented language teaching
- Lexicography, phraseography, and terminology work
- Translation studies, interpreting, and contrastive discourse analysis
- Language teaching and language comparison in the classroom

We welcome contributions that are theoretically informed as well as empirically oriented, including work that bridges basic research and application. Submissions presenting innovative methods or new resources are particularly encouraged.

In addition, in keeping with the tradition of the Linguistics Colloquium, presentations from all other areas of linguistics may be proposed.

Submission
Abstracts (approx. 300 words) can be submitted until May 3, 2026: lingcoll2026(a)gmail.com
Notification of acceptance will be sent by May 15, 2026.

Conference Languages
The conference languages are German and English.

Registration
Registration deadline: 30 June 2026
lingcoll2026(a)gmail.com

Registration fee
Participants with a regular income: €200.00
Participants without a regular income (PhD candidates, scholarship holders): €100.00

Second CALL: Multimodal sexism identification with sensor data - EXIST 2026
by JORGE AMANDO CARRILLO DE ALBORNOZ CUADRADO 16 Mar '26

Please consider contributing and/or forwarding to appropriate colleagues and groups.

****We apologize for the multiple copies of this e-mail****

----------------------------------------------------------------------------------------------------
Call for Participation
----------------------------------------------------------------------------------------------------

Second Call for Participation: EXIST 2026: Multimodal sexism identification with sensor data

Website: http://nlp.uned.es/exist2026/

EXIST is a series of scientific events and shared tasks on sexism identification in social networks. EXIST aims to foster the automatic detection of sexism in a broad sense, from explicit misogyny to other subtle expressions that involve implicit sexist behaviours (EXIST 2021, EXIST 2022, EXIST 2023, EXIST 2024, EXIST 2025). The sixth edition of the EXIST shared task will be held as a Lab in CLEF 2026, on September 21-24, 2026, at Friedrich-Schiller-Universität Jena, Germany.

In EXIST 2026, we take a significant step forward by integrating the principles of Human-Centered AI (HCAI) into the development of automatic tools for detecting sexism online. Recognizing that no single interpretation can fully capture the diversity of human perception, we go beyond traditional annotation paradigms by combining Learning With Disagreement (LeWiDi) with sensor-based data (EEG, heart rate, and eye-tracking signals) collected from subjects exposed to potentially sexist content, with the aim of capturing unconscious responses to sexism. This dual approach represents a breakthrough in dataset creation for sensitive and value-laden tasks: for the first time, datasets will include not only divergent judgments from annotators, but also the embodied traces of how this content affects them.

This richer, multidimensional annotation process will enable the development of more inclusive, equitable, and socially aware AI systems for detecting sexism in complex multimedia formats like memes and short videos, where ambiguity and affect play a critical role.

As in the 2023, 2024 and 2025 editions, this edition will also embrace the Learning With Disagreement (LeWiDi) paradigm for both the development of the dataset and the evaluation of the systems. The LeWiDi paradigm doesn't rely on a single "correct" label for each example. Instead, the model is trained to handle and learn from conflicting or diverse annotations. This enables the system to consider various annotators' perspectives, biases, or interpretations, resulting in a fairer learning process.

Building upon the EXIST 2025 dataset, this edition focuses exclusively on multimedia formats, comprising six experimental subtasks applied to images (memes) and videos (TikToks). Participants are challenged to address three main objectives: sexism identification (x.1), source intention detection (x.2), and sexism categorization (x.3) (the numbering of subtasks is consistent with EXIST 2025).

Participants will be asked to classify memes and videos (in English and Spanish) according to the following tasks:

TASK 2: Sexism detection in Memes

Subtask 2.1: Sexism Identification in Memes: this is a binary classification subtask consisting of determining whether a meme describes a sexist situation or criticizes a sexist behaviour, and classifying it into two categories: YES and NO.

Subtask 2.2: Source Intention in Memes: this subtask aims to categorize the meme according to the intention of the author. Due to the characteristics of memes, systems should only classify memes into the DIRECT or JUDGEMENTAL categories.

Subtask 2.3: Sexism Categorization in Memes: once a message has been classified as sexist, the third subtask aims to categorize the message in different types of sexism (according to a categorization proposed by experts that takes into account the different facets of women that are undermined). In particular, each sexist meme must be categorized in one or more of the following categories: (i) IDEOLOGICAL AND INEQUALITY, (ii) STEREOTYPING AND DOMINANCE, (iii) OBJECTIFICATION, (iv) SEXUAL VIOLENCE and (v) MISOGYNY AND NON-SEXUAL VIOLENCE.

TASK 3: Sexism detection in Videos

Subtask 3.1: Sexism Identification in Videos: this is a binary classification task as in Subtask 2.1.

Subtask 3.2: Source Intention in Videos: this subtask replicates Subtask 2.2 for memes, but takes videos as its source.

Subtask 3.3: This subtask aims to classify sexist videos according to the categorization provided for Subtask 2.3: (i) IDEOLOGICAL AND INEQUALITY, (ii) STEREOTYPING AND DOMINANCE, (iii) OBJECTIFICATION, (iv) SEXUAL VIOLENCE and (v) MISOGYNY AND NON-SEXUAL VIOLENCE.

Although we recommend participating in all subtasks and in both languages, participants are allowed to participate in just one of them (e.g. Subtask 2.1) and in one language (e.g. English).

During the training phase, the task organizers will provide the participants with the manually-annotated EXIST 2026 dataset. For the evaluation of the systems, the unlabeled test data will be released.

We encourage participation from both academic institutions and industrial organizations. We invite participants to register for the lab at the CLEF 2026 Labs Registration site (https://clef-labs-registration.dipintra.it/). You will receive information about how to join the Discord Group for the EXIST 2026 shared task.

Important Dates:
* 17 November 2025: Registration opens. DONE
* 26 February 2026: Training set available. DONE
* 9 April 2026: Test set available.
* 23 April 2026: Registration closes.
* 7 May 2026: Runs submission due to organizers.
* 28 May 2026: Results notification to participants.
* 4 June 2026: Submission of Working Notes by participants.
* 30 June 2026: Notification of acceptance (peer reviews).
* 6 July 2026: Camera-ready participant papers due to organizers.
* 21-24 September 2026: EXIST 2026 at CLEF Conference.

** Note: All deadlines are 11:59PM UTC-12:00 ("anywhere on Earth") **

Organizers:
Laura Plaza, Universidad Nacional de Educación a Distancia (UNED)
Jorge Carrillo-de-Albornoz, Universidad Nacional de Educación a Distancia (UNED)
Iván Arcos, Universitat Politècnica de València (UPV)
Maria Aloy Mayo, Universitat Politècnica de València (UPV)
Paolo Rosso, Universitat Politècnica de València (UPV)
Damiano Spina, Royal Melbourne Institute of Technology (RMIT)

Contact:
Contact the organizers by writing to: jcalbornoz(a)lsi.uned.es
Website: http://nlp.uned.es/exist2026/

Final CFP: the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026) @ ACL 2026
by Ali Hurriyetoglu 16 Mar '26

The 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026) (formerly CASE) @ ACL 2026

Also, this year, the EEUCA workshop (previously CASE) continues the tradition of the eight previous editions of our workshop on challenges and applications of event extraction.

Website: https://bit.ly/EEUCA2026
Submission page: https://openreview.net/group?id=aclweb.org/ACL/2026/Workshop/EEUCA

Paper submission deadline: March 29, 2026 (Updated!)
Pre-reviewed ARR commitment deadline: April 15, 2026
Notification of acceptance: April 28, 2026
Camera-ready paper due: May 12, 2026
Pre-recorded video due (hard deadline): June 4, 2026

Shared tasks and shared task papers:
Start of the Competition: Dec 10, 2025
Eval Phase Start: Dec 10, 2025
Test Phase Start: Jan 15, 2026
Test Phase End: March 15, 2026
Paper Submission Deadline: March 28, 2026
Notification of acceptance: April 28, 2026
Camera-ready paper due: May 12, 2026

We invite work on all aspects of automated coding and analysis of events from mono- or multi-lingual text sources. This includes (but is not limited to) the following topics:

1) Extracting events and their arguments in and beyond a sentence or document, event coreference resolution.
2) New datasets, training data collection and annotation for event information.
3) Event-event relations, e.g., subevents, main events, spatiotemporal relations, causal relations.
4) Event dataset evaluation in light of reliability and validity metrics.
5) Defining, populating, and facilitating event schemas and ontologies.
6) Automated tools and pipelines for event collection related tasks.
7) Lexical, syntactic, semantic, discursive, and pragmatic aspects of event manifestation.
8) Methodologies for development, evaluation, and analysis of event datasets.
9) Applications of event databases, e.g. early warning, conflict prediction, and policymaking.
10) Estimating what is missing in event datasets using internal and external information.
11) Detection of new event types, e.g. creative protests, cyber activism, COVID-19 related, terrorism, food safety, food security, climate change, extreme weather events, disasters.
12) Release of new event datasets.
13) Bias and fairness of the sources and event datasets.
14) Ethics, misinformation, privacy, and fairness concerns pertaining to event datasets.
15) Copyright issues on event dataset creation, dissemination, and sharing.
16) Cross-lingual, multilingual, and multimodal aspects in event analysis.
17) Exploiting LLMs in Event Extraction.
18) Generative AI and event reports: detecting AI-generated news, exploiting generative AI for creating event corpora, etc.

Shared Task 1: Multimodal Identification of Vaccine Critical Content on Social Media

This shared task focuses on detecting vaccine-critical stance in multimodal social media memes. Using the VaxMeme dataset of over 10,000 annotated memes, participants will develop models that jointly leverage visual and textual signals to classify a meme's stance as pro-vaccine, vaccine-critical, or neutral. The task encourages research on cross-modal understanding, sarcasm, implicit messaging, and misinformation dynamics in public health discourse. External data and transfer learning are permitted, and submissions will be evaluated using macro-F1. All system description papers will be published in the ACL Anthology.

Learn More: https://github.com/therealthapa/eeuca-vaccine

Shared Task 2: Understanding Toxic Behavioral Intent in Gaming Chat Logs for Healthy Online Interaction

This shared task tackles intent-level toxicity detection in online gaming communities using the GameTox dataset of 53,000 annotated chat utterances from World of Tanks. Participants will develop models that classify a player's message into six fine-grained intent categories, including hate, threats, insults, extremism, and non-toxic communication. The challenge highlights contextual nuance, gaming slang, implicit aggression, and varied severity levels of toxicity. External datasets are allowed, and submissions are evaluated using macro-F1. All system description papers will be published in the ACL Anthology.

Learn More: https://github.com/therealthapa/eeuca-toxicity

Keep an eye on the workshop page that is being updated: https://bit.ly/EEUCA2026 and contact us for any inquiries (submission, collaboration, contribution, or just saying Hi!).

EEUCA Organization Committee
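Both shared tasks in this call are scored with macro-F1. As a rough illustration of what that metric computes, and not the organizers' official scorer, the sketch below takes the unweighted mean of per-class F1 scores. The function name `macro_f1` is hypothetical, and the sketch assumes the average is taken over every label that appears in either the gold or the predicted list; an official evaluation script may instead fix the label set in advance.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but wrong
            fn[g] += 1   # g was missed
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Toy example using the three stance labels named for Shared Task 1.
    gold = ["pro-vaccine", "pro-vaccine", "vaccine-critical", "neutral"]
    pred = ["pro-vaccine", "vaccine-critical", "vaccine-critical", "neutral"]
    print(round(macro_f1(gold, pred), 4))
```

Because every class contributes equally to the mean, a system that ignores a rare class is penalized as heavily as one that ignores a frequent class, which is presumably why the metric suits the fine-grained, likely imbalanced label sets described above.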

CfP DHOW: Workshop Diffusion of Harmful Content on Online Web Workshop - @ ACMMM 2025
by Thomas Mandl 15 Mar '26

The 2nd Workshop on DHOW: Diffusion of Harmful Content on Online Web Workshop

The workshop will be conducted in a *hybrid* format to ensure maximum participation, accommodating attendees both *online* and in person.

Submission deadline: *July 11 2025 AOE*
*Workshop site*: https://dhow-workshop.github.io/2025/
*Co-located with ACMMM 2025* https://acmmm2025.org/ <https://lrec-coling-2024.org/>
Dublin, Ireland, 27-31 October 2025

*Important Dates*
Submission deadline: extended to *July 11, 2025*
Notification of acceptance: August 01, 2025
Camera-ready papers due: August 11, 2025
Workshop date: October 27/28, 2025

*Workshop Description*
With the advancement of digital technologies and gadgets, online content is easily accessible. At the same time, harmful content also spreads. Different kinds of harmful content are available on different platforms in multiple languages. The topic of harmful content is broad and covers multiple research directions, but from the user's perspective, they are affected by all of them. Often, each type is studied individually, such as misinformation or hate speech. Research has typically been done on one platform, in one language, on a particular issue. This leads harmful content spreaders to switch platforms and languages to reach their user base. Harmful content is not limited to social media but also appears in news media. Spreaders share harmful content in posts, news articles, comments, and hyperlinks. So, there is a need to study harmful content by combining cross-platform, cross-language, and multimodal data and topics. We will bring the research on harmful content under one umbrella so that research on different topics (hate speech, misinformation, disinformation, self-harm, offensive content, etc.) can bring some novel methods and recommendations for users, leveraging text analysis with image, audio, and video recognition to detect harmful content in diverse formats. The workshop will cover the ongoing issue of war or elections in 2025.

We believe this workshop will provide a unique opportunity for researchers and practitioners to exchange ideas, share the latest developments, and collaborate on addressing the challenges associated with harmful content spread across the Web. We expect that the workshop will generate insights and discussions that will help advance the field of societal artificial intelligence (AI) for the development of a safer internet. In addition to attracting high-quality research contributions to the workshop, one of the aims of the workshop is to mobilise the researchers working in related areas to form a community.

*Submissions Topics*
•Studying different types of harmful content
•Computational fact-checking & Misinformation Detection
•Role of Generative AI in Mitigating Harmful Content
•Harassment, Bullying, and Hate Speech Detection
•Explainable AI for Harmful Content Analysis
•Multimodal and Multilingual Harmful Content Detection such as fake news, spam, and troll detection
•Deepfake and Synthetic Media
•Ethical & Societal Implications of AI in Content Moderation
•Both Qualitative and Quantitative study on harmful content
•Psychological effects of harmful content like mental health
•Approaches for data collection or data annotation using multimodal large models on harmful content
•User study on the effects of harmful content on human beings

*Submissions*
- Submission Instructions: https://dhow-workshop.github.io/2025/#call
- Submission Link: https://openreview.net/group?id=acmmm.org/ACMMM/2025/Workshop/DHOW

*Workshop organizers*
•Thomas Mandl (University of Hildesheim, Germany)
•Haiming Liu (University of Southampton, United Kingdom)
•Gautam Kishore Shahi (University of Duisburg-Essen, Germany)
•Amit Kumar Jaiswal (University of Surrey, United Kingdom)
•Durgesh Nandini (University of Bayreuth, Germany)

DHOW 2025

Rodolfo Delmonte
by Rocco Tripodi 15 Mar '26

Dear Colleagues, It is with great sadness that we announce the death of Rodolfo Delmonte, Retired Professor at Ca’ Foscari University of Venice. He passed away suddenly on Thursday, March 12, at the age of 79. He graduated in 1973 at the University of Venice in Foreign Languages and Literatures and immediately after his graduation won a European-Australian Award. It allowed him to start his research career spending one year at the University of Melbourne working on one of the first computational approaches to quantitative analysis of literary texts. He got his Ph.D. in 1977 and published part of his thesis in a book with the title Piercing into the Psyche: the Poetry of Francis Webb. At the same time in 1976 he started working at the University of Venice as Assistant Professor, teaching English in scientific curricula. From 1978 to 1986 he collaborated with the Department of Electronics of the University of Padua on a pioneering speech synthesis application to encode Italian linguistic information and use it for the generation of speech from text with unlimited vocabulary. He was at the origin of many successful tools for automatic processing and analysis of Italian and English covering all layers of language. He dedicated his entire life to computational linguistics, continuing his work after retirement and keeping fruitful relations with colleagues and friends. Above all, those who had the opportunity to meet him, know how kind and warm-hearted he was. His friendly attitude inspired many of us in our careers. This will always be remembered with deep gratitude. We will gather on March 18, 2026, at 11:00 in the Sala Laica of the San Michele in Isola cemetery (Venice) to say our final farewell to Rodolfo. Sara Tonelli and Rocco Tripodi

2nd Call for Papers CARI'2026 - Benin
by Mathieu ROCHE 13 Mar '26

=========================================== CARI'2026 – 2nd Call For Papers The 18th African Conference on Research in Computer Science and Applied Mathematics (CARI'2026) October 21-24, 2026 University of Abomey-Calavi, Cotonou – Benin https://cari-conf.bj/ ============================================ OVERVIEW ============================================ CARI, the African Conference on Research in Computer Science and Applied Mathematics, is the flagship event of ASDS - African Society in Digital Science (https://asds.africa/). It brings together researchers and practitioners from Africa and beyond to present and discuss advances in computer science and applied mathematics, aiming to strengthen collaboration, international cooperation, and the visibility of African research while fosteringinnovation to address the continent's challenges. CARI'2026 will be held on October 21-24, 2026. The program will feature keynote talks, technical sessions, poster presentations, and panel discussions, preceded by workshops and tutorials on October 22, 2026. ============================================ SCOPE AND TOPICS OF INTEREST ============================================ CARI 2026 invites submissions in English of full papers presenting original research results and short papers reporting work in progress or position papers. The conference is structured around two main tracks: Computer Science and Applied Mathematics. 
Topics of interest include, but are not limited to: Track: Computer Science - Algorithms and optimisation - Artificial intelligence, machine learning, and data science - Distributed systems and cloud computing - Networking and the Internet of Things - Security, privacy, and dependable systems - Digital sovereignty and computing for Africa Track: Applied Mathematics - Analysis of dynamical Systems - Partial differential equations and their applications - High-performance scientific computing - Mathematical foundations of artificial intelligence - Mathematical Modelling - Stochastic Systems CARI'2026 especially welcomes applied research addressing African contexts and challenges, with application domains including agriculture, healthcare, education, environmental systems, transportation, and logistics. ============================================ IMPORTANT DATES (All deadlines are at 23:59 GMT) ============================================ - Abstract submission [optional]: 23 March, 2026 - Paper submission: 30 March, 2026 - Notification to authors: 22 June, 2026 - Camera-ready deadline: 6 July, 2026 ============================================ PAPER SUBMISSION AND PUBLICATION ============================================ CARI'2026 accepts submissions (in English) in three categories: - Full papers describing original research (up to 14 pages excluding references). - Work-in-progress papers on early results (up to 7 pages in length excluding references). - Position papers proposing novel or unconventional ideas - preferably supported by empirical data and measurements - that differ from prior published work (up to 7 pages excluding references). All submissions must be original, unpublished, and not under consideration elsewhere. All submissions will be reviewed based on relevance, originality, significance, and clarity. The use of AI systems to generate text (e.g. 
LLM) for inclusion in a CARI submission is allowed only for improving language and readability, and only if its role is properly documented in the paper (in the acknowledgements section). Papers should follow the Lecture Notes in Computer Science (LNCS) format (Springer) and be submitted via EasyChair (https://easychair.org/conferences/?conf=cari2026). CARI 2026 employs a single-blind review process, with authors' names included in submissions. As with CARI'2024, all accepted papers should be published in Springer's book series Communications in Computer and Information Science (CCIS) or Trends in Mathematics and made available through the SpringerLink Digital Library (indexed in Scopus, ACM Digital Library, DBLP, and Google Scholar). Selected papers from CARI'2026 will be invited to submit extended versions for possible publication in ARIMA. ============================================ FOR MORE INFORMATION ============================================ Web: https://cari-conf.bj/ E-mail: Cari2026(a)gmail.com ============================================

[CfP] 14th Workshop on New Frontiers in Mining Complex Patterns (NFMCP 2026) - ECMLPKDD 2026
by angelica.liguori＠icar.cnr.it 13 Mar '26

14th Workshop on New Frontiers in Mining Complex Patterns (NFMCP 2026) https://nfmcpworkshop.github.io * Submission deadline: June 14, 2026 * Notification of acceptance: July 16, 2026 * Workshop: September 7-11, 2026 The 14th edition of the Workshop New Frontiers in Mining Complex Patterns (NFMCP) will be hosted by the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) in Naples, Italy, from September 7 to 11, 2026. The workshop aims to bring together data mining and machine learning researchers and practitioners interested in analyzing complex data, such as multi-table data, semi-structured data, web data, time series and sequences, graphs, and trees. The NFMCP workshop is focused on exploring and sharing new techniques for analyzing complex data sources, which has become an increasingly common challenge with the rise of automatic systems collecting vast amounts of data with complex structures. The workshop's primary goal is to promote collaboration among researchers and practitioners, share knowledge and ideas, and encourage the exploration of innovative solutions for analyzing complex data sources. The NFMCP workshop provides a platform for researchers to present their latest research findings, discuss their work with others in the field, and foster interdisciplinary research collaborations. It is an exciting opportunity for data mining and machine learning enthusiasts to share their knowledge, explore new techniques, and collaborate with others in the field. The workshop's focus on complex and massive data sources makes it highly relevant to the machine-learning community, and its goals of promoting collaboration and innovation in the field make it a valuable event for all those interested in data mining and machine learning. 
------------------------------------------------------------------------------------------ ### Call for papers We welcome submissions focusing on recent advances and the latest developments in analysing complex and massive data sources such as blogs, event or log data, medical data, spatio-temporal data, social networks, mobility data, sensor data and streams. In addition, we encourage submissions from statistics, machine learning and big data analytics, which present techniques that take advantage of the informative richness of complex, massive data for efficiently and effectively identifying new patterns. ### Workshop topics * Mining Big Data * Mining Biological Data * Social Media Analytics * Ontology and Metadata * Mining Multimedia Data * Sustainable Data Mining * LLMs for Pattern Mining * Mining Multi-relational Data * Mining Networks and Graphs * Mining Spatio-Temporal Data * Data Mining for Cybersecurity * Privacy-preserving Data Mining * Mining Dynamic and Evolving Data * Semantic Web and Knowledge Databases * Mining Environmental and Scientific Data * Mining Heterogeneous and Ubiquitous Data * Mining Semi-structured and Unstructured Data * Mining Stream, Time series and Sequence Data * Foundation on Pattern Mining, Pattern Usage and Pattern Understanding ### Paper submissions NFMCP welcomes diverse contributions, ranging from research papers showcasing well-established findings or ongoing work to innovative papers introducing novel ideas or exploratory research. Additionally, the conference embraces papers that share industry experiences and case studies. However, it is essential to highlight that papers based on work that has been recently published elsewhere will not be considered for inclusion in the conference proceedings. Submissions are accepted in two formats: * Extended abstract having at most 8 pages including references. 
Extended abstracts are intended to stimulate discussion and collaboration by either reviewing previously published research or outlining new emerging ideas. * Regular research papers having at most 12 pages including references. To be published in the proceedings, research papers must be original, not published previously, and not submitted concurrently elsewhere. The accepted contributions will be included in joint post-workshop proceedings published by Springer Lecture Notes in Computer Science, in 1-2 volumes, organised by focused scope and possibly indexed by WOS. Paper authors will be able to opt in or opt out. We suggest that workshop papers be prepared and submitted in LNCS format.

Call for contributions: PsychLing-101 (open repository of psycholinguistic datasets)
by psychling101＠mpib-berlin.mpg.de 13 Mar '26

Dear colleagues, We are a group of researchers from the Max Planck Institute for Human Development, Humboldt University of Berlin, and the University of Milano-Bicocca, collaborating on a research initiative to build shared infrastructure for psycholinguistic research. We would like to invite contributions to PsychLing-101, a community-driven repository that collects psycholinguistic datasets in a unified format for both traditional analyses and evaluation of large language models. Project overview PsychLing-101 aims to build a curated corpus of psycholinguistic datasets stored in a standardized structure. The goal is to make datasets easier to find, reuse, and compare across studies, and to support more reproducible research in psycholinguistics and cognitive science. To build a broad and useful resource, we welcome contributions from researchers across the field. We invite datasets of many types, from norming studies and behavioral experiments to eye-tracking and neuroimaging data, including both large-scale collections and smaller curated studies. Repository and documentation: https://github.com/Data-X01/PsychLing-101 Ways to contribute Researchers can contribute in several ways: • Submit a dataset following the instructions in the repository README: https://github.com/Data-X01/PsychLing-101/blob/main/README.md • Curate an existing study from the list in the CONTRIBUTING file: https://github.com/Data-X01/PsychLing-101/blob/main/CONTRIBUTING.md • Suggest a dataset or study by opening an issue in the repository Outcome for contributors All dataset contributors will be included as co-authors on an overview paper describing the repository and the standardized corpus. Questions For questions or assistance, please open a GitHub issue or contact: psychling101(a)gmail.com If you think this initiative may be of interest to colleagues or students, we would greatly appreciate you sharing it within your networks. 
Best wishes, Taisiia Tikhomirova & Dirk Wulff (on behalf of the PsychLing-101 team)

2nd CFP: 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)
by Ekaterina Kochmar 13 Mar '26

Second Call for Papers The 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026) San Diego, California, United States and online Thursday, July 2 and Friday, July 3, 2026 (co-located with ACL 2026) https://sig-edu.org/bea/current Submission: https://softconf.com/acl2026/bea2026/ Submission Deadline: Monday, March 23, 2026, 11:59pm UTC-12 WORKSHOP DESCRIPTION The BEA Workshop is a leading venue for NLP innovation in the context of educational applications. It is one of the largest one-day workshops in the ACL community with over 100 registered attendees in the past several years. The growing interest in educational applications and a diverse community of researchers involved resulted in the creation of the Special Interest Group in Educational Applications<https://sig-edu.org/members> (SIGEDU) in 2017, which currently has over 400 members. The 21st BEA will be a 2-day workshop, with one in-person workshop day and one virtual workshop day. The workshop will feature oral presentation sessions and large poster sessions to facilitate the presentation of a wide array of original research. Moreover, there will be a panel discussion on “Transitioning from Academia to the EdTech Industry<https://sig-edu.org/news/bea21-call-for-panelists/>”, a half-day tutorial on Theory of Mind and its Applications in Educational Contexts, and two shared tasks on Vocabulary Difficulty Prediction for English Learners<https://www.britishcouncil.org/data-science-and-insights/bea2026st> and on Rubric-based Short Answer Scoring for German<https://edutec.science/bea-2026-shared-task/> comprising an oral overview presentation by the shared task organizers and several poster presentations by the shared task participants. The workshop will accept submissions of both full papers and short papers, eligible for either oral or poster presentation. 
We solicit papers that incorporate NLP methods, including, but not limited to: * use of generative AI in education and its impact; * automated scoring of open-ended textual and spoken responses; * automated scoring/evaluation for written student responses (across multiple genres); * game-based instruction and assessment; * educational data mining; * intelligent tutoring; * collaborative learning environments; * peer review; * grammatical error detection and correction; * learner cognition; * spoken dialog; * multimodal applications; * annotation standards and schemas; * tools and applications for classroom teachers, learners and/or test developers; and * use of corpora in educational tools. PANEL We invite applications and nominations for panelists to participate in a panel discussion on “Transitioning from Academia to the EdTech Industry.” Please see the call for panelists here: https://sig-edu.org/news/bea21-call-for-panelists/. SHARED TASKS Vocabulary Difficulty Prediction for English Learners Organizers: Mariano Felice (British Council) and Lucy Skidmore (British Council). Description: This shared task aims to advance research into vocabulary difficulty prediction for learners of English with diverse L1 backgrounds, an essential step towards custom content creation, computer-adaptive testing and personalised learning. In a context where traditional item calibration methods have become a bottleneck for the implementation of digital learning and assessment systems, we believe predictive NLP models can provide a more scalable, cost-effective solution. The goal of this shared task is to build regression models to predict the difficulty of English words given a learner’s L1. We believe this new shared task provides a novel approach to vocabulary modelling, offering a multidimensional perspective that has not been explored in previous work. 
To this aim, we will use the British Council’s Knowledge-based Vocabulary Lists (KVL), a multilingual dataset with psychometrically calibrated difficulty scores. We believe this unique dataset is not only an invaluable contribution to the NLP community but also a powerful resource that will enable in-depth investigations into how linguistic features, L1 background and contextual cues influence vocabulary difficulty. For more information on how to participate and latest updates, please refer to the shared task website: https://www.britishcouncil.org/data-science-and-insights/bea2026st Rubric-based Short Answer Scoring for German Organizers: Sebastian Gombert (DIPF), Zhifan Sun (DIPF), Fabian Zehner (DIPF), Jannik Lossjew (IPN), Tobias Wyrwich (IPN), Berrit Katharina Czinczel (IPN), David Bednorz (IPN), Sascha Bernholt (IPN), Knut Neumann (IPN), Ute Harms (IPN), Aiso Heinze (IPN), and Hendrik Drachsler (DIPF) Description: Short answer scoring is a well-established task in educational natural language processing. In this shared task, we introduce and focus on rubric-based short-answer scoring, a task formulation in which models are provided with a question, a student answer, and a textual scoring rubric that specifies criteria for each possible score level. Successfully solving this task requires models to interpret the semantics of scoring rubrics and apply their criteria to previously unseen answers, closely mirroring how human raters assign scores in educational assessment. Although rubrics have been used as auxiliary information in prior work on free-text scoring and LLM-based approaches, there has been little focused investigation of rubric-based short-answer scoring as a task in its own right. This setting poses distinct challenges, including ambiguous or underspecified rubric criteria and a wide range of valid student responses. 
With this shared task, we aim to stimulate systematic research on rubric-based scoring, assess how well current NLP methods can reason over rubrics, and identify promising modeling strategies. Additionally, by providing a German-language dataset, the shared task contributes a new non-English benchmark to the field. For more information on how to participate and latest updates, please refer to the shared task website: https://edutec.science/bea-2026-shared-task/ TUTORIAL Theory of Mind and Application in Educational Context Organizers: Effat Farhana (Auburn University), Maha Zainab (Auburn University), Qiaosi Wang (Carnegie Mellon University), Niloofar Mireshghallah (Carnegie Mellon University), Ramira van der Meulen (Leiden University), Max van Duijn (Leiden University). Description: This tutorial examines the integration of Theory of Mind (ToM) into AI-driven online tutoring systems, focusing on how advanced technologies, such as Large Language Models (LLMs), can model learners’ cognitive and emotional states to provide adaptive, personalized feedback. Participants will learn foundational principles of ToM from cognitive science and psychology and how these concepts can be operationalized in AI systems. We will discuss mutual ToM, where both AI tutors and learners maintain models of each other’s mental states, and address challenges such as detecting learner misconceptions, modeling meta-cognition, and maintaining privacy in data-driven tutoring. The tutorial also presents hands-on demonstrations of Machine ToM applied to programming education using datasets such as CS1QA and CodeQA, which contain Java and Python samples. By combining conceptual foundations, research insights, and practical exercises, this tutorial provides a comprehensive overview of designing human-centered, ethically aware, and cognitively informed AI tutoring systems. IMPORTANT DATES All deadlines are 11.59 pm UTC-12 (anywhere on earth). 
* Submission deadline: Monday, March 23, 2026 * Notification of acceptance: Tuesday, April 28, 2026 * Camera-ready papers due: Tuesday, May 12, 2026 * Workshop: Thursday, July 2, and Friday, July 3, 2026 SUBMISSION INFORMATION We will be using the ACL Submission Guidelines for the BEA Workshop this year. Authors are invited to submit a long paper of up to eight (8) pages of content, plus unlimited references; final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account. We also invite short papers of up to four (4) pages of content, plus unlimited references. Upon acceptance, short papers will be given five (5) content pages in the proceedings. Authors are encouraged to use this additional page to address reviewers’ comments in their final versions. We generally follow ACL submission guidelines and will require that all submitted papers should include a dedicated "Limitations" section, which does not count toward the page limit. Papers which describe systems are also invited to give a demo of their system. If you would like to present a demo in addition to presenting the paper, please make sure to select either “long paper + demo” or “short paper + demo” under “Submission Category” in the START submission page. Previously published papers cannot be accepted. The submissions will be reviewed by the program committee. As reviewing will be blind, please ensure that papers are anonymous. Self-references that reveal the author’s identity, e.g., “We previously showed (Smith, 1991) …”, should be avoided. Instead, use citations such as “Smith previously showed (Smith, 1991) …”. We have also included conflict of interest in the submission form. You should mark all potential reviewers who have been authors on the paper, are from the same research group or institution, or who have seen versions of this paper or discussed it with you. 
Link for submissions: https://softconf.com/acl2026/bea2026/ DOUBLE SUBMISSION POLICY We will follow the official ACL double-submission policy. Specifically, papers being submitted both to BEA and another conference or workshop must: * Note on the title page the other conference or workshop to which they are being submitted. * State on the title page that if the authors choose to present their paper at BEA (assuming it was accepted), then the paper will be withdrawn from other conferences and workshops. ORGANIZING COMMITTEE * Ekaterina Kochmar, MBZUAI * Andrea Horbach, Hildesheim University * Ronja Laarmann-Quante, Ruhr University Bochum * Marie Bexte, FernUniversität in Hagen * Anaïs Tack, KU Leuven, imec * Victoria Yaneva, National Board of Medical Examiners * Bashar Alhafni, MBZUAI * Zheng Yuan, University of Sheffield * Jill Burstein, Duolingo * Stefano Bannò, Cambridge University Workshop contact email address: bea.nlp.workshop(a)gmail.com<mailto:bea.nlp.workshop@gmail.com> PROGRAM COMMITTEE Tazin Afrin; David Alfter; Bashar Alhafni; Maaz Amjad; Nischal Ashok Kumar; Stefano Bannò; Michael Gringo Angelo Bayona; Lee Becker; Beata Beigman Klebanov; Luca Benedetto; Bhavya Bhavya; Serge Bibauw; Ted Briscoe; Dominique Brunato; Jie Cao; Dan Carpenter; Jeevan Chapagain; Guanliang Chen; Mei-Hua Chen; Christopher Davis; Orphee De Clercq; Kordula De Kuthy; Jasper Degraeuwe; Dushyanta Dhyani; Yuning Ding; Rahul Divekar; Kosuke Doi; Mohsen Dorodchi; Yo Ehara; Hamza El Alaoui; Sarra El Ayari; Andrew Emerson; Yao-Chung Fan; Mariano Felice; Nigel Fernandez; Michael Flor; Thomas François; Thomas Gaillat; Ananya Ganesh; Ritik Garg; Sebastian Gombert; Samuel González López; Cyril Goutte; Abigail Gurin Schleifer; Na-Rae Han; Ching Nam Hang; Jiangang Hao; Aki Härmä; Hasnain Heickal; Chieh-Yang Huang; Chung-Chi Huang; Radu Tudor Ionescu; Elsayed Issa; N J Karthika; Anisia Katinskaia; Elma Kerz; Fazel Keshtkar; Grandee Lee; Ji-Ung Lee; Arun Balajiee Lekshmi Narayanan; Jiazheng Li; 
Anastassia Loukina; Wanjing Anya Ma; Jakub Macina; Lieve Macken; Nitin Madnani; Arianna Masciolini; Detmar Meurers; Michael Mohler; Phoebe Mulcaire; Ricardo Muñoz Sánchez; Sungjin Nam; Diane Napolitano; Huy Nguyen; S Jaya Nirmala; Sergiu Nisioi; Michael Noah-Manuel; Adam Nohejl; Amin Omidvar; Daniel Oyeniran; Robert Östling; Ulrike Pado; Yannick Parmentier; Ted Pedersen; Mengyang Qiu; Martí Quixal; Chatrine Qwaider; Arjun Ramesh Rao; Vivi Peggie Rantung; Manikandan Ravikiran; Hanumant Redkar; Robert Reynolds; Saed Rezayi; Frankie Robertson; Aiala Rosá; Andreas Säuberli; Nicy Scaria; Ronald Seoh; Pritam Sil; Astha Singh; Lucy Skidmore; Maja Stahl; Katherine Stasaski; Helmer Strik; Hakyung Sung; Sowmya Vajjala; Elena Volodina; Nikhil Wani; Alistair Willis; Fabian Zehner. If you would like to join our PC, please fill in the form: https://forms.gle/gtKo6Bx6EFmwWf9w5
