April 2026 - Corpora

CfP: RiCL Journal special issue - Learner Corpus Research meets the Common European Framework of Reference for Languages and the Companion Volume
by María Belén Díez Bedmar 08 Apr '26

*CfP: special issue of Research in Corpus Linguistics (RiCL)*
*‘Learner Corpus Research meets the Common European Framework of Reference for Languages and the Companion Volume’*

Due to the importance of the *Common European Framework of Reference for Languages* (CEFR) and the *Companion Volume* (CV) (Council of Europe, 2001, 2020) for the learning, teaching and assessment of languages, most learner corpora nowadays use the learner’s CEFR level to specify the student’s communicative language competence level or proficiency level. Most learner corpora compiled in CEFR-aligned high-stakes foreign language accreditation/certification exams, for instance the *Cambridge Learner Corpus* (O’Keeffe & Mark, 2017) or the *FineDesc Learner Corpus* (Díez-Bedmar, 2025), are composed of the production of learners of English in successful certification/accreditation exams at different CEFR levels. Learner corpora for other target languages are compiled in similar foreign language accreditation/certification exam conditions, such as the CELI corpus for Italian (Spina et al., 2023) or the Merlin corpus for Italian, German and Czech (Wisniewski, 2020). Other learner corpora composed of accreditation/certification exams that are not aligned to the CEFR, or learner corpora compiled in other contexts, have also been partially or fully aligned to the CEFR levels, as reported by Gablasova et al. (2019) regarding the *Trinity Lancaster Corpus*, Thewissen (2013) concerning *ICLE* or Tono (2018) as to the *JEFLL corpus*.

The use of Learner Corpus Research (LCR) results from CEFR-aligned learner corpora to inform or facilitate the implementation of the CEFR/CV is, however, still limited. Among the most important LCR contributions in this respect are those by the *English Profile Project* (Salamoura, 2008) and the CEFR-J Project (Tono, 2019). The former used the *Cambridge Learner Corpus* to provide linguistic information on the language produced at each CEFR level (UCLES/CUP, 2011) and freely available online resources (e.g., the English Grammar Profile and the English Vocabulary Profile). The latter employed learner corpora, among other corpus types, to adapt the CEFR for the L1 Japanese EFL context (12 sublevels) and to create the Vocabulary Profile and the Grammar Profile.

Despite these efforts, CEFR/CV end-users and stakeholders find it difficult to implement the CEFR/CV in their L1 contexts, in most cases because of the language-neutral nature of the CEFR/CV descriptors (see Díez-Bedmar & Byram, 2019; Díez-Bedmar & Luque-Agulló, 2023; Luque-Agulló & Díez-Bedmar, 2025). For this reason, they demand fine-tuned descriptors, i.e., CEFR/CV descriptors informed by CEFR-aligned L1-specific learner corpus results (Díez-Bedmar, 2018). In the Spanish context, the *FineDesc Project* (Grant PID2020-117041GA-I00, funded by MICIU/AEI/10.13039/501100011033) has provided L1 Spanish CEFR/CV end-users and stakeholders with fine-tuned descriptors thanks to the analysis of the 1.3-million-word *FineDesc Learner Corpus* (Díez-Bedmar, 2025), a freely available L1-specific learner corpus of texts by L1 Spanish monolingual students (or bilingual students of L1 Spanish and a co-official language in Spain). The learner corpus results have informed fine-tuned descriptors not only for linguistic competence, but also for sociolinguistic and pragmatic competences, when students engage in different communicative language activities at B1, B2 and C1 levels (see Díez-Bedmar et al., 2026). These fine-tuned descriptors aim to pave the way for the implementation of the CEFR/CV in the L1 Spanish context.

These are just some examples of how LCR may inform the CEFR/CV and facilitate its implementation in different contexts. Other efforts are being made by the LCR community using either general CEFR-aligned learner corpora or L1-specific ones, as shown in some papers presented at the International Online Conference ‘Bringing together research on the CEFR/CV and LCR: a focus on descriptors’, which was organized by the FineDesc Project (https://web.ujaen.es/investiga/finedesc/index.php).

The objective of this special issue is to bring together research on the different ways in which LCR may meet the CEFR/CV. Contributions which employ any reliably CEFR-aligned learner corpus with this objective in mind are welcome, whether they were presented at the conference or not. Researchers who do not have access to any CEFR-aligned learner corpus are encouraged to use the *FineDesc Learner Corpus*, freely available at www.finedesc.com.

*Potential topics for this special issue are (but are not limited to):*

- The fine-tuning of CEFR/CV descriptors for L1-specific contexts thanks to LCR.
- The integration of CAF results in fine-tuned descriptors.
- The (cross-sectional) analysis of CEFR-aligned learner corpora considering the linguistic, sociolinguistic or pragmatic competences to inform CEFR-aligned pedagogical resources/language assessment.
- The exploitation of CEFR-aligned learner corpora to design tools/software which may help analyse learner corpora at different CEFR levels.
- The overcoming of any difficulties in CEFR/CV implementation with the help of LCR.

*Important dates*

- Deadline for proposals: April 30, 2026
- Outcome of proposal review: May 21, 2026
- Deadline for manuscript first drafts: December 1, 2026
- Notification of reviewer outcome: March 5, 2027
- Deadline for manuscript final drafts: May 28, 2027
- Special issue publication: Autumn 2027

*Proposal format and submission*

Potential contributors should prepare an extended 800-word abstract of the proposed paper following RiCL’s submission guidelines, which can be found at https://ricl.aelinco.es/index.php/ricl/about/submissions

The abstract should include a tentative title, motivate the study, state the research questions, provide methodological information (the learner corpus or corpora analysed, the procedure employed to align it/them to CEFR/CV levels, and the statistical tests employed), present tentative results and explain clearly how the results inform the CEFR/CV and its implementation.

Please submit your proposal to the special issue editor, María Belén Díez-Bedmar (belendb(a)ujaen.es), before the deadline (April 30, 2026), including your name(s), email(s) and affiliation(s) in the proposal.

*Peer review*

All accepted manuscripts will undergo double-blind peer review.

*References*

Council of Europe (2001). Common European Framework of Reference for Languages: learning, teaching, assessment. Council of Europe Publishing.

Council of Europe (2020). Common European Framework of Reference for Languages: learning, teaching, assessment – Companion Volume. Council of Europe Publishing.

Díez-Bedmar, M. B. (2018). Fine-tuning descriptors for CEFR B1 level: insights from learner corpora. *ELT Journal*, 72(2), 199-209. https://doi.org/10.1093/elt/ccx052

Díez-Bedmar, M. B. (2025). FineDesc Learner Corpus 2.0 (España, 2510243469857). SafeCreative. https://www.safecreative.org/validity

Díez-Bedmar, M. B., & Byram, M. (2019). The current influence of the CEFR in secondary education: teachers’ perceptions. *Language, Culture and Curriculum*, *32*(1), 1–15. https://doi.org/10.1080/07908318.2018.1493492

Díez-Bedmar, M. B., Laso-Martín, N. J., Maíz-Arévalo, C., & Carrió-Pastor, M. L. (2026). Supplement to the Common European Framework of Reference for Languages and the Companion Volume: L1 Spanish users (Spain, 2601214327051). SafeCreative. https://www.safecreative.org/validity

Díez-Bedmar, M. B., & Luque-Agulló, G. (2023). Analysing the CEFR/CV in University Language Centres in Spain: The Raters' Perspective. In M. Fernández Álvarez & A. L. Gordenstein Montes (Eds.), *Global Perspectives on Effective Assessment in English Language Teaching* (pp. 1-33). IGI Global.

Gablasova, D., Brezina, V., & McEnery, T. (2019). The Trinity Lancaster Corpus. Development, description and application. *International Journal of Learner Corpus Research*, *5*(2), 126-158. https://doi.org/10.1075/ijlcr.19001.gab

Luque-Agulló, G., & Díez-Bedmar, M. B. (2025). Listening to the teachers: CEFR implementation in University language Centres in Spain. *Revista de Lingüística y Lenguas Aplicadas* (RLyLA), *20*, 72-86. https://doi.org/10.4995/rlyla.2025.21285

O’Keeffe, A., & Mark, G. (2017). The English Grammar Profile of learner competence: Methodology and key findings. *International Journal of Corpus Linguistics*, *22*(4), 457-489. https://doi.org/10.1075/ijcl.14086.oke

Salamoura, A. (2008). Aligning English Profile research data to the CEFR. Cambridge ESOL: *Research Notes*, *33*, 5–7.

Spina, S., Fioravanti, I., Forti, L., & Zanda, F. (2023). The CELI corpus: Design and linguistic annotation of a new online learner corpus. *Second Language Research*, 40(2), 457-477. https://doi.org/10.1177/02676583231176370

Thewissen, J. (2013). Capturing L2 accuracy developmental patterns: Insights from an error-tagged EFL learner corpus. *The Modern Language Journal*, *97*(S1), 77-101.

Tono, Y. (2018). Corpus approaches to L2 Learner Profiling Research. In Y. N. Leung, J. Katchen, S. Y. Hwang & Y. Chen (Eds.), Reconceptualizing English language teaching and learning in the 21st century: A special monograph in memory of Professor Kai-Chong Cheung (pp. 390-402). Taipei, Taiwan: Crane Publishing Company.

Tono, Y. (2019). Coming full circle – from CEFR to CEFR-J and back. *CEFR Journal. Research and Practice*, *1*, 5-17. https://doi.org/10.37546/JALTSIG.CEFR1-1

UCLES/CUP (2011). *English Profile. Introducing the CEFR for English Version 1.1*. Cambridge University Press.

Wisniewski, K. (2020). SLA developmental stages in the CEFR-related learner corpus MERLIN: Inversion and verb-end structures in German A2 and B1 learner texts. *International Journal of Learner Corpus Research*, *6*(1), 1–17. https://doi.org/10.1075/ijlcr.18008.wis

[Job] Postdoctoral researcher (m/f/x) - emoji semantics (April 28, 2026)
by Tatjana Scheffler 08 Apr '26

Website for applications: https://jobs.ruhr-uni-bochum.de/jobposting/f5001c40d914e7ce325557bfe526aa88…
Deadline: April 28, 2026

To fill a fixed-term, full-time position (39.83 hours/week = 100%) by July 01, 2026 or at the earliest possible date, we are looking for one

Post-doctoral researcher (m/f/x)

The project EmDiCom 2 "Decompositional Semantics of Face Emojis in Digital Communication" will develop a formal semantics for emojis as a prime example of visual communication within the DFG priority program ViCom ("Visual Communication. Theoretical, Empirical, and Applied Perspectives"). The project is led by co-PIs Prof. Dr. Tatjana Scheffler (Ruhr-Universität Bochum, Germany) and Prof. Dr. Patrick Grosz (University of Oslo, Norway).

The project plans to investigate emojis from a formal semantics perspective informed by experimental data. Our research primarily concentrates on the semantics of face emojis, and builds on the hypothesis that face emojis have partly conventionalized (symbolic) meaning, but also incorporate iconic meaning due to their resemblance to human faces, which they are meant to depict. First, we aim to delineate the formal semantic denotation of face emojis and to determine whether it can be decomposed into smaller meaning components. Second, we aim to empirically establish the semantic type of expressive face emojis, i.e., whether or not they serve as functions, taking a propositional argument that they comment on.

Your tasks:
- Independent research on the semantics and pragmatics of emojis in the research project "Decompositional Semantics of Face Emojis in Digital Communication (EmDiCom 2)"
- Participation in the interdisciplinary DFG priority program "Visual Communication" (e.g., participation in network meetings and short-term collaborations)
- Planning and conducting linguistic online experiments (acceptability studies, reading time measurements)
- Collaboration on publications and presentations at international conferences
- Supervision of research assistants
- Organization of scientific events (workshops)

Your profile:
- An above-average linguistics PhD is required
- Background in theoretical linguistics, preferably formal semantics/pragmatics
- Demonstrated experience in experimental linguistics and statistical methods
- Knowledge of tools for creating and conducting online experiments is an advantage
- Interest in visual communication (e.g., gestures, facial expressions, emojis, Sign Language)
- Excellent English skills

We offer:
- Challenging and varied tasks with a high degree of personal responsibility
- Exciting research on a current topic
- International cooperation with the "Super Linguistics" group at the University of Oslo and within the "Visual Communication" DFG priority program
- A friendly and enthusiastic team at the interface of formal, digital, and computational linguistics
- The opportunity to join one of the largest universities in Germany, part of the University Alliance Ruhr
- Professional development, mentoring, and training

Further information:
The position is salaried and based on the collective agreement of the Länder (TV-L). If the personal and collective agreement requirements are met, the employee will receive pay grade E13 TV-L. Further information can be found at https://oeffentlicher-dienst.info/ (in German). The place of work is Ruhr University Bochum. Interviews are expected to take place via Zoom in April and May 2026.

Applicants should submit (1) a brief cover letter explaining their interest in the position, (2) a full CV including a list of publications, (3) two examples of their work, and (4) the names and email addresses of two potential referees, all as a single PDF document via the application portal.

RUB sees itself as a university with an international presence. The campus languages are German and English. Competence in at least one of the two languages and the willingness to learn the other are a prerequisite. RUB provides corresponding free courses for employees. German language courses are offered by the University Language Center (ZFA) in the field of German as a Foreign Language (DaF). https://www.daf.ruhr-uni-bochum.de/sbgk/index.html.en

The Staff Council has the right to participate in all selection interviews. At the request of a candidate (m/f/x), it will ensure its participation in the entire procedure. Please contact wpr(a)rub.de.

The Ruhr-Universität Bochum is one of Germany’s leading research universities, addressing the whole range of academic disciplines. A highly dynamic setting enables researchers and students to work across the traditional boundaries of academic subjects and faculties. To create knowledge networks within and beyond the university is Ruhr-Universität Bochum’s declared aim.

The Ruhr-Universität Bochum stands for diversity and equal opportunities. For this reason, we favour a working environment composed of heterogeneous teams, and seek to promote the careers of individuals who are underrepresented in our respective professional areas. The Ruhr-Universität Bochum expressly requests job applications from women. In areas in which they are underrepresented they will be given preference in the case of equivalent qualifications with male candidates. Applications from individuals with disabilities are most welcome.

Contact persons for further information:
Prof. Dr. Tatjana Scheffler, +49 234 32 21471
Prof. Dr. Patrick Grosz

Travel costs, accommodation costs and loss of earnings or other application costs for job interviews can unfortunately not be reimbursed.

We look forward to receiving your application via our online application portal by 2026-04-28. Please make sure to mention the reference number ANR 5610.

---
Tatjana Scheffler (she/her)
GB 5/157
Ruhr-Universität Bochum
Digital Forensic Linguistics
Fakultät für Philologie, Germanistisches Institut
Universitätsstraße 150
44780 Bochum
Germany
Mail: tatjana.scheffler(a)rub.de
Web: http://staff.germanistik.rub.de/digitale-forensische-linguistik/
Mastodon: https://fediscience.org/@tschfflr
Tel.: +49 234 32-21471

CLEF-2026 CheckThat! Lab -- 3rd Call for Participation
by Julia Maria Struß 08 Apr '26

(apologies for cross-posting)

Dear colleague,

We invite you to participate in the 2026 edition of the CheckThat! Lab at CLEF 2026. This year, we feature three tasks (two follow-up and one new) that correspond to important components within and around the full fact-checking pipeline in multiple languages:

Task 1 Source Retrieval for Scientific Web Claims: Given a social media post that contains a scientific claim and an implicit reference to a scientific paper (mentions it without a URL), retrieve the mentioned paper from a pool of candidate papers. Available in English, German, and French.

Task 2 Fact-Checking Numerical Claims: Given claims, potential evidence, and possible reasoning paths, rank the reasoning paths and output a verdict. Available in Arabic, English, and Spanish.

Task 3 Generating Full Fact-Checking Articles: Given a claim, its veracity, and a set of evidence documents consulted for fact-checking the claim, generate a full fact-checking article.

Register and participate: https://clef-labs-registration.dipintra.it/
Further information: https://checkthat.gitlab.io/
Datasets: https://gitlab.com/checkthat_lab/clef2026-checkthat-lab (training and validation materials released)
Discord Server: https://discord.gg/PEMh4a2YHV

Important Dates
---------------------
- 23 April 2026: Lab registration closes
- 24 April 2026: Beginning of the evaluation cycle (test sets release)
- 7 May 2026 (23:59 AOE): End of the evaluation cycle (run submission)
- 28 May 2026: Deadline for the submission of working notes [CEUR-WS]
- 29 May – 27 June 2026: Review process of participant papers
- 30 June 2026: Notification of Acceptance for working notes [CEUR-WS]
- 6 July 2026: Camera Ready Copy of working note papers [CEUR-WS]
- 10 July 2026: Regular Conference Registration Ends
- 31 August 2026: Late Conference Registration Ends
- 21-24 September 2026: CLEF 2026 Conference in Jena, Germany

Best regards,
The CLEF-2026 CheckThat! Lab Shared Task Organizers

--
Prof. Dr. Julia Maria Struß
Fachhochschule Potsdam
University of Applied Sciences
Fachbereich Informationswissenschaften
Kiepenheuerallee 5
14469 Potsdam
Telefon: +49 331 580 4532
Zoom: https://fh-potsdam.zoom-x.de/my/juliamstruss

HPSG 2026 -- Final CfP (and deadline extension)
by Antonio Machicao y Priemer 07 Apr '26

THIRD CfP: The 33rd International Conference on Head-Driven Phrase Structure Grammar (Norway)

Short Title: HPSG 2026
Date: 03-Aug-2026 - 04-Aug-2026
Location: Western Norway University of Applied Sciences (Bergen, Norway)
Contact: Petter Haugereid, Berthold Crysmann & Antonio Machicao y Priemer
Email: hpsg2026(a)easychair.org
Conference Website: https://petterha.github.io/hpsg2026/
Conference fee: 68€ (faculty) / 43€ (student)
Linguistic Field(s): General Linguistics; Linguistic Theories; Computational Linguistics; Syntax; Morphology; Semantics; Cognitive Science

Meeting Description:
The 33rd International Conference on Head-Driven Phrase Structure Grammar will be held on 03-04 August 2026 at the Western Norway University of Applied Sciences (Bergen, Norway). The HPSG 2026 conference will be a two-day main conference (03-04 August). It will be co-located with the DELPH-IN meeting held over the preceding week (27-31 July).

Anonymous abstracts are invited that address linguistic, foundational, or computational issues relating to or in the spirit of the framework of Head-Driven Phrase Structure Grammar.

Submissions should be 4 pages long, + 1 page for data, figures & references. They should be submitted in PDF format. The submissions should not include the authors’ names, and authors are asked to avoid self-references. Presentations are in-person by default, although exceptions can be negotiated.

All abstracts should be submitted by 17 April 2026 (deadline extended), via Easychair: https://easychair.org/conferences/?conf=hpsg2026

All abstracts will be reviewed anonymously by at least two reviewers. Each accepted abstract will be given 30 minutes for presentation. Additionally, 10 minutes will be reserved for discussion.

Deadline for abstracts: 17 April 2026 (Old deadline: 10 April 2026)
Reviews due: 10 May 2026
Notification of acceptance: 15 May 2026
Conference: 03-04 August 2026

Keynote speakers:
* Dag Trygve Truslew Haug (Universitetet i Oslo, Norway)
* Nurit Melnik (Open University, Israel)

Conference proceedings submission: 15 October 2026
A call for contributions to the proceedings will be issued after the conference. The proceedings will undergo a separate (final) round of reviews (accept/reject), to enable indexing of the proceedings. The proceedings of previous conferences are available at: https://proceedings.hpsg.xyz/

Programme Committee:
- Anne Abeillé (LLF, Université de Paris)
- Gabrielle Aguila-Multner (Universität Zürich)
- Emily M. Bender (University of Washington)
- Gabriela Bîlbîie (University of Bucharest)
- Felix Bildhauer (Institut für Deutsche Sprache Mannheim)
- Olivier Bonami (Université Paris Diderot)
- Francis Bond (Palacký University)
- Rui Chaves (University at Buffalo, SUNY)
- Berthold Crysmann (CNRS - LLF, Université de Paris)
- Petter Haugereid (Western Norway University of Applied Sciences)
- Fabiola Henri (University at Buffalo)
- Anke Holler (University of Göttingen)
- Jong-Bok Kim (Kyung Hee University)
- Jean-Pierre Koenig (University at Buffalo, The State University of New York)
- Andy Lücking (Goethe University Frankfurt)
- Antonio Machicao y Priemer (Humboldt-Universität zu Berlin)
- Jakob Maché (Universidade de Lisboa)
- Nurit Melnik (The Open University of Israel)
- Luis Morgado Da Costa (Palacký University Olomouc)
- Stefan Müller (Humboldt-Universität zu Berlin)
- Tsuneko Nakazawa (The University of Tokyo)
- Joanna Nykiel (UC Davis)
- David Oshima (Nagoya University)
- Gerald Penn (University of Toronto)
- Frank Richter (Goethe Universität Frankfurt)
- Manfred Sailer (Goethe Universität Frankfurt)
- Frank Van Eynde (Katholieke Universiteit Leuven)
- Giuseppe Varaschin (Humboldt-Universität zu Berlin)
- Elodie Winckel (Friedrich-Alexander Universität Erlangen-Nürnberg)
- Shûichi Yatabe (The University of Tokyo)
- Eun-Jung Yoo (Seoul National University)
- Olga Zamaraeva (Universidade da Coruña)

--
Dr. Antonio Machicao y Priemer
Phone (office): +49/30/2093-9702
Homepage: https://hu.berlin/aMyP
Address (office): Dorotheenstr. 24 (Room: 3.305), 10117 Berlin
Address (post): Humboldt-Universität zu Berlin (Institut für deutsche Sprache und Linguistik), Unter den Linden 6, D-10099 Berlin

--
Dr. Antonio Machicao y Priemer
Department of German Studies and Linguistics - Humboldt-Universität zu Berlin
Homepage: https://hu.berlin/aMyP
Project: Building register into the architecture of language – an HPSG account (CRC 1412, Project A04)
Series: Textbooks in Language Science (https://langsci-press.org/catalog/series/tbls)

Open PhD position on ethical challenges in NLP at the University of Bergen, Norway
by Samia Touileb 07 Apr '26

[Apologies for cross-posting]

There is a vacancy for a PhD position at the Department of Information Science and Media Studies, at the University of Bergen, Norway.

We are seeking a highly motivated candidate for a 4-year PhD position focused on ethical challenges in NLP, including topics such as bias, fairness, safety, and value alignment. The position offers flexibility for the candidate to develop and shape their own research questions, while contributing to the broader goal of developing and advancing responsible and ethically grounded NLP systems.

Check the full announcement and application details here: https://www.jobbnorge.no/en/available-jobs/job/298789/phd-position-at-the-d…

Closing date: June 7th, 2026

If you have any questions or would like additional information, feel free to contact me.

Kind regards,
Samia

---
Samia Touileb
Associate Professor in Natural Language Processing
Department of Information Science and Media Studies, University of Bergen
MediaFutures: Research Center for Responsible Media Technology & Innovation
Fagspråksenteret: Centre for Norwegian Professional Language

[CFP] AACL-IJCNLP 2026
by ACL Announcements 07 Apr '26

Dear all,

AACL-IJCNLP 2026 (the 5th AACL & 15th IJCNLP) invites the submission of long and short papers featuring substantial, original, and unpublished research in all aspects of Computational Linguistics and Natural Language Processing.

== CFP: https://2026.aaclnet.org/calls/main_conference_papers/

The conference will be held in Hengqin, China, from November 6th to November 10th, 2026.

Important Dates
- ARR submission deadline (long & short papers): May 25, 2026
- Reviewer registration deadline for ALL authors: May 27, 2026
- Author response and author-reviewer discussion: July 7 - 13, 2026
- Meta review released: July 30, 2026
- Commitment deadline: August 26, 2026
- Notification of acceptance (long & short papers): September 7, 2026
- Camera-ready papers due (long & short): September 30, 2026
- Main Conference (dates for Workshops/Tutorials TBD): November 6 - 10, 2026

Note: All deadlines are 11:59PM UTC-12:00 ("anywhere on Earth").

AACL-IJCNLP 2026 aims to have a broad technical program. Relevant topics for the conference include, but are not limited to, the following areas:
- Safety and Alignment in LLMs
- AI/LLM Agents
- Human-AI Interaction/Cooperation
- Retrieval-Augmented Language Models
- Mathematical, Symbolic, and Logical Reasoning in NLP
- Computational Social Science, Cultural Analytics, and NLP for Social Good
- Code Models
- Interpretability, Model Editing, Transparency, and Explainability
- LLM Efficiency
- Generalizability and Transfer
- Dialogue and Interactive Systems
- Discourse, Pragmatics, and Reasoning
- Low-resource Methods for NLP
- Ethics, Bias, and Fairness
- Natural Language Generation
- Information Extraction and Retrieval
- Linguistic theories, Cognitive Modeling and Psycholinguistics
- Machine Translation
- Multilinguality and Language Diversity
- Multimodality and Language Grounding to Vision, Robotics and Beyond
- Neurosymbolic approaches to NLP
- Phonology, Morphology and Word Segmentation
- Question Answering
- Resources and Evaluation
- Semantics: Lexical, Sentence-level Semantics, Textual Inference and Other areas
- Sentiment Analysis, Stylistic Analysis, and Argument Mining
- Speech Processing and Spoken Language Understanding
- Summarization
- Hierarchical Structure Prediction, Syntax, and Parsing
- NLP Applications

== Presentation at the Conference

All accepted papers must be presented at the conference to appear in the proceedings. The conference will include both in-person and virtual presentation options.

*SEM 2026: Direct Commitment for ARR-Reviewed Papers (Deadline: 10 April 2026)
by Nedjma Ousidhoum 07 Apr '26

Dear corpora list members,

*SEM 2026 (The 15th Joint Conference on Lexical and Computational Semantics), co-located with ACL 2026, welcomes direct commitments of pre-reviewed papers from ARR.

If your paper has already been reviewed through ARR and you would like it to be considered for *SEM 2026, you can submit it through the direct commitment process.

Deadline: April 10, 2026
Commitment link: https://openreview.net/group?id=aclweb.org/StarSEM/2026/Conference

Important Dates (All deadlines are 11:59 PM UTC-12h, Anywhere on Earth)
* Notification of acceptance: May 5, 2026
* Camera-ready deadline: May 26, 2026
* Conference date: July 3, 2026 (co-located with ACL 2026)

Following ACL and ARR policies, there is no anonymity period requirement.

More information:
Website: https://starsem2026.github.io/
Call for Papers: https://starsem2026.github.io/calls/
Blog post: https://starsem2026.github.io/blog/

We look forward to your submissions.

Best regards,
*SEM Program Chairs.

CFP: Data-driven Storytelling Workshop (DDS 2026) at ISWC – Submit by July 24th
by Pasquale Lisena 07 Apr '26

Dear colleagues,

We are pleased to announce the Data-driven Storytelling: Bridging Semantics, AI, and Narrative (DDS 2026) workshop, co-located with ISWC 2026. This workshop aims to bring together researchers and practitioners working at the intersection of knowledge graphs, NLP, HCI, and generative AI to explore how semantic technologies can enhance narrative creation and engagement.

Submission Deadline: July 24th, 2026 (23:59 AoE)
Notifications: August 21st, 2026
Camera-ready Version: September 18th, 2026
Workshop Dates: October 25-26, 2026

We invite submissions of research papers, demos, and short papers that address (but are not limited to) the following topics:
• Knowledge graphs and ontologies for storytelling
• AI-driven narrative generation (LLMs, GenAI)
• Benchmarking narrative quality and coherence
• Interactive and participatory storytelling tools
• Ethics and explainability in automated storytelling

Submission Link: https://data-driven-storytelling-workshop.replit.app/

We encourage submissions from interdisciplinary fields, including semantic web, NLP, HCI, and creative industries. For more details, visit the workshop website.

Looking forward to your contributions!

Best regards,
Pasquale Lisena, EURECOM, France
Maria Angela Pellegrino, University of Salerno, Italy
Lisa-Yao Gan, Technical University Munich, Germany
Yihang Zhao, King's College London, UK
Yiwen Xing, University of Oxford, UK

Miscelánea Joins EU Diamond Discovery Hub – Call for Papers
by Miguel-Angel Benitez-Castro 06 Apr '26

Dear colleagues,

I hope this e-mail finds you well. I am writing in my capacity as Section Editor for English Language and Linguistics at _Miscelanea: A Journal of English and American Studies_: https://papiro.unizar.es/ojs/index.php/misc/en.

I am really pleased to share with you all that _Miscelanea_ has been included in the European Union's _Diamond Discovery Hub_ platform for high quality open-access journals: https://ddh.edch.eu/en/journals/3198

_Miscelanea_, with 72 issues published to date, and with number 73 coming out in June, is one of the longest-running international journals on English Studies in Spain. It is published and produced at the University of Zaragoza, and more specifically, by the Department of English and German Philology. _Miscelanea_ is a double-blind peer-reviewed journal published twice a year (in December and June), and publishes articles on English language and linguistics, on literatures written in English, and on cinema and cultural studies from the English-speaking world. We welcome submissions all year round.

As Section Editor for English Language and Linguistics, I will welcome any submission that draws upon any of the following areas and/or methodological approaches (to name but a few):
* Descriptive linguistics;
* Applied linguistics;
* Discourse analysis, Critical Discourse Analysis and Corpus-Assisted Discourse Analysis;
* Sociolinguistics;
* Systemic-Functional Linguistics;
* Translation Studies;
Etc.

I hope you will consider our publication as a potential outlet for your research. Looking forward to receiving and reading your work.

With all my best wishes,
Miguel-Angel

--
Dr. Miguel-Angel Benitez-Castro
Departamento de Filología Inglesa y Alemana
Facultad de Ciencias Sociales y Humanas, Universidad de Zaragoza (Spain)
C/ Atarazanas, 4, 44003, Teruel
https://orcid.org/0000-0001-8514-5943
https://www.researchgate.net/profile/Miguel-Angel-Benitez-Castro
https://scholar.google.com/citations?user=wx8VaDcAAAAJ&hl=es [1]
Member of: _IUI Biocomputation and Physics of Complex Systems (BIFI) [2]_, Universidad de Zaragoza
Language and Linguistics Editor - _Miscelánea: A Journal of English and American Studies [3]_

Links:
------
[1] https://scholar.google.com/citations?user=wx8VaDcAAAAJ&hl=es
[2] https://bifi.es/
[3] https://papiro.unizar.es/ojs/index.php/misc/index

2nd Call for SemEval Task Proposals 2027
by Ekaterina Kochmar 06 Apr '26

Introduction

We invite proposals for tasks to be run as part of SemEval-2027. SemEval (the International Workshop on Semantic Evaluation) is an ongoing series of evaluations of computational semantics systems, organized under the umbrella of SIGLEX, the Special Interest Group on the Lexicon of the Association for Computational Linguistics.

SemEval tasks investigate the nature of meaning in natural languages, exploring how to characterize and compute meaning. This is achieved in practical terms, using shared datasets and standardized evaluation metrics to quantify the strengths and weaknesses of possible solutions. SemEval tasks encompass a broad range of semantic topics from the lexical level to the discourse level, including word sense identification, semantic parsing, coreference resolution, and sentiment analysis, among others.

For SemEval-2027, we welcome tasks that can test an automatic system for semantic analysis of text (e.g., intrinsic semantic evaluation, or an application-oriented evaluation). We especially encourage tasks for languages other than English, cross-lingual tasks, and tasks that develop novel applications of computational semantics. See the websites of previous editions of SemEval to get an idea of the range of tasks explored, e.g., SemEval-2020 (http://alt.qcri.org/semeval2020/) and SemEval-2021/2026 (https://semeval.github.io/).

We strongly encourage proposals based on pilot studies that have already generated initial data, evaluation measures, and baselines. In this way, we can avoid unforeseen challenges down the road that may delay the task. We suggest providing a reasonable baseline (e.g., a Transformer / LLM baseline for a classification task) in addition to the majority vote / random guess.
In case you are not sure whether a task is suitable for SemEval, please feel free to get in touch with the SemEval organizers at semevalorganizers@gmail.com to discuss your idea.

The submission webpage is: https://softconf.com/acl2026/semevaltasks2027/

Task Selection

Task proposals will be reviewed by experts, and reviews will serve as the basis for acceptance decisions. Everything else being equal, more innovative new tasks will be given preference over task reruns. Task proposals will be evaluated on:

- Novelty: Is the task on a compelling new problem that has not been explored much in the community? Is the task a rerun, but covering substantially new ground (new subtasks, new types of data, new languages, etc.; one addition is not sufficient)?
- Interest: Is the proposed task likely to attract a sufficient number of participants?
- Data: Are the plans for collecting data convincing? Will the resulting data be of high quality? Will annotations have meaningfully high inter-annotator agreement? Have all appropriate licenses for use and re-use of the data after the evaluation been secured? Have all international privacy concerns been addressed? Will the data annotation be ready on time?
- Evaluation: Is the methodology for evaluation sound? Is the necessary infrastructure available, or can it be built in time for the shared task? Will research inspired by this task be able to evaluate in the same manner and on the same data after the initial task? Is the task significantly challenging (e.g., room for improvement over the baselines)?
- Impact: What is the expected impact of the data in this task on future research beyond the SemEval Workshop?
- Ethical: The data must be compliant with privacy policies, e.g., avoid personally identifiable information (PII). Tasks aimed at identifying specific people will not be accepted. Avoid medical decision making (comply with HIPAA; do not try to replace medical professionals, especially where mental health is concerned). These examples are representative, not exhaustive.

Roles:

- Lead Organizer - main point of contact, expected to ensure deliverables are met on time and to contribute to task duties (see below).
- Co-Organizers - provide significant contributions to ensuring the task runs smoothly. Some examples include maintaining communication with task participants, preparing data, creating and running evaluation scripts, and leading paper reviewing and acceptance.
- Advisory Organizers - more of a supervisory role; may not contribute to detailed tasks, but will provide guidance and support.

New Tasks vs. Task Reruns

We welcome both new tasks and task reruns. For a new task, the proposal should address whether the task would be able to attract participants. Preference will be given to novel tasks that have not received much attention yet.

For reruns of previous shared tasks (whether or not the previous task was part of SemEval), the proposal should address the need for another iteration of the task. Valid reasons include: a new form of evaluation (e.g., a new evaluation metric, a new application-oriented scenario), new genres or domains (e.g., social media, domain-specific corpora), or a significant expansion in scale. We further discourage carrying over a previous task and just adding new subtasks, as this can lead to the accumulation of too many subtasks. Evaluating on a different dataset with the same task formulation, or evaluating on the same dataset with a different evaluation metric, typically should not be considered a separate subtask.

Task Organization

We welcome people who have never organized a SemEval task before, as well as those who have. Apart from providing a dataset, task organizers are expected to:

- Verify the data annotations have sufficient inter-annotator agreement.
- Verify licenses for the data allow its use in the competition and afterwards. In particular, text that is publicly available online is not necessarily in the public domain; unless a license has been provided, the author retains all rights associated with their work, including copying, sharing and publishing. For more information, see: https://creativecommons.org/faq/#what-is-copyright-and-why-does-it-matter
- Resolve any potential security, privacy, or ethical concerns about the data.
- Commit to making the data available after the task in a long-term repository under an appropriate license, preferably using Zenodo: https://zenodo.org/communities/semeval/
- Provide task participants with format checkers and standard scorers.
- Provide task participants with baseline systems to use as a starting point (in order to lower the obstacles to participation). A baseline system typically contains code that reads the data, creates a baseline response (e.g., random guessing, majority class prediction), and outputs the evaluation results. Whenever possible, baseline systems should be written in widely used programming languages and/or should be implemented as a component for standard NLP pipelines.
- Create a mailing list and website for the task and post all relevant information there.
- Create a CodaLab or other similar competition for the task and upload the evaluation script.
- Manage submissions on CodaLab or a similar competition site.
- Write a task description paper to be included in SemEval proceedings, and present it at the workshop.
- Manage participants’ submissions of system description papers, manage participants’ peer review of each other’s papers, and possibly shepherd papers that need additional help in improving the writing.
- Review other task description papers.

Desk Rejects

- To ensure tasks have sufficient support, we require a minimum of two organizers at the time of proposal submission. A task proposal with only one organizer will be desk-rejected. Running a SemEval task is a significant time commitment; therefore, we highly recommend that a task have at least three or four organizers.
- A person can be a lead organizer on only one task. The second mandatory organizer on the task must be committed to the task as a key co-organizer. Any other organizers (beyond the lead and co-organizer) can participate in other tasks.
- All data should have a research-friendly license. The licensing must be provided in the proposal.
- Task organizers must commit to keeping the data available after the task, either by keeping the task alive or by uploading the data to Zenodo or another permanent public data storage location, and sharing the link with the organizers.

=== Important dates ===

- Task proposals due 13 April 2026 (Anywhere on Earth)
- Task selection notification 25 May 2026

=== Preliminary timetable ===

- Sample data ready 15 July 2026
- Training data ready 1 September 2026
- Evaluation data ready 1 December 2026 (internal deadline; not for public release)
- Evaluation start 10 January 2027
- Evaluation end by 31 January 2027 (latest date; task organizers may choose an earlier date)
- Paper submission due February 2027
- Notification to authors March 2027
- Camera ready due April 2027
- SemEval workshop Summer 2027 (co-located with a major NLP conference)

Tasks that fail to keep up with crucial deadlines (such as the dates for having the task and CodaLab website up and dates for uploading sample, training, and evaluation data) may be cancelled at the discretion of SemEval organizers. While consideration will be given to extenuating circumstances, our goal is to provide sufficient time for the participants to develop strong and well-thought-out systems. Cancelled tasks will be encouraged to submit proposals for the subsequent year’s SemEval. To reduce the risk of tasks failing to meet the deadlines, we are unlikely to accept multiple tasks with overlap in the task organizers.
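
As a rough illustration of the kind of baseline system described under Task Organization (read the data, produce a majority-class response, report a score), the following Python sketch may help. The toy data, label names, and function names are all hypothetical and not part of any SemEval task; accuracy is used only as a stand-in for whatever metric a task actually adopts.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Return the most frequent label seen in the training data."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(gold, predicted):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


if __name__ == "__main__":
    # Hypothetical toy data standing in for a task's training and test files.
    train = [("great film", "pos"), ("loved it", "pos"),
             ("dull plot", "neg"), ("fine acting", "pos")]
    test = [("a delight", "pos"), ("too long", "neg"),
            ("charming", "pos"), ("boring", "neg")]

    # The baseline ignores the text and always predicts the majority label.
    label = majority_class_baseline([y for _, y in train])
    predictions = [label for _ in test]
    print(f"majority label: {label}")
    print(f"accuracy: {accuracy([y for _, y in test], predictions):.2f}")
```

A real baseline would replace the in-memory lists with the task's own data reader and call the task's official scorer instead of the `accuracy` helper.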
Submission Details

The task proposal should be a self-contained document of no longer than 3 pages (plus additional pages for references). All submissions must be in PDF format, following the ACL template: https://github.com/acl-org/acl-style-files

Each proposal should contain the following:

- Overview
  - Summary of the task
  - Why this task is needed and which communities would be interested in participating
  - Expected impact of the task
- Data & Resources
  - How the training/testing data will be produced. Please discuss whether existing corpora will be reused.
  - Details of copyright and license, so that the data can be used by the research community both during the SemEval evaluation and afterwards
  - How much data will be produced
  - How data quality will be ensured and evaluated
  - An example of what the data would look like
  - Resources required to produce the data and prepare the task for participants (annotation cost, annotation time, computation time, etc.)
  - Assessment of any concerns with respect to ethics, privacy, or security (e.g., personally identifiable information of private individuals; potential for systems to cause harm)
- Pilot Task (strongly recommended)
  - Details of the pilot task
  - What lessons were learned, and how these will impact the task design
- Evaluation
  - The evaluation methodology to be used, including clear evaluation criteria
- For Task Reruns
  - Justification for why a new iteration of the task is needed (see criteria above)
  - What will differ from the previous iteration
  - Expected impact of the rerun compared with the previous iteration
- Task organizers
  - Names, affiliations, email addresses
  - (optional) brief description of relevant experience or expertise
  - (if applicable) years and task numbers of any SemEval tasks you have run in the past

Proposals will be reviewed by an independent group of area experts who may not have familiarity with recent SemEval tasks, and therefore, all proposals should be written in a self-explanatory manner and contain sufficient examples.

The submission webpage is: https://softconf.com/acl2026/semevaltasks2027/

=== Chairs ===

Debanjan Ghosh, Analog Devices, USA
Kai North, Cambium Assessment, USA
Ekaterina Kochmar, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), UAE
Mamoru Komachi, Hitotsubashi University, Japan
Marcos Zampieri, George Mason University, USA

Contact: semevalorganizers@gmail.com
