- Corpora - ELRA lists

PAN @ CLEF 2026: Shared Tasks on Authorship Analysis, Computational Ethics, and Originality Call for Participation
by Martin Potthast 09 Feb '26


We'd like to invite you to participate in the following shared tasks at PAN 2026, held in conjunction with the CLEF conference in Jena, Germany.

1. Voight-Kampff Generative AI Detection. Given a (potentially obfuscated) text, decide whether it was written by a human or an AI.
   https://pan.webis.de/clef26/pan26-web/generated-content-analysis.html
2. Text Watermarking. Insert a watermark into a given text. Then, after we have attacked the text, detect the inserted watermark.
   https://pan.webis.de/clef26/pan26-web/text-watermarking.html
3. Multi-Author Writing Style Analysis. Given a document, determine at which positions the author changes.
   https://pan.webis.de/clef26/pan26-web/style-change-detection.html
4. Generative Plagiarism Detection. Given a document and a collection of documents, your task is to identify all sources in the collection that the document plagiarizes.
   https://pan.webis.de/clef26/pan26-web/generated-plagiarism-detection.html
5. Reasoning Trajectory Detection. Detect the source and safety of LLM-generated and human-written reasoning trajectories.
   https://pan.webis.de/clef26/pan26-web/reasoning-trajectory-detection.html

More information: https://pan.webis.de/clef26/pan26-web/index.html

Important Dates
--------------------------
now: Training data released
April 23, 2026: Registration closes
May 07, 2026: Software submission deadline
May 28, 2026: Participant paper submission
June 30, 2026: Peer review notification
July 06, 2026: Camera-ready participant papers submission
September 21-24, 2026: CLEF Conference

Links
--------------------------
PAN: https://pan.webis.de
Contact: pan(a)webis.de

We are looking forward to your submission!

The PAN team


Call for Papers: LxGr2026
by Costas Gabrielatos 09 Feb '26


11th Symposium on Corpus Approaches to Lexicogrammar (LxGr2026)

CALL FOR PAPERS

Deadline for abstract submission: 1 March 2026

The symposium will take place online on Thursday 2 and Friday 3 July 2026.

Invited Speakers
Stefan Gries<https://www.stgries.info/> (University of California, Santa Barbara, USA)
Martin Hilpert<http://members.unine.ch/martin.hilpert> (University of Neuchâtel, Switzerland)
Serge Sharoff<https://ssharoff.github.io/> (University of Leeds, UK)

Organiser: Costas Gabrielatos<https://ehu.ac.uk/gabrielatos> (Edge Hill University)

LxGr primarily welcomes papers reporting on corpus-based research on any aspect of the interaction of lexis and grammar -- particularly studies that interrogate the system lexicogrammatically to get lexicogrammatical answers. However, position papers discussing theoretical or methodological issues, as well as descriptions or demonstrations of tools or resources, are also welcome, as long as they are relevant to both lexicogrammar and corpus linguistics.

If you would like to present, send an abstract of 500 words (excluding references) to lxgr(a)edgehill.ac.uk<mailto:lxgr@edgehill.ac.uk>.

* Abstracts for research papers should specify the research focus (research questions or hypotheses), the corpus, the methodology (techniques, metrics), the theoretical orientation, and the main findings.
* Abstracts for position papers should specify the theoretical orientation and the potential contribution to both lexicogrammar and corpus linguistics.
* Abstracts for tools or resources should provide a clear description of the main functions, and specify the potential contribution to both lexicogrammar and corpus linguistics.

Full papers will be allocated 35 minutes (including 10 minutes for discussion). Work-in-progress reports will be allocated 20 minutes (including 5 minutes for discussion). There will be no parallel sessions. Participation is free.
For details, visit the LxGr website: https://sites.edgehill.ac.uk/lxgr

If you have any questions, please contact lxgr(a)edgehill.ac.uk<mailto:lxgr@edgehill.ac.uk>.


PhD Position in project "Relating Probabilities of Words to Probabilities of Worlds" at University of Bamberg
by Sean Papay 09 Feb '26


Dear all,

We are looking to hire a PhD student for a term of three years at the Chair for Fundamentals of Natural Language Processing (https://www.uni-bamberg.de/en/nlproc/) at the University of Bamberg in Germany, starting either May 1, 2026 or as soon as possible thereafter. The position will be part of the project "Relating Probabilities of Words to Probabilities of Worlds (PoWPoW)", to be funded by the German Research Foundation (DFG) as part of the Priority Programme LaSTing (https://www.lasting-spp.org/).

The project’s goal is to investigate how LLMs represent and reason about probabilistic world knowledge. Within this scope, the project will explore methods for eliciting probability judgments from LLMs, test how well such judgments agree with empirical probabilities, investigate the internal consistency of related probability judgments, and compare LLMs’ probabilistic reasoning to that of humans.

The full details can be found here: https://www.uni-bamberg.de/fileadmin/abt-personal/Homepage_ab_2016-03/10_St…

Applicants should have:

* M.Sc. or comparable degree in Computer Science, Computational Linguistics, or a related area.
* Experience working with NLP methods and LLMs.
* Strong grasp of probability theory – experience with graphical models or formal logic a bonus.
* Good knowledge and practical experience with deep learning.
* Very good command of English; German knowledge is not essential.
* Ability to work in a team, excellent communication skills, enthusiasm, and intrinsic motivation.

If you would like to apply, please send your application to Sean Papay <sean.papay(a)uni-bamberg.de>. Your application should be a single .pdf file, comprising a brief motivation letter, your CV, the contact details of two references, and an academic writing sample -- for example, your Master's Thesis or an academic paper previously written by you.

Best regards,
--Sean Papay


2nd CFP-GITT-2026 at EAMT 2026: Gender-Inclusive Translation Technologies
by bsavoldi＠fbk.eu 09 Feb '26


2nd CALL FOR PAPERS

Fourth International Workshop on Gender-Inclusive Translation Technologies (GITT) at EAMT 2026
15 June 2026, Tilburg, The Netherlands
https://sites.google.com/view/gitt2026/
@gitt-workshop.bsky.social

Important Dates (Time zone: Anywhere on Earth)
Submission deadline: 20 April, 2026
Notification of Acceptance: 13 May, 2026
Camera Ready Copy due: 20 May, 2026
Workshop: 15 June, 2026

**Aim and scope**
The Gender-Inclusive Translation Technologies Workshop (GITT) sets out to be the dedicated workshop that focuses on gender-inclusive language in translation and cross-lingual scenarios. The workshop aims to bring together researchers from diverse areas, including industry partners, MT practitioners, and language professionals. GITT aims to encourage multidisciplinary research that develops and interrogates both solutions and challenges for addressing bias and promoting gender inclusivity in MT and translation tools, including LM applications for the translation task.

**Topics**
GITT invites technical as well as non-technical submissions, which consist of experimental, theoretical or methodological contributions. We explicitly welcome interdisciplinary submissions and submissions that focus on innovative, non-binary linguistic strategies and/or with sociolinguistically-informed perspectives.
The topics of interest include, but are not limited to: - Models or methods for assessing and mitigating gender bias - New resources for inclusive language and gender translation (e.g., datasets, translation memories, dictionaries) - Social, cross-lingual, and ethical implications of gender bias - Qualitative and quantitative analyses on the potential limits of current approaches to gender bias in translation and MT, error taxonomies as well as best practices and guidelines - User-centric case studies on the impact of biased language and/or mitigating approaches which can include translators, post-editors, or monolingual MT users GITT is also open to other non-listed topics aligned with the scope of the workshop and works focusing on non-textual modalities (e.g., audiovisual translation) **Submission** We welcome four types of submissions, two archival and two non-archival. ARCHIVAL - Research papers: of at least 4 up to 10 pages (excluding references) - Extended Abstracts: up to 2 pages (including references) Accepted papers and extended abstracts consisting of novel work will be published online as proceedings in the ACL Anthology. NON-ARCHIVAL - Research Communications: up to 2 pages (including references). We include a parallel submission policy in the form of Research Communications for papers related to the topic of GITT that were accepted in other venues in 2025 and 2026. - Potluck Communications: short abstract up to 500 words (including references). Potluck Communications offer a space for anyone—especially students and early career researchers—to discuss bold new ideas for collaboration, brainstorm about ongoing work, and explore future research directions. The communications will not be included in the proceedings, but will serve to promote the dissemination of research aligned with the scope of the workshop. 
All submissions should adhere to the EAMT 2026 guidelines and style templates (PDF, LaTeX, Word) and be uploaded on Easychair ( https://easychair.org/conferences?conf=eamt2026) **Workshop organizers** Manuel Lardelli, University of Padova Janiça Hackenbuchner, University of Ghent Luisa Bentivogli, Fondazione Bruno Kessler Joke Daems, University of Ghent Beatrice Savoldi, Fondazione Bruno Kessler Eleni Gkovedarou, University of Ghent


Call for Contributions: HUMIC – Humans and Machines in Conversation: Linguistic, Social and Relational Perspectives on Conversational AI
by Constantin Orasan 09 Feb '26


HUMIC – Humans and Machines in Conversation: Linguistic, Social and Relational Perspectives on Conversational AI https://www.ias.surrey.ac.uk/event/humic-humans-and-machines-in-conversatio… University of Surrey | In-person Workshop 16th June 2026 As generative AI and large language models reshape how we interact with chatbots, voice assistants and conversational agents, HUMIC focuses on the linguistic, social and relational dimensions of these technologies—areas often overlooked in technical development. HUMIC<https://www.ias.surrey.ac.uk/event/humic-humans-and-machines-in-conversatio…>, led by Dr. Doris Dippold and supported by the Surrey Institute for Advanced Studies, the BAAL Special Interest Group ‘Humans, Machines, Languages’ and the Surrey Institute for People-Centred AI, aims to foster interdisciplinary dialogue and connect academic insights with industry practice. Such insights are vital for developing conversational technologies that are context-aware, socially responsive, and cater for their users’ rapport needs. We invite contributions that explore the complex interplay between humans and machines with reference to these factors. We welcome submissions from researchers working across the disciplines, for example but not limited to linguistics, psychology, sociology, natural language processing, UX research, and conversation design. Submissions may focus on any domain. We particularly welcome submissions from industry, focusing for example on common challenges and practices in designing conversational systems with linguistic, social and relational perspectives in focus. During the workshop, participants will be invited to participate in a collaborative session. The session will encourage the generation of new research ideas and explore how research can respond to industry challenges. Selected works resulting from this workshop will be considered for a potential special issue. 
Keynote Speakers:

* Maaike Groenewege (ConvoCat, Netherlands)<https://www.linkedin.com/in/maaikegroenewege/?originalSubdomain=nl> – Linguist and Conversation Designer
* Bettina Migge (University College Dublin, Ireland)<https://people.ucd.ie/bettinamigge> – Language and AI Technology
* Christian Hildebrand (University of St Gallen, Switzerland)<https://www.ibt.unisg.ch/team/christian-hildebrand/> – AI and Language in Consumer Behaviour

We invite 300-word proposals on topics related to the workshop. Themes of interest include, but are not limited to:

* Linguistic and pragmatic dimensions of human-machine dialogue
* Social and relational dynamics in conversational AI, including rapport-building, empathy and trust
* Designing inclusive and accessible conversational systems that account for the needs of diverse users (linguistic, cultural, neurodiverse)
* Linguistic choices and their role in shaping user expectations, satisfaction, engagement and decision-making
* Evaluation methods and metrics for linguistic, social and relational outcomes in human–machine interaction (qualitative, quantitative, or mixed-methods)
* Model training and fine-tuning strategies for enhancing linguistic, social and relational outcomes in human–machine interaction
* Interdisciplinary and academic-industry collaboration in the development of linguistically, socially and relationally aware conversational technologies

Accepted submissions will be assigned to oral or poster presentation formats according to the mode of presentation best suited to their content.
Submission Details:

* Abstract length: 300 words (excluding title, authors and references)
* Deadline: 16th March 2026
* Notification of acceptance: 27th April 2026
* Submission: HUMIC – Humans and Machines in Conversation<https://forms.office.com/e/gyLXEs9QFi>

ORGANISERS

Dr Doris Dippold<https://www.surrey.ac.uk/people/doris-dippold>, Literature and Languages, University of Surrey
Dr Fabio Fasoli<https://www.surrey.ac.uk/people/fabio-fasoli>, School of Psychology, University of Surrey
Dr Di Fu<https://www.surrey.ac.uk/people/di-fu>, School of Psychology, University of Surrey
Dr Richard Green<https://www.surrey.ac.uk/people/richard-green>, School of Health Sciences, University of Surrey
Assistant Professor Amal Haddad<https://www.ugr.es/personal/amal-haddad-haddad>, University of Granada
Prof Constantin Orasan<https://www.surrey.ac.uk/people/constantin-orasan>, Literature and Languages, University of Surrey
Dr Valentina Pitardi<https://www.surrey.ac.uk/people/valentina-pitardi>, Strategy, Marketing and International Business, University of Surrey

---
Prof Constantin Orăsan
Professor of Language and Translation Technologies
Centre for Translation Studies<https://www.surrey.ac.uk/centre-translation-studies>
Personal page: https://www.surrey.ac.uk/people/constantin-orasan
Office: 06LC03, Phone: +44 (0) 1483 68 4115
Library and Learning Centre, University of Surrey, Guildford, Surrey, GU2 7XH, UK


Third CFP: Workshop on Structured Linguistic Data and Evaluation (SLiDE) at LREC 2026
by Petya Osenova 09 Feb '26


Workshop on Structured Linguistic Data and Evaluation (SLiDE)

A full-day workshop at LREC 2026<https://lrec2026.info/>, 11-16 May 2026, Palma, Mallorca (Spain)
The workshop will be held on May 11, 2026
Webpage: https://www.slide-workshop.org/

Third Call For Papers

In the last ten years, significant advances in deep learning models and the development of Large Language Models (LLMs) have revolutionized the fields of computational linguistics (CL) and natural language processing (NLP). In turn, this has led to a complete re-assessment of the language resources and evaluation practices necessary for training LLMs and analyzing their outputs. In particular, the availability of very large amounts of unstructured data for training foundational models has come into focus, while the value of high-quality structured linguistic data with rich annotations at various levels of linguistic analysis has been downplayed by comparison. However, as CL and NLP practitioners engage further with LLMs and debate their strengths and weaknesses, the importance of high-quality, structured linguistic data has been re-emphasized.

The proposed workshop can be seen as related to the Treebanks and Linguistic Theories (TLT) conference series and the more recent SyntaxFest venue. Over the years, these venues have provided a central forum for high-quality research on treebanks, syntactic theory, syntax-semantics interface, structured meaning representations, and annotated linguistic resources. With record participation in recent years, they demonstrate the vitality and relevance of this line of work. The Workshop on Structured Linguistic Data is conceived as both a continuation of this tradition and an adaptation to the new realities of an LLM-dominated research landscape. The workshop will bring together researchers from these overlapping traditions to advance methods, resources, and practices for integrating structured linguistic data into the LLM era.
Topics of interest include but are not limited to:

Linguistic Data Analyses, Language Resources, and Evaluation
* Grammar processing with NLP and LLM-based tools
* Phonological and morphological analysis and LLM tokenization
* Annotation strategies with LLM-empowered methodologies and tools
* Design principles and annotation schemes for structured linguistic data
* Multi-lingual and cross-lingual settings
* Mapping of structured linguistic data to Linked Open Data resources
* Evaluation informed by language typology
* Language resources for underresourced and endangered languages
* The use of structured linguistic data for NLP applications
* The use of structured linguistic data in acquiring linguistic knowledge
* (Semi-)automatic methods for creating structured linguistic data

Spoken Language Data
* Speech-to-text applications
* Speech generation techniques
* Speech data preparation, curation and evaluation

Multimodality and Situated Dialogue
* Structured multimodal resources: gesture AMR (GAMR), gaze and posture annotation, multimodal dialogue corpora
* Multimodal grounding: linking language with visual, gestural, and action representations
* Structured representations for co-attention and alignment in multiparty dialogue
* Multimodal evaluation resources for LLMs

Pragmatics and Discourse
* Structured data for discourse and dialogue: discourse relation annotation, coherence structures, dialogue acts
* Pragmatic annotation (speech acts, presupposition, implicature, politeness, stance)
* Structured approaches to common ground tracking and Theory of Mind in LLMs

Semantics and Lexical Meaning
* Dependency analysis and semantic parsing
* Annotation beyond syntax: semantics, pragmatics and discourse
* Structured data for lexical semantics: sense inventories, semantic frames, qualia structure, and type-theoretic resources
* Computational semantics resources: Abstract Meaning Representation (AMR), Universal Meaning Representation (UMR), Discourse Representation Structures, Minimal Recursion Semantics (MRS), Type Theory with Records (TTR)
* Distributional and neural-symbolic representations of lexical meaning (e.g., Holographic Reduced Representations (HRR), hyperdimensional computing) for structured LLM grounding
* Aligning vector-based meaning representations with symbolic/typed structures

We invite paper submissions in two distinct tracks:
* regular papers on substantial and original research, including empirical evaluation results, where appropriate – 8 pages excluding references and potential ethics statements;
* short papers on smaller, focused contributions, work in progress, negative results, surveys, or opinion pieces – 4 pages excluding references and potential ethics statements.
Invited speakers
Naiara Perez (University of the Basque Country)
Shira Wein (Amherst College)

Paper Submission and Templates
* Submission follows the LREC 2026 conference instructions, using the Softconf START conference management system accessible through the following link: https://softconf.com/lrec2026/SLiDE/
* Submissions should follow the LREC stylesheet, available on the conference website on the Author’s kit<https://lrec2026.info/authors-kit/> page. Papers must be anonymized to support double-blind reviewing.

Important Dates
February 22, 2026: Paper submission deadline
March 15, 2026: Notification of acceptance
March 25, 2026: Camera-ready papers
May 11, 2026: Workshop at LREC 2026
All deadlines are 11.59 pm UTC -12h (“anywhere on Earth”).

Workshop Organizers
Jan Hajič (Prague University, Czech Republic)
Erhard Hinrichs (Tübingen University, Germany)
Sandra Kübler (Indiana University, USA)
Joakim Nivre (Uppsala University, Sweden)
Petya Osenova (Sofia University, Bulgaria)
James Pustejovsky (Brandeis University, USA)


Second CfP LANLP: Bridging Ibero and Latin American NLP communities Co-located Networking Symposium @ LREC 2026
by GAMALLO OTERO PABLO 09 Feb '26


Second Call for Papers

LANLP: Bridging Ibero and Latin American NLP communities
16 May 2026, Palma de Mallorca, Spain
https://sites.google.com/view/lanlp2026/home
Co-located Networking Symposium @ LREC 2026
https://lrec2026.info/

Description and Goals

We organise a Networking Symposium on Latin American NLP (LANLP), focusing on natural language processing for the diverse languages of the Iberian Peninsula and Latin America. This region includes major world languages (e.g. Spanish (~558M speakers), Portuguese (~267M)) as well as regional and indigenous languages. For example, Latin America alone hosts tens of millions of speakers of Quechua (~10M), Guaraní (>6M), Nahuatl (~2M), Aymara (~2M), among many others. Such languages are highly under-resourced: over 88% of the world’s languages remain largely unsupported by language technologies. This networking event addresses that gap by promoting collaboration on ethically and culturally sensitive resource creation, evaluation, and novel methods for low-resource multilingual NLP in Iberian and Latin American languages and varieties.

Our goal is to bring together communities (SEPLN<http://www.sepln.org/>, CLARIAH-ES<https://www.clariah.es/>, PROPOR<https://propor2024.citius.gal/>, AmericasNLP<https://turing.iimas.unam.mx/americasnlp/index.html>, and SomosNLP<https://somosnlp.org/>) to share cutting-edge research, language resources, and best practices. LANLP focuses on community-driven resource development and evaluation for Iberian languages, and diverse Latin American languages (including indigenous and minority languages). We aim to bridge regional communities: for instance, past forums like OpenCor note that “Latin American and Iberian communities... did not have an established event” to share initiatives, corpora and tools. LANLP fills this gap, fostering new contacts between Iberian and Latin American NLP research groups.
The goals are to (1) highlight challenges in processing these languages, (2) share novel datasets and models, and (3) catalyze future collaborations and shared tasks. We emphasize both academic rigor and community inclusivity, encouraging contributions from established researchers and grassroots language advocates alike.

Topics of Interest

We invite submissions on topics including (but not limited to):

* Language resource creation: Corpora, lexicons, and annotations for Iberian and Latin American languages (text, speech, multimodal).
* LLMs opportunities and challenges: Small Language Models, synthetic data, mitigating biases, linguistic inequalities, data scarcity, language domination.
* Multilingual transfer & modeling: Cross-lingual and multilingual representations, transfer learning, and embedding methods that bridge Spanish, Portuguese, varieties and minority languages.
* Machine translation & generation: MT, summarization, and language generation for Spanish, Portuguese, and low-resource languages (e.g., Quechua, Aymara, Nahuatl).
* Speech and audio processing: ASR, TTS, and spoken language resources for under-resourced languages and regional dialects (e.g. indigenous languages, Brazilian Portuguese, Latin American Spanish).
* Dialectal and code-switching NLP: Identification and handling of dialectal variation and code-switching (e.g. Spanish–Portuguese code-mixing, Spanish–indigenous language contact).
* Morphology and syntax: Analysis and tagging for morphologically rich or under-documented languages (e.g. Basque, Mapudungun, Bribri) using universal dependencies or other frameworks.
* Domain-specific NLP: Social media, sentiment, hate-speech detection, and other tasks in Iberian and Latin American language contexts (e.g. Latin American social media analysis).
* Digital humanities & cultural heritage: NLP for historical texts, literature, and cultural content in Spanish, Portuguese, and regional languages.
* Community-driven methods: Crowdsourcing, citizen science, and participatory approaches for data collection and annotation in these languages.
* Evaluation and benchmarks: Development of evaluation metrics and benchmarks tailored to low-resource Iberian/Latin languages.
* Ethical and social issues: Fairness, bias, and indigenous language rights in NLP; collaboration with native speaker communities; data governance and sustainability of resources.

Important dates

* February 18, 2026: Paper submission deadline
* March 20, 2026: Notification of acceptance
* March 30, 2026: Camera-ready deadline
* May 16, 2026: Networking Symposium Date

Submission Instructions

We invite non-anonymous submissions in English, Spanish or Portuguese on the topics of interest, between 4 and 8 pages of content. The page limit of 8 pages does not include acknowledgements, references, potential Ethics Statements and discussion on Limitations, in line with the policy of the main LREC conference. All submissions must follow the LREC stylesheet (https://lrec2026.info/authors-kit/). Any submissions which are over-length, poorly formatted or make excessive use of appendices to circumvent page limits are liable to desk-rejection.

At the time of submission, authors are offered the opportunity to share related language resources with the community. All repository entries are linked to the LRE Map (https://lremap.elra.info/), which provides metadata for the resource.

Organizing Committee

* Luis Chiruzzo Inco (AmericasNLP, luischir(a)fing.edu.uy<mailto:luischir@fing.edu.uy>)
* Pablo Gamallo (PROPOR, CiTIUS, pablo.gamallo(a)usc.gal<mailto:pablo.gamallo@usc.gal>)
* María Grandury (SomosNLP, EPFL, mariagrandury(a)gmail.com<mailto:mariagrandury@gmail.com>)
* Rafael Muñoz Guillena (SEPLN, CENID, UA, rafael(a)dlsi.ua.es<mailto:rafael@dlsi.ua.es>)
* German Rigau Claramunt (CLARIAH-ES, HiTZ Center, EHU, german.rigau(a)ehu.eus<mailto:german.rigau@ehu.eus>)


Four PhD and four Postdoc positions in Copenhagen, Denmark
by Johannes Bjerva 09 Feb '26


The Natural Language Processing group at Aalborg University Copenhagen (AAU-NLP) is growing fast in 2026, and we’re excited to announce that we are hiring four PhD students (3 years), and four postdocs (1–3 years each), to join us at the beautiful waterfront Copenhagen Campus of Aalborg University.

This new cohort will join our expanding research efforts on Security, Safety and Privacy in AI, with a strong emphasis on NLP and LLMs across diverse cultural and linguistic settings. We’re interested in candidates with solid backgrounds in machine learning, NLP, and security as well as those coming from computational linguistics who want to work at the intersection of language and AI.

Some example areas of interest include:
• NLP/LLM security, adversarial robustness, and backdoor detection
• Memorization and Privacy in LLMs
• Safety, factuality, and trustworthy multilingual language technologies
• Linguistically informed modelling and cross-cultural evaluation
• Interpretability, semantics, and behaviour analysis of LLMs
• Low-resource language processing and multilingual robustness

Within these areas, the projects offer substantial freedom to explore your own ideas - we believe the best research is driven by curiosity and creativity.

Apply here:
* PhDs (deadline 17 February): https://www.vacancies.aau.dk/phd-positions/show-vacancy/vacancyId/884796
* Postdocs (deadline 4 March): https://www.vacancies.aau.dk/scientific-positions/show-vacancy/vacancyId/88…

Interested applicants are encouraged to reach out to the project PI, Johannes Bjerva, at jbjerva(a)cs.aau.dk


Last CFP: *SEM 2026: The 15th Joint Conference on Lexical and Computational Semantics [3 July 2026, Co-located with ACL 2026]
by Nedjma Ousidhoum 09 Feb '26


Dear corpora-list members, Apologies for cross-posting. The 15th Joint Conference on Lexical and Computational Semantics (*SEM2026), will be co-located with ACL 2026 on the 3rd of July. The call for papers can be found here<https://starsem2026.github.io/calls/> and below. Important Dates (All deadlines are 11:59pm UTC-12h, AoE) - Direct submission deadline (long & short papers): Feb 13, 2026 - Notification of acceptance: May 5, 2026 - Camera-ready deadline: May 26, 2026 - Conference date: July 6, 2026 (co-located with ACL 2026) - Following the ACL and ARR policies<https://www.aclweb.org/portal/content/report-acl-committee-anonymity-policy>, there is no anonymity period requirement. Website https://starsem2026.github.io/ Direct submission link https://openreview.net/group?id=STARSEM/2026/Conference Call for Papers *SEM brings together researchers interested in the semantics of natural languages and its computational modelling. The conference embraces a wide range of approaches including data-driven, neural, probabilistic and symbolic; practical applications as well as theoretical contributions are welcome. The long-term goal of *SEM is to provide a forum for NLP researchers working on any aspect of natural language semantics. *SEM invites submissions related to the computational modelling of natural language semantics (understood broadly) and its application. Relevant areas include (but are not limited to) theoretical aspects of computational semantics, empirical and data-driven approaches, resources, evaluation, and applications/tools. *SEM encourages authors to consider ethical aspects of their work, and to address and discuss ethical questions and implications relevant to their research. *SEM also values reproducibility and particularly welcomes submissions that adhere to the reproducibility guidelines as specified here. Please fill out this form<https://forms.gle/634oW3yvtTkur6qL9> if you would like to volunteer as a reviewer or as an Area Chair. 
Questions may be directed to: startsem-2026-pcs(a)googlegroups.com<mailto:startsem-2026-pcs@googlegroups.com> New for *SEM 2026 1. One Day Conference: Unlike past iterations, *Sem 2026 will be a one-day conference. (ACL has informed us that this is due to venue size limitations.) 2. Centering Research Questions: Research questions in *Sem, and NLP generally, can be roughly categorized into those that address: * new findings about language (linguistic phenomena, semantic patterns), * new findings about people (language use, behavior, health, ethics, etc.), * new findings about automatic language processing (advancing language understanding through ML/AI and other approaches). Centering and explicitly articulating the research question helps authors frame and present their contribution more clearly. It also helps reviewers and Area Chairs evaluate the work within the appropriate context. For example, a paper that centers a compelling linguistic or behavioral research question and offers meaningful new insights need not also introduce methodological novelty or rely on the latest models (including LLMs). A simple and interpretable approach may make good sense. To support this, the *SEM 2026 submission form asks authors to explicitly identify the predominant research question type for their work, as well as any additional categories that apply. There are no quotas for accepted papers of different types, and submissions will not receive preferential treatment based on category selection. Including this information also allows *SEM to track the kinds of research questions authors pursue and how the conference’s focus evolves over time. 3. Lasting Impact Modern NLP and ML papers have often been criticized for being overly incremental or becoming obsolete shortly after publication. To encourage work with broader scientific value and longer-term relevance, reviewers of *Sem 2026 will be asked to explicitly assess the potential lasting impact of each submission. 
This assessment will be included as a short written justification and will factor into the overall recommendation. Importantly, a healthy research ecosystem requires diversity in the time horizons of research contributions. Some papers offer immediate practical value; others generate insights or resources whose importance unfolds over years. *SEM 2026 welcomes this full spectrum. Reviewers should evaluate the potential for lasting influence, not only immediate performance gains. Work can have a lasting impact in many ways. See our blog post<https://starsem2026.github.io/blog/> on this. Topics of Interest (non-exhaustive) * Compositional semantics and sentence representations * Statistical, machine learning, and deep learning methods in semantic tasks * Multilingual and cross-lingual semantics * Word sense disambiguation and induction * Sentiment Analysis, Computational Affective Science, Stylistic Analysis, and Argument Mining * Computational Social Science, Digital Humanities, and Cultural Analytics * Semantic parsing, and syntax-semantics interface * Frame semantics and semantic role labeling * Textual inference, textual entailment, and question answering * Formal approaches to semantics * Extraction of events and of causal and temporal relations * Entity linking, pronouns and coreference * Discourse, pragmatics, and dialogue * Machine reading * Abusive language detection, Fact verification and related tasks * Extra-propositional aspects of meaning * Multiword and idiomatic expressions * Metaphor, irony, and humor processing * Knowledge mining and acquisition * Common sense reasoning * Language generation * Multidisciplinary research on semantics * Grounding and multimodal semantics * Psycholinguistics * Interpretability and Explainability * Human semantic processing * Semantic annotation, evaluation, and resources * NLP Applications * Ethical aspects and bias in semantic representations Submission Instructions Submissions must describe unpublished work and be 
written in English. We solicit both long and short papers. Long papers describe original research and may consist of up to eight (8) pages of content, plus unlimited pages for references. Appendices are allowed after the references, but the paper should be self-contained, and reviewers will not be required to check the appendices, if any. Final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers' comments can be taken into account. Short papers describe original focused research and may consist of up to four (4) pages, plus unlimited pages for references. Upon acceptance, short papers will be given five (5) content pages in the proceedings. Authors are encouraged to use this additional page to address reviewers' comments in their final versions. Limitations and Ethics Statement sections are allowed and encouraged but are not mandatory. These sections should be placed after the conclusion and will not count towards the overall page limit. Submissions should follow the ARR formatting requirements<https://github.com/acl-org/acl-style-files>. Submission routes and deadlines *SEM solicits direct submissions (not through ARR). The deadline for direct submissions is Feb 13, 2026, and these submissions will be reviewed by the *SEM 2026 program committee. Submissions are made through OpenReview. Multiple submission policy *SEM does not prohibit the submission of work that is under consideration for another venue at the same time as the *SEM review period. However, authors of such papers will be asked to declare this at submission time. Best regards, The *SEM 2026 Program Chairs.


Second call for participation: UMR parsing shared task
by Dan Zeman 08 Feb '26

08 Feb '26

Second Call for Participation: Shared Task in Parsing into Uniform Meaning Representation (UMR) https://ufal.mff.cuni.cz/umr-parsing Uniform Meaning Representation <https://umr4nlp.github.io/web/> (UMR) is designed as a typologically-aware successor of AMR, with various modifications and additions, including explicit token-node alignment and document-level relations. Parsers previously proposed for AMR are thus not directly applicable to UMR, although they may serve as a baseline for development of UMR parsers. The few attempts at UMR parsing published so far focused mostly on English. The first shared task in UMR parsing aims to perform evaluation of parsing systems on multiple languages and to assess the current state of the art. The shared task is co-located with the DMR 2026 workshop <https://dmr2026.github.io/> (and with LREC), to be held on 11 May 2026 in Palma de Mallorca, Spain. The shared task is now in its development phase, with training data available for six languages (Arapaho, Chinese, Czech, English, Latin, Navajo). The test phase is scheduled to run between 16 and 27 February 2026. Registration is open until the end of the test phase. For more details, see the shared task homepage at https://ufal.mff.cuni.cz/umr-parsing. Dan Zeman, Jan Štěpánek & the UMR community

