First Call for Papers
13th Web-as-Corpus (WaC-13) Workshop @EMNLP2026, Budapest, Hungary, 24-29 Oct, 2026
https://wacky-workshop.github.io/
The World Wide Web has evolved from a resource for building linguistic corpora into the central data infrastructure powering modern natural language processing and Large Language Models (LLMs). As web-scale data increasingly shapes AI systems’ knowledge and capabilities, understanding its quality, representativeness, and ethical implications has become critical.
At the same time, the “more is better” paradigm is being challenged by issues such as machine-generated content, data toxicity, limited metadata, and the under-representation of many languages and domains. These challenges call for a shift toward Data-Centric AI, focusing on the curation, analysis, and responsible use of web-derived data.
The 13th Web-as-Corpus (WaC-13) workshop provides a multidisciplinary forum for research addressing the full lifecycle of web data. We invite submissions on methods, resources, and applications related to web corpora, with special emphasis on multilingual data and less-resourced languages.
Topics of interest include (but are not limited to):
* Creation and evaluation of high-quality datasets for foundation models (e.g., data collection, filtering, enrichment, language identification)
* Use of web data in empirical linguistic research
* Analysis of web-scale corpora for quality, representativeness, and societal insights
* Ethical and legal aspects of collecting, sharing, and using web data
By bringing together researchers from NLP, linguistics, and the social sciences, WaC aims to advance best practices for one of the field’s most influential data sources.
Important dates:
Direct paper submission deadline: 7 August, 2026
Pre-reviewed ARR commitment deadline: 1 September, 2026
Notification of acceptance: 5 September, 2026
Camera-ready paper due: 20 September, 2026
Conference dates: 24-29 Oct, 2026
Submissions:
Submissions will be possible through ARR commitment and through openreview.net (more details to follow on https://wacky-workshop.github.io/).
Workshop Organizers:
Nikola Ljubešić, Jožef Stefan Institute, Slovenia
Yves Scherrer, University of Oslo, Norway
Laurie Burchell, Common Crawl
Veronika Laippala, University of Turku, Finland
Pedro Ortiz Suarez, Common Crawl
Jen English, Common Crawl
Vuk Dinić, Jožef Stefan Institute, Slovenia
Dear colleagues on the Corpora list,
Good afternoon.
Today, May 5, *World Portuguese Language Day* is being celebrated.
We are joining in this celebration by expanding our family of AI agents Evaristo.ai <https://evaristo.ai/>.
Nurtured within the *PORTULAN <https://portulanclarin.net/>* node of CLARIN Research Infrastructure, and developed in partnership with the *Academy of Sciences of Lisbon* and its grand reference dictionary <https://dicionario.acad-ciencias.pt/>, we are launching the new chatbot *Evaristo.ai – Linguagem clara* <https://evaristo.ai/linguagem-clara/>, which specialises in rewriting texts into plain language in accordance with the international standard ISO 24495-1.
This chatbot joins other chatbots we launched in recent months:
Evaristo.ai - Generalist <https://evaristo.ai/>
A generalist chatbot, powered by the knowledge in Gervásio 70B <https://huggingface.co/PORTULAN>, the largest open-source LLM specialising in Portuguese
(June 2025)
Evaristo.ai - Public services <https://evaristo.ai/servicos-publicos/>
A chatbot specialising in Portuguese public administration services, based on the gov.pt <https://www.gov.pt/> official website
(November 2025)
We invite you to try it out. Your feedback is most welcome.
Kind regards,
António Branco
Summer Schools: CDHSummer2026 (4th Corpus and Digital Humanities Summer School 2026) (China)
Host Institution: Nanjing Normal University, Nanjing, China
Coordinating Institution: Faculty of Arts and Humanities, University of Macau; School of Humanities and Social Science, The Hong Kong University of Science and Technology; Beijing Normal-Hong Kong Baptist University; College of Information Management, Nanjing Agricultural University
Dates: 25-Jul-2026 - 04-Aug-2026
Location: Nanjing, China
Minimum Education Level: University students or teachers in linguistics or history
Special Qualifications: University students or teachers in linguistics or history with little knowledge of computer science.
APPLICATIONS AND SUBMISSION GUIDELINES
SUBMISSION WINDOW: May 5 - May 12, 2026 (Beijing Time)
The official language will be Chinese.
Applicants are required to complete the official application form and submit their Curriculum Vitae (CV), research background, learning objectives, and academic recommendation letters.
The organizing committee will conduct a merit-based selection process. Admission results will be announced by June 1, 2026, via email. Admitted participants are required to sign a letter of commitment. Once successfully enrolled, withdrawing from the program or switching tracks is not permitted without valid, exceptional reasons.
PARTICIPANT CAPACITY AND GRADUATION CRITERIA
- CAPACITY: Each track is strictly limited to 40 on-site and 40 online participants. The total program capacity is 240 participants.
- GRADUATION ASSESSMENT: Each participant must complete an independent research project by the end of the program, such as a humanities database, a statistical analysis report, or a prototype LLM tool.
- CERTIFICATION: Participants who successfully pass the final assessment will receive an official Certificate of Completion. Outstanding projects will be awarded an Excellence Certificate.
Focus: Driven by the rapid expansion of large-scale data ecosystems and Large Language Models (LLMs), research across the humanities and social sciences is undergoing a significant transformation. Traditional disciplines, including linguistics, literature, history, and philology, are increasingly adopting computational technologies to develop innovative, data-driven methodologies.
Central to this methodological shift is the development of reliable data infrastructure built upon well-annotated corpora. Furthermore, adapting and deploying artificial intelligence — particularly fine-tuned LLMs — for humanistic inquiry has become an essential skill for the next generation of scholars. To advance interdisciplinary research, empower scholars with both humanistic depth and computational expertise, and foster global academic exchange, Nanjing Normal University is proud to launch this intensive summer program in partnership with our esteemed co-hosts.
This program is dedicated mainly (but not exclusively) to undergraduates, postgraduates, and young researchers specializing in Digital Humanities, Computational Linguistics, Chinese Language and Literature, History, Philology, and related disciplines. Scholars focusing on the intersection of the humanities with large-scale datasets and LLMs are particularly welcome.
Description:
SUMMER SCHOOL ACTIVITIES AND CURRICULUM
The curriculum is designed to provide a deep dive into four core modules: Digital Humanities Theory, Cutting-edge Technologies, Corpora Construction and Standards, and Quantitative Statistical Methods.
1. PARALLEL TRAINING WORKSHOPS
Applicants must choose exactly ONE of the three parallel tracks. Each track consists of eight systematic lectures and hands-on coding practices, supported by 4 dedicated teaching assistants.
- TRACK A: DATABASE PROGRAMMING WORKSHOP - Instructor: Prof. Bin Li (Nanjing Normal University, China). Oriented toward beginners with no prior programming experience. Utilizing MySQL and PHP as the core development platform, the workshop uses classical texts, such as the "Complete Tang Poems", to teach data structuring, SQL querying, and interactive website development.
- TRACK B: LINGUISTIC STATISTICAL METHODS WORKSHOP - Instructor: Prof. Wei Shen (Central China Normal University, China). Focusing on quantitative analysis for linguistic and textual corpora, this track covers foundational statistics using SPSS software. Topics include parametric and non-parametric tests, clustering, correlation, chi-square tests, and regression models.
- TRACK C: PYTHON LARGE LANGUAGE MODEL PROGRAMMING WORKSHOP - Instructors: Prof. Dongbo Wang and Prof. Liu Liu (Nanjing Agricultural University, China). Designed for participants with foundational Python knowledge, this advanced track connects ancient texts with artificial intelligence. Using the ancient text LLM “Xunzi” as a case study, it covers prompt engineering, instruction fine-tuning, intelligent agent development, and LLM deployment in humanities contexts.
2. SUPPORTING ACADEMIC AND PRACTICAL EVENTS
- EXPERT LECTURE SERIES: 20 top-tier international and domestic scholars will deliver 20 premium academic lectures on cutting-edge algorithms and humanities research methodologies.
- THEMATIC ROUNDTABLE FORUMS: Specialized panels will host in-depth dialogues on "Opportunities and Challenges for Humanities in the LLM Era" and "The Future Trajectory of Linguistics and Digital Humanities".
- CULTURAL EXCURSION AND SEMINARS: Digital humanities field trips in the historical city of Nanjing, complemented by academic networking seminar sessions.
Linguistic Field(s): Applied Linguistics
Computational Linguistics
Text/Corpus Linguistics
Registration: 05-May-2026 to 12-May-2026
Apply by Email: dhbase2026(a)126.com
Apply on the web: https://v.wjx.cn/vm/rwUUUJ7.aspx
Registration Instructions:
The lectures and courses will be given in Chinese. The courses are free of charge. All travel, accommodation and catering expenses shall be self-funded.
Dear colleagues,
In the context of the AWASES project (AI-Aware Pathways to Sustainable Semiconductor Process and Manufacturing Technologies), we are organizing a two-day workshop titled “AI-Driven Semiconductor Discovery”, taking place on May 21–22, 2026 in Hannover, Germany. More information is available here: https://sites.google.com/view/aisem/home
We would like to extend an invitation via our call for posters and welcome submissions from your groups, particularly on AI for science topics. The call for posters can be found here: https://sites.google.com/view/aisem/call-for-posters
We welcome a wide range of contribution types, including original research, early-stage ideas, system demos, use cases, vision papers, and mature preprints. In the spirit of collaborative learning, we especially encourage submissions from early-career researchers and welcome multiple contributions. Participation is open to all career stages, and we look forward to an engaging exchange at the workshop.
Please feel free to share this call within your networks. If you have any questions about the workshop or venue, I would be happy to provide further information.
Best regards,
Jennifer D'Souza
(on behalf of the AISem'26 organizing team)
[Apologies for cross-posting]
There is a vacancy for a postdoctoral position at the Centre for Norwegian Professional Language, at the University of Bergen, Norway (https://www4.uib.no/en/research/research-centres/centre-for-norwegian-profe…). We are seeking a highly motivated candidate for a 3-year postdoctoral researcher position focused on the development of resources and models, including post-training of large language models.
The position offers flexibility for the candidate to develop and shape their own research directions. The work will be in close collaboration with the work package on terminology and professional language in particular, and the centre as a whole.
The position is affiliated with the Department of Linguistic, Literary, and Aesthetic Studies at the University of Bergen.
Check the full announcement and application details here: https://www.jobbnorge.no/en/available-jobs/job/300731/postdoctoral-fellow-i…
Closing date: June 1st, 2026
If you have any questions or would like additional information, feel free to contact me.
Kind regards,
Samia
---
Samia Touileb
Associate Professor in Natural Language Processing
Department of Information Science and Media Studies, University of Bergen
MediaFutures: Research Center for Responsible Media Technology & Innovation
Fagspråksenteret: Centre for Norwegian Professional Language
*** With apologies for multiple postings ***
Tenure track assistant professorship in Language Technology and Natural Language Processing, Centre for Language Technology, Department of Nordic Studies and Linguistics, UCPH
The Department of Nordic Studies and Linguistics, Faculty of Humanities, University of Copenhagen (UCPH), Denmark, invites applications for a position as tenure track assistant professor in language technology and natural language processing to be filled by the 1st of November 2026 or as soon as possible thereafter.
The successful candidate will be attached to the Centre for Language Technology (CST), which is one of the research centres of the department. CST conducts research in different areas of interest for language technology, such as Natural Language Processing, construction of NLP resources and benchmarks, Computational Cognitive Modeling and Multimodality, NLP infrastructure and policy, Representation Learning for NLP and Digital Humanities, among others, see also: https://cst.ku.dk/english/.
The Centre has a strong international profile while also pursuing the development of language technology methods and resources for the Danish language. The Centre has considerable experience managing international research projects, frequently attracts visiting researchers and has organised major conferences in the field. Together with the Department of Computer Science of the University of Copenhagen, it offers an international MSc programme in ‘IT and Cognition’. The programme, which currently admits about 30 students a year, consists of a range of courses in the areas of Natural Language Processing and Cognitive Science. In addition, we teach in several BA programmes and electives, such as the BA for Danish-speaking students ‘Kognitions- og Datavidenskab’ (Cognition and Data Sciences) hosted at the Department of Psychology, and the BA elective package ‘AI, Programmering og Sprogteknologi’ (AI, Programming, and Language Technology) for humanities students.
The successful candidate will contribute to one or more of the Centre’s research areas and will teach courses belonging to the above-mentioned study programs. We expect the candidates to have a research profile with a strong publication record from relevant, high impact conferences and journals.
Note that the position comes with an opportunity to negotiate a start package that will help leverage the candidate’s scientific deliverables in the initial years of employment. For more information on the position, and on how to apply for it, please see: https://jobportal.ku.dk/tenure-track/?show=160799
The closing date for applications is 23:59 CEST, 10 June 2026.
***************
Patrizia Paggio
Associate Professor
University of Copenhagen
Centre for Language Technology
paggio(a)hum.ku.dk
Professor (retired)
University of Malta
Institute of Linguistics and Language Technology
patrizia.paggio(a)um.edu.mt
25th EDITION OF THE SEPLN AWARD TO THE BEST DOCTORAL THESIS IN NATURAL LANGUAGE PROCESSING
[Submission deadline: May 29th, 2026]
The Spanish Society for Natural Language Processing announces the 25th Edition of the SEPLN Award for the Best Doctoral Thesis in Natural Language Processing, which will be governed by the following rules:
The purpose of this award is the promotion and dissemination of research in the field of natural language processing.
The winning thesis will be awarded a compact laptop (tablet) and a 300€ grant to help cover the cost of attending the conference. The award will be presented at the 42nd International Congress of the Spanish Society for Natural Language Processing (SEPLN 2026), after a brief presentation of the award-winning work by the author.
In order to compete, the author of the doctoral thesis must be a member of the SEPLN at the time of submitting the work. No contestant may participate as an author in more than one work.
Doctoral theses defended during the year 2025, written in a language of the Spanish State or in English, may be submitted to the competition.
In addition to the complete thesis, it is essential to send:
a 4-page summary of the thesis, clearly describing the topic and the relevance of the research, the objectives, methods, results achieved and contributions.
a brief description of the scientific career of the author of the thesis, detailing participation in scientific activities such as the organization of competitive tasks and congresses, the generation of open access resources such as datasets and language models, and participation in projects, contracts, and/or patents.
The quality of the presentation, the technical and methodological correctness, the relevance, originality, the generation, evaluation and publication of resources, as well as the research trajectory during the pre-doctoral period will be the criteria used for the award of the prize by the jury.
The works will be submitted through the website of the Society's journal (http://journal.sepln.org) in PDF format before May 29th, 2026.
The final decision will be communicated during the 42nd International Congress of the Spanish Society for Natural Language Processing (SEPLN 2026).
Submission instructions (here)
For more information, contact aitziber.atucha(a)ehu.eus
<Apologies for cross-postings>
--------------------------------------------------------------
*CALL FOR PARTICIPATION - Final Phase Started *
--------------------------------------------------------------
MIRROR@IberLEF2026: Motivational Interviewing Response & Rating via
Synthetic cOnversational tuRns
Challenge platform: https://www.mirror-iberlef.lat/
The final phase of MIRROR has now started. Make your submissions through
the platform to be considered for the final ranking.
-------------------------------
****Task description****
-------------------------------
We invite the community to develop Generative AI (GenAI) methods for
creating synthetic conversation turns that can substantially improve the
performance of models trained to recognize behavior codes (BCs) in the
context of motivational interviews. A BC is a discrete, observable
clinician action (e.g., asking a question, giving information) that is
counted during coding of a motivational interviewing session to quantify
specific techniques used. These codes allow raters to tally how often
particular clinician behaviours occur, which helps assess adherence to
MI-consistent versus MI-inconsistent practice. Our ultimate goal is to
generate valuable data for training models for the automatic assessment
of clinicians’ motivational-interviewing skills. These skills — crucial
for promoting behavior change among patients — can be evaluated by using
the “Motivational Interviewing Treatment Integrity (MITI)” rubric
(https://tinyurl.com/38byjrwy).

*This is a data-centric competition:* participants are expected to
produce high-quality datasets representing a wide range of clinical
conversations (rather than training a model) to enhance the performance
of a frozen baseline model used for BC classification. We encourage
participants to include samples featuring clients from diverse
backgrounds, varied conversation topics, and different types of health
professionals.
Participants in this competition should provide three datasets (one per
pair of considered BCs) of at most 100 labeled conversation turns that
will be used to fine-tune pretrained models; the fine-tuned models will
then be used to make predictions for a hold-out dataset. The performance
of the fine-tuned model will be used as the leading evaluation metric to
rank participants. The considered pairs of BCs are:
(1) Simple reflection vs. Complex reflection;
(2) Open question vs. Closed question;
(3) Persuasion vs. Giving Information.
Sample submissions, and detailed instructions on the formatting,
evaluation criteria and competition platform will be available at the
MIRROR website.
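
As an informal illustration of the constraints stated above (three datasets, one per BC pair, each with at most 100 labeled turns), the following Python sketch checks a candidate submission before upload. It is only a sketch under assumptions: the dictionary layout, the pair keys, and the label strings are hypothetical and are not the official submission format, which is defined on the MIRROR website.

```python
# Hypothetical pair keys and label strings; the official identifiers and
# file format are defined on the MIRROR website, not here.
BC_PAIRS = {
    "reflection": ("simple_reflection", "complex_reflection"),
    "question": ("open_question", "closed_question"),
    "persuasion_info": ("persuasion", "giving_information"),
}
MAX_TURNS = 100  # "at most 100 labeled conversation turns" per dataset


def check_dataset(pair_name, turns):
    """Return a list of problems found in one dataset (empty if none)."""
    problems = []
    allowed = BC_PAIRS[pair_name]
    if len(turns) > MAX_TURNS:
        problems.append(f"{pair_name}: {len(turns)} turns exceeds {MAX_TURNS}")
    for i, (text, label) in enumerate(turns):
        if not text.strip():
            problems.append(f"{pair_name}[{i}]: empty turn text")
        if label not in allowed:
            problems.append(f"{pair_name}[{i}]: unexpected label {label!r}")
    return problems


def check_submission(submission):
    """Check that all three datasets are present and individually valid."""
    problems = []
    for pair_name in BC_PAIRS:
        if pair_name not in submission:
            problems.append(f"missing dataset for pair {pair_name}")
            continue
        problems.extend(check_dataset(pair_name, submission[pair_name]))
    return problems


# Toy submission: each dataset is a list of (turn text, label) tuples.
toy = {
    "reflection": [("It sounds like mornings are hard for you.", "simple_reflection")],
    "question": [("What would you like to change?", "open_question")],
    "persuasion_info": [("You really should quit.", "persuasion")],
}
print(check_submission(toy))  # prints []
```

A check of this kind only guards against size and labeling mistakes; it says nothing about how much a dataset improves the frozen baseline model, which is what the ranking measures.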
-------------------------------
****Important dates****
-------------------------------
* Mar 9th: Start of the development phase (platform starts receiving
submissions for the validation set)
* May 1st: Start of the final phase (platform starts receiving
submissions for the test set)
* May 11th: End of evaluation campaign (deadline for submission of runs)
* May 22nd: Publication of official results
* Jun 8th: Deadline for paper submission
* Jun 23rd: Acceptance notification
* Jun 30th: Camera-ready submission deadline
* Sep, TBD: Publication of proceedings
* Sep, TBD: Workshop with SEPLN 2026
-------------------------------
****Organizing team****
-------------------------------
* Luis J. Arellano, INAOE, Mexico
* Carlos Olachea, INAOE, Mexico
* John Piette, University of Michigan, USA
* Hugo Jair Escalante, INAOE, Mexico
* Delia Irazú Hernández, INAOE, Mexico
* Luis Villaseñor, INAOE, Mexico
* Manuel Montes, INAOE, Mexico
Contact: Hugo Jair Escalante (hugo.jair(a)gmail.com)
Call for Participation
Workshop on Learning Non-Literal Expressions with Small Data (NLE 2026)
To be held in conjunction with LREC 2026 on 11 May 2026,
https://www.elra.info/lrec2026
Conference venue: Palau de Congressos de Palma, Palma de Mallorca
(Spain)
Website: https://sites.google.com/view/nle2026/home
Overview
Non-Literal Expressions (NLEs) in natural language are a reflection of
fundamental cognitive processes such as analogical reasoning and
categorisation, and are deeply rooted in everyday communication. NLE
understanding is therefore an essential task for language modeling. This
task is especially challenging because it cannot be tackled by falling
back on individual word meanings, but requires taking into account
larger chunks of surrounding text or even contextual information. At the
same time, it is important because the reliable processing of NLEs is
relevant for optimizing downstream tasks like translation and
summarization.
This workshop focuses on understanding of Non-Literal Expressions. While
most of the earlier work on NLEs had been devoted to metaphor and
metonymy, recent activities target other forms of NLEs as well, e.g.,
hyperbole (deliberate exaggeration), litotes (understatement),
rhetorical questions, and irony. Human-annotated corpora for NLEs have
very recently started becoming available to the research community and
may serve as the basis for data-driven approaches to NLE processing,
with the interrelated goals of first identifying and then interpreting
such expressions. Such data is mostly of high linguistic quality, but
still very limited in size. Thus, the workshop's focus is on adaptation
of Language Models (LMs) and Deep Learning (DL) for processing of
Non-Literal Expressions with limited high-quality data, since such
constructs still pose big identification and processing challenges in
natural language analysis tasks.
The workshop focuses on techniques like self-training for leveraging
unlabelled data, on work that incorporates external linguistic resources
and knowledge injection to enrich features, and on research that uses
multitask learning to benefit from related tasks. The workshop
highlights the necessity of high-quality data, as well as cross-lingual
datasets.
Invited Speaker
- Debanjan Ghosh, Princeton, USA
Workshop Program
Monday, May 11, 2026
9:00–13:00 Learning Non-Literal Expressions with Small Data
Room: 4
Chair: Valia Kordoni
9:00–9:10 Introduction
Oral Session 1
9:10–9:50 Challenges in Japanese Euphemism Classification: An Analysis
of Pretrained Japanese and Multilingual Models
Noriko Takahashi, Whitney Poh, Libby Barak, Jing Peng and Anna Feldman
9:50–10:10 Steering Pragmatic Interpretation in LLMs: A Diagnostic
Evaluation of Few-Shot and Reasoning-Based Prompting for Indirect Speech Acts
Massimiliano Orsini and Dominique Brunato
10:10–10:30 Injecting Structured Lexicographic Knowledge into LLMs for
Non-Literal Expression Disambiguation: A Controlled Study on Croatian
Slobodan Beliga, Ivana Filipovic Petrović and Ana Meštrović
10:30–11:00 Coffee break
11:00–11:40 Poster session
- Metaphor Identification in Spanish Oncological Discourse: The Role of
Explicit Meaning in Low-Resource Settings
Lucia Pitarch, Jordi Bernad and Gemma Bel-Enguix
- Exploring Detection of Complex, Non-Literal Expressions of Cultural
Motifs
Ibrahim H. Alyami and Mark A. Finlayson
- Artful Writing, Authentic Emotions: Distinguishing Human-Written from
LLM-Generated Metaphors by Annotation and Classification
Michaela Regneri, Nooshin Aghajari and Thomas Kroedel
- Creation and Validation of a Monolingual Spanish NLI Dataset for
Metaphor Interpretation via Model-in-the-Loop
Alec Sanchez-Montero, Gemma Bel-Enguix and Sergio Luis Ojeda Trueba
- A Hybrid Architecture for Metonymy Detection in Marathi
Pratibha Dongare
- Contextualising (Im)plausible Events Triggers Figurative Language
Annerose Eichel, Tonmoy Rakshit and Sabine Schulte im Walde
Oral Session 2
11:40–12:00 A Novel Dataset and Three Ways to Approach Automatic
Metaphor Detection in German Religious Online Forums
Sebastian Reimann and Tatjana Scheffler
12:00–12:20 Decomposing Creativity: Two Small Datasets Combining
Originality Ratings and Metaphor Annotations
Emilie Sitter, Sina Zarrieß, Omar Momen and Berenike Herrmann
Invited Talk
12:20–13:00 Unveiling Reasoning in Small Language Models: Insights into
Literal and Non-Literal Understanding
Debanjan Ghosh
Endorsements
The workshop is endorsed by: Collaborative Research Centre 1412
"REGISTER" funded by the DFG Deutsche Forschungsgemeinschaft (German
Research Foundation)
Programme Committee
- Beata Beigman Klebanov, ETS, USA
- Maria Berger, Ruhr-Universität Bochum, Germany
- Yuri Bizzoni, Aarhus University, Denmark
- Kenneth Church, VecML Inc., USA
- Stefanie Dipper, Ruhr-Universität Bochum, Germany
- Markus Egg, Humboldt-Universität zu Berlin, Germany
- Anna Feldman, Montclair State University, USA
- Debanjan Ghosh, Princeton, USA
- Valia Kordoni, Humboldt-Universität zu Berlin, Germany
- Emmy Liu, CMU, USA
- Petya Osenova, Sofia University "St. Kl. Ohridski", Bulgaria
- Sebastian Padó, IMS Stuttgart, Germany
- Gudrun Reijnierse, Vrije Universiteit Amsterdam, The Netherlands
- Sebastian Reimann, Ruhr-Universität Bochum, Germany
- Adam Roussel, Ruhr-Universität Bochum, Germany
- Tatjana Scheffler, Ruhr-Universität Bochum, Germany
- Sabine Schulte im Walde, Universität Stuttgart, Germany
- Vered Shwartz, The University of British Columbia, Canada
- Caroline Sporleder, Georg-August-Universität Göttingen, Germany
- Egon Stemle, EURAC, Italy
Organizers
• Markus Egg - Humboldt-Universität zu Berlin, Germany
• Valia Kordoni - Humboldt-Universität zu Berlin, Germany
Contact: kordonie at rz.hu-berlin.de