Workshop on Learning Non-Literal Expressions with Small Data
To be held in conjunction with LREC 2026 in Palma de Mallorca, Spain, on 11 May 2026.
Overview
Non-Literal Expressions (NLEs) in natural language reflect fundamental cognitive processes such as analogical reasoning and categorisation, and are deeply rooted in everyday communication. Understanding NLEs is therefore an essential task for language modeling. The task is especially challenging because it cannot be solved by falling back on individual word meanings; it requires taking into account larger chunks of surrounding text or even contextual information. At the same time, it is important because reliable processing of NLEs improves downstream tasks such as translation and summarization.
This workshop focuses on the understanding of Non-Literal Expressions. While most earlier work on NLEs was devoted to metaphor and metonymy, recent activities target other forms of NLEs as well, e.g., hyperbole (deliberate exaggeration), litotes (understatement), rhetorical questions, and irony. Human-annotated corpora for NLEs have only recently started to become available to the research community and may serve as the basis for data-driven approaches to NLE processing, with the interrelated goals of first identifying and then interpreting such expressions. Such data is mostly of high linguistic quality, but still very limited in size. The workshop therefore focuses on adapting Language Models (LMs) and Deep Learning (DL) to the processing of Non-Literal Expressions with limited high-quality data, since such constructs still pose major identification and processing challenges in natural language analysis tasks.
Topics of Interest
We are interested in contributions on techniques such as self-training for leveraging unlabelled data, on the incorporation of external linguistic resources and knowledge injection to enrich features, and on multitask learning that aims to benefit from related tasks.
The workshop also welcomes discussion of alternative approaches that use pre-trained Language Models (LMs) as a foundation and apply techniques such as contrastive learning and clustering to identify challenging examples within the data. The ultimate aim of the workshop is to highlight the necessity of high-quality data, as well as of cross-lingual datasets.
Invited Speakers
- Prof. Barbara Plank, LMU Munich (https://bplank.github.io/)
- Dr. Debanjan Ghosh, Princeton, USA
Details will be announced on the workshop website (tba).
Submission Guidelines
Papers must be submitted electronically through Softconf: [link to come]. Submissions should:
• Be 4–8 pages, excluding references and optional Ethics Statements
• Follow the LREC 2026 style guidelines, available on the conference website: https://lrec2026.info/authors-kit/
• Use templates provided here: https://lrec2026.info/calls/second-call-for-papers/
Authors will be asked to supply information on any language resources (broadly defined — data, tools, standards, evaluation sets, etc.) used in or resulting from their work. ELRA strongly encourages sharing such resources to support reproducibility and reuse.
Accepted papers will appear in the workshop proceedings. Presentation format (oral/poster) will be based solely on how best to communicate the work.
Important Dates
• 20 February 2026: Submission Deadline
• 11 March 2026: Notification of Acceptance
• 28 March 2026: Camera-ready Papers Due
Endorsements
The workshop is endorsed by the Collaborative Research Centre 1412 "REGISTER", funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation).
Organizers
• Markus Egg, Humboldt-Universität zu Berlin, Germany
• Valia Kordoni, Humboldt-Universität zu Berlin, Germany
Contact: kordonie at rz.hu-berlin.de