The 2nd LLMs4Subjects Shared Task: LLM-based Subject Tagging for the TIB Technical Library's Digital Catalog
Theme: The Development of Energy- and Compute-Efficient LLM Systems
Organized as part of the German Evaluation (GermEval 2025) Shared Task Series
10. - 12. September, 2025
Hildesheim, Germany
(co-located with KONVENS 2025 - Conference on Natural Language Processing)
2nd LLMs4Subjects Shared Task: https://sites.google.com/view/llms4subjects-germeval/
KONVENS 2025: https://konvens-2025.hs-hannover.de/about/
Task Overview
LLMs4Subjects challenges the research community to develop cutting-edge LLM-based solutions for subject tagging of technical records from Leibniz University's Technical Library (TIBKAT). Participants are tasked with leveraging large language models (LLMs) to tag technical records using the GND taxonomy. The task involves bilingual language modeling, as systems must process technical documents in both German and English. Successful solutions may be integrated into the operational workflows of TIB, the Leibniz Information Centre for Science and Technology.
With the rapid advancements in LLMs, the focus is shifting toward making these models more energy- and compute-efficient while maintaining high performance. Recent innovations, such as the DeepSeek series, have demonstrated how techniques like mixture-of-experts (MoE) and model distillation can significantly reduce computational costs without sacrificing effectiveness.
The 2nd LLMs4Subjects shared task highlights the importance of efficiency in LLMs, encouraging participants to explore strategies that enhance model performance while optimizing for energy consumption and inference speed. We welcome approaches including, but not limited to, those that leverage model compression, quantization, efficient fine-tuning, and adaptive computation techniques to push the boundaries of sustainable AI development.
Subtasks
The 2nd LLMs4Subjects shared task organizes the following two subtasks:
Subtask 1 - Multi-Domain Classification of Library Records
Subtask 2 - Large-scale Multilabel Subject Indexing of Library Records
Important Dates
* Release of training data: March 8, 2025
* Release of testing data: May 23, 2025
* Deadline for system submissions: June 2, 2025
* Evaluation end: June 27, 2025
* Paper submission deadline: July 7, 2025
* Notification of acceptance: June 28, 2025
* Camera-ready paper due: August 15, 2025
* Workshop/KONVENS: September 10 - 12, 2025 (TBA)
A PhD position at the University of Groningen, the Netherlands:
- Fully-funded 4-year position
- Research focus: using computational models (including small probabilistic
models and neural network language models) to study the acquisition of
modal verbs
- Programming skills and background in linguistics / language acquisition
required
- Supervisors: Annemarie van Dooren, Yevgen Matusevych, Arianna Bisazza
- Application deadline: 24 April 2025
- More details and application:
https://www.rug.nl/about-ug/work-with-us/job-opportunities/?details=00347-0…
--
Yevgen Matusevych
Assistant Professor
Computational Linguistics, University of Groningen
https://yevgen.web.rug.nl
ClimateCheck: Shared Task on Scientific Fact-Checking of Social Media Claims on Climate Change
Hosted as part of the SDP 2025 Workshop at ACL 2025
31 July 2025
Vienna, Austria
ClimateCheck Shared Task: https://sdproc.org/2025/climatecheck.html
Competition on Codabench: https://www.codabench.org/competitions/6639/
Shared Task Overview
Social media facilitates discussions on critical issues such as climate change, but it also contributes to the rapid dissemination of misinformation, which complicates efforts to maintain an informed public and create evidence-based policies. In this shared task, we emphasise the need to link public discourse to peer-reviewed scholarly articles by gathering claims from social media about climate change (both real-life and automatically generated ones) as well as a corpus of about 400,000 abstracts of publications from climate science research. The participants will be asked to retrieve relevant abstracts for each claim (subtask I) and classify the relation between the claim and abstract as ‘supports’, ‘refutes’, or ‘not enough information’ (subtask II). Participants are allowed to take part either in subtask I only, or in both subtasks.
Subtask I: Abstracts Retrieval
Task: given a claim from social media about climate change and a corpus of abstracts, retrieve the top 10 most relevant abstracts.
Evaluation: Recall@K (K=2, 5, 10) and B-Pref based on annotated gold data; additional unjudged documents will be reviewed by annotators.
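As an illustration of the Recall@K part of this evaluation, here is a minimal sketch in Python. It assumes each claim comes with a set of gold-relevant abstract IDs and that a system returns a ranked list of IDs. The function name `recall_at_k` and all IDs below are hypothetical, and the official scoring script may differ in details. B-Pref is not shown.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the gold-relevant abstracts that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


# Hypothetical ranked output for one claim (abstract IDs are made up).
ranked = ["a7", "a2", "a9", "a4", "a1", "a3", "a8", "a5", "a6", "a0"]
# Hypothetical gold annotations for the same claim.
gold = {"a2", "a4", "a5"}

scores = {k: recall_at_k(ranked, gold, k) for k in (2, 5, 10)}
for k, score in scores.items():
    print(f"Recall@{k} = {score:.3f}")
```

In practice the per-claim scores would be averaged over all claims in the test set.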
Subtask II: Claim Verification
Task: given the claim-abstract pairs received from the previous subtask, classify their relation as ‘supports’, ‘refutes’, or ‘not enough information’.
Evaluation: weighted precision, recall, and F1 scores based on annotated documents.
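For readers unfamiliar with weighted scores, the sketch below shows one common reading: each class's precision, recall, and F1 is weighted by the number of gold instances of that class. This interpretation, the function name `weighted_prf`, and the toy labels are assumptions made for illustration. The official scorer may compute the metrics differently.

```python
from collections import Counter

LABELS = ("supports", "refutes", "not enough information")


def weighted_prf(gold, pred, labels=LABELS):
    """Support-weighted precision, recall, and F1 over the given labels."""
    support = Counter(gold)
    total = len(gold)
    p_w = r_w = f_w = 0.0
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = support[label] / total  # share of gold items with this label
        p_w += weight * precision
        r_w += weight * recall
        f_w += weight * f1
    return p_w, r_w, f_w


# Toy example with four claim-abstract pairs (labels are made up).
gold = ["supports", "supports", "refutes", "not enough information"]
pred = ["supports", "refutes", "refutes", "not enough information"]

precision, recall, f1 = weighted_prf(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Under this definition, weighted recall equals plain accuracy, since every gold item contributes to exactly one class.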
Important dates
Release of training data: April 1, 2025
Release of testing data: April 15, 2025
Deadline for system submissions: May 16, 2025
Paper submission deadline: May 23, 2025
Notification of acceptance: June 13, 2025
Camera-ready paper due: June 20, 2025
Workshop: July 31, 2025
We encourage and invite participation from junior researchers and students from diverse backgrounds. Participants are also encouraged to submit a paper describing their systems to the SDP 2025 workshop.
Organisers
Raia Abu Ahmad (DFKI, Berlin, Germany)
Aida Usmanova (Leuphana University, Lüneburg, Germany)
Georg Rehm (DFKI, Berlin, Germany)
SciVQA: Scientific Visual Question Answering Shared Task
Hosted as part of the SDP 2025 Workshop at ACL 2025
31 July 2025
Vienna, Austria
SciVQA Shared Task: https://sdproc.org/2025/scivqa.html
Competition on Codabench: https://www.codabench.org/competitions/5904
Shared Task Overview
Scholarly articles convey valuable information not only through unstructured text but also via (semi-)structured figures such as charts and diagrams. Automatically interpreting the semantics of knowledge encoded in these figures can be beneficial for downstream tasks such as question answering (QA). In SciVQA, challenge participants will develop multimodal QA systems using a dataset of 3000 images of scientific figures from ACL Anthology and arXiv papers. Each figure is annotated with seven QA pairs (21000 in total) and includes metadata such as caption, ID, type (e.g., compound, line graph, bar chart, scatter plot), and QA pair type. This shared task specifically focuses on closed-ended visual (i.e., addressing visual attributes such as colour, shape, size, height, etc.) and non-visual (not addressing figure visual attributes) questions.
Evaluation
Systems will be evaluated using BERTScore, ROUGE-L, and ROUGE-1 metrics. Automated evaluations of submitted systems will be performed through Codabench.
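To give a feel for the two ROUGE metrics, here is a simplified sketch in Python. It assumes lowercased whitespace tokenization with no stemming and reports the F-measure. The function names and example answers are hypothetical, and the official evaluation on Codabench may use a different implementation and preprocessing. BERTScore is not shown because it requires a pretrained model.

```python
from collections import Counter


def _f_measure(overlap, n_pred, n_ref):
    """Harmonic mean of precision (overlap/n_pred) and recall (overlap/n_ref)."""
    if overlap == 0:
        return 0.0
    precision = overlap / n_pred
    recall = overlap / n_ref
    return 2 * precision * recall / (precision + recall)


def rouge_1(pred, ref):
    """Unigram-overlap F-measure (order of tokens is ignored)."""
    p_tok, r_tok = pred.lower().split(), ref.lower().split()
    overlap = sum((Counter(p_tok) & Counter(r_tok)).values())
    return _f_measure(overlap, len(p_tok), len(r_tok))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(pred, ref):
    """Longest-common-subsequence F-measure (order of tokens matters)."""
    p_tok, r_tok = pred.lower().split(), ref.lower().split()
    return _f_measure(lcs_length(p_tok, r_tok), len(p_tok), len(r_tok))


# Made-up answers: same words as the reference, but in reversed order.
reference = "the blue line"
prediction = "line blue the"
print(rouge_1(prediction, reference), rouge_l(prediction, reference))
```

The example shows why both metrics are reported: ROUGE-1 rewards matching words regardless of order, while ROUGE-L also penalizes a scrambled word order.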
Important Dates
Release of training data: April 1, 2025
Release of testing data: April 15, 2025
Deadline for system submissions: May 16, 2025
Paper submission deadline: May 23, 2025
Notification of acceptance: June 13, 2025
Camera-ready paper due: June 20, 2025
Workshop: July 31, 2025
Participants are invited to submit papers on their systems. Successful submissions will be published in the proceedings of the SDP 2025 workshop.
Organizers
Ekaterina Borisova (DFKI, Berlin, Germany)
Georg Rehm (DFKI, Berlin, Germany)
The *8th International Conference on Natural Language and Speech Processing
(ICNLSP 2025)* <https://www.icnlsp.org/2025welcome/> welcomes contributions
on both the *theoretical foundations and applied aspects of Natural
Language Processing (NLP) and Speech Processing*. The conference will
feature regular sessions, along with keynote talks delivered by
distinguished international researchers.
We invite authors to submit their work on topics relevant to *ICNLSP 2025*
<https://www.icnlsp.org/2025welcome/> and contribute to advancing research
in the field.
*The conference will be hybrid.*
Topics
ICNLSP 2025 invites contributions on a wide range of topics, including
but not limited to:
-Natural Language Processing (NLP):
Cognition and NLP
Machine translation
Text categorization
Summarization
Sentiment analysis and opinion mining
Computational social web
Under-resourced languages: tools and corpora
Large language models (LLMs)
NLP tools for software requirements and engineering
Text annotation tools
Knowledge fundamentals and knowledge management systems
Information extraction
Data mining and information retrieval
Lexical semantics and knowledge representation
Visualization for NLP
-Speech Processing:
Signal processing and acoustic modeling
Speech recognition (architecture, search methods, lexical modeling,
language modeling, adaptation, multimodal systems, applications in
education and learning, zero-resource speech recognition, etc.)
Speech analysis
Paralinguistics in speech and language (perception of paralinguistic
phenomena, speaker states and traits analysis, etc.)
Spoken dialogue systems and conversational analysis
Speech translation
Speech synthesis
Speaker verification and identification
Language identification
Speech coding, enhancement, and intelligibility
Speech perception and production
Brain studies on speech
Phonetics, phonology, and prosody
Speech and hearing disorders
Paralinguistics of pathological speech and language
Speech technology for disordered speech and hearing
*Important dates*
All deadlines are 11:59 PM UTC-12:00 (“anywhere on Earth”).
*Submission deadline:* May 25, 2025 11:59 PM (GMT).
*Notification of acceptance:* July 20, 2025.
*Camera-ready paper due:* August 3, 2025.
*Conference dates:* August 25-27, 2025.
*Keynote speakers*
- Prof. Dr. Barbara Plank, LMU Munich, Germany.
- Prof. Peter Schneider-Kamp, SDU, Denmark.
*Publication*
1- *All accepted papers will be published in ACL Anthology
<https://aclanthology.org/>.*
*2- **Selected** papers will be published (after extension) in:* Signals
and Communication Technology (Springer), indexed in *Scopus*
<https://www.scopus.com/> and *zbMATH* <https://zbmath.org/>.
*Best paper award*
To recognize outstanding scientific contributions, *ICNLSP 2025* will
present two awards:
- *Best Full Paper Award*
- *Best Short Paper Award*
These awards will be judged by the *Scientific Committee*, based on
recommendations from the *Program Committee*. The selection process will
consider the *originality, significance, and quality* of the research, as
well as the clarity of presentation.
We look forward to honoring exceptional contributions to the field of *Natural
Language and Speech Processing*!
For more information, check the conference website:
*icnlsp.org/2025welcome* <http://icnlsp.org/2025welcome>
Apologies for cross-posting.
We are pleased to announce the *GermEval Shared Task on Candy Speech
Detection („Flausch-Erkennung“)*
This is the third call to participate in the shared task on candy speech
detection („Flausch-Erkennung“).
We invite everyone from academia and industry to participate in the
shared task.
The workshop discussing the results of this shared task is planned to be
held in conjunction with the Conference on Natural Language Processing
(KONVENS) in September 2025.
*Introduction*
Numerous methods have been developed for detecting and censoring
negative speech (e.g., hate speech or offensive or harmful language) on
social media platforms. However, there is much less focus on identifying
and promoting positive supportive discourse in online communities. Our
shared task aims to address this gap and encourage researchers to focus
on such positive expressions.
The task is to identify expressions of candy speech (Flausch) in online
posts (YouTube comments). We define candy speech as the expression of
positive attitudes on social media toward individuals or their output
(videos, comments, etc.). The purpose of candy speech is to encourage,
cheer up, support and empower others. It can be viewed as the
counterpart to hate speech, as it also aims to influence the self-image
of the target person or group, but in a positive way.
*Data*
We will provide the participants with annotated training (and
development) and unlabeled test datasets containing complete written
German-language comment threads under YouTube videos posted by different
content creators. The content creators and communities vary in topic,
style, age group, etc. The training and test datasets do not overlap in
terms of YouTube videos. Furthermore, the test dataset mostly contains
(comments on) videos from content creators that are different from those
in the training dataset. The communities commenting on these videos can
therefore be expected to differ.
*Task Details*
Candy speech detection is the task of identifying the presence of candy
speech (at the span level) in a given YouTube comment and classifying
each expression into one of the predefined categories. This shared task
focuses on German-speaking YouTube communities. Participants will be
provided with a dataset of YouTube comments manually annotated for
different types of candy speech expressions.
We offer the following two subtasks. Participants in this year's shared
task may choose to participate in either subtask:
Subtask 1: Coarse-Grained Classification
The goal of this subtask is to identify whether the given comment
contains candy speech ("Flausch") or not. The dataset is manually
annotated for the presence of candy speech.
Subtask 2: Fine-Grained Classification
The goal of this subtask is to identify the span of each candy speech
expression in a given text and classify it into one of the predefined
categories. The dataset is manually annotated for 10 different types of
candy speech expressions, such as “positive feedback”, “compliment”,
“group membership” etc.
More details on the subtasks (including examples) can be found at the
website of the shared task (see link below).
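To make the span-level setting of Subtask 2 more concrete, here is a minimal sketch in Python of how annotations could be represented and compared. The character-offset representation, the strict exact-match F1, the example comment, and the labels assigned to it are all assumptions made for illustration. The official data format and evaluation metric are described on the shared task website and may differ.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # character offset where the expression begins (inclusive)
    end: int    # character offset where the expression ends (exclusive)
    label: str  # one of the predefined candy speech categories


def strict_span_f1(gold, pred):
    """F1 where a predicted span counts only if offsets and label match exactly."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_set)
    recall = tp / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Made-up comment with hypothetical gold annotations.
comment = "Tolles Video, du bist super!"
gold = [Span(0, 12, "positive feedback"), Span(14, 28, "compliment")]
# Hypothetical system output: the second span misses the final "!".
pred = [Span(0, 12, "positive feedback"), Span(14, 27, "compliment")]

for span in gold:
    print(repr(comment[span.start:span.end]), "->", span.label)
print("strict F1:", strict_span_f1(gold, pred))
```

Strict matching is the harshest option: a boundary error of a single character turns an otherwise correct prediction into both a false positive and a false negative.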
*Important dates*
Trial data available: February 15, 2025
Training data available: March 3, 2025
Test data available: June 10, 2025
Evaluation start: June 16, 2025
Evaluation end: June 27, 2025
Paper submission due: July 11, 2025
Camera ready due: August 15, 2025
GermEval workshop: September 10, 2025 (co-located with KONVENS)
*Website*
https://yuliacl.github.io/GermEval2025-Flausch-Erkennung/
*GermEval*
GermEval is a series of shared task evaluation campaigns that focus on
Natural Language Processing for the German language. GermEval has been
conducted regularly since 2014 in co-location with KONVENS/GSCL
conferences:
https://germeval.github.io/tasks/
*contact email*
Please send any enquiry to the following email address:
germeval-2025-candy-speech(a)ruhr-uni-bochum.de
Best regards,
Yulia Clausen, Ruhr-Universität Bochum, Germany
Tatjana Scheffler, Ruhr-Universität Bochum, Germany
Michael Wiegand, Universität Wien, Austria
Dear Colleagues,
(Apologies if you receive multiple copies of this email from different mailing lists)
We are pleased to announce that the 2nd Workshop on Agent AI for Scenario
Planning (AgentScen) will be held in conjunction with *IJCAI-25* from
August 16–22, 2025, in Montreal, Canada. This workshop will include both
paper presentations and *a shared task: generating business ideas from
patent documents*. The goal of the AgentScen workshop is to explore the
potential of agent AI in scenario planning. *The submission deadline is May
9, 2025. Workshop details can be found at:*
https://sites.google.com/view/agentscen
Scenario planning can be applied to key healthcare and biomedical domains,
such as pandemic preparedness, hospital resource management, advancements
in medical technology, health insurance sustainability, and the future of
telemedicine. In the finance and business sectors, it can assist
executives, financial analysts, and decision-makers in navigating
uncertainties related to market volatility, interest rate fluctuations,
investment risks, corporate financial planning, regulatory changes, and
technological disruptions. By constructing and analyzing alternative
scenarios, organizations can better anticipate challenges and seize
emerging opportunities.
We are also introducing *a shared task focused on generating business ideas
from patent documents.* Participants will be provided with a set of patents
and asked to generate explanations for potential future products based on
the patented technologies. The aim is to foster AI-driven innovation by
leveraging patents as rich sources of technical insight.
We look forward to your participation.
Best regards,
AgentScen Organizers
Chung-Chi Chen - Artificial Intelligence Research Center, AIST, Japan
Tatsuya Ishigaki - Artificial Intelligence Research Center, AIST, Japan
Sophia Ananiadou - University of Manchester, UK
Hiroya Takamura - Artificial Intelligence Research Center, AIST, Japan
Wataru Hirota - Stockmark, Japan (Shared Task)
Tomoko Ohkuma - Asahi Kasei Corporation, Japan (Shared Task)
Tomoki Taniguchi - Asahi Kasei Corporation, Japan (Shared Task)
[Apologies for cross-posting]
== Second Call for Papers and Extended Abstracts ==
1st Workshop on Automatic Assessment of Atypical Speech (AAAS-2025)
We would like to invite you to submit papers to the AAAS workshop, co-located with NoDaLiDa/Baltic-HLT <https://www.nodalida-bhlt2025.eu> at Hestia Hotel Europa in Tallinn, Estonia, on March 5th, 2025.
Workshop website: https://teflon.aalto.fi/aaas-2025/
== Important Dates ==
Submission DL: 16 December 2024 (both papers and abstracts)
Notification of acceptance: 24 January 2025
Camera-ready DL: 3 February 2025
Workshop: 5 March 2025 (full day)
All deadlines are 11:55PM UTC-12:00 ("anywhere on Earth").
== Overview ==
Automatic Assessment of Atypical Speech (AAAS) explores the assessment of pronunciation and speaking skills of children, language learners, and people with speech sound disorders, as well as methods for providing automatic rating and feedback using automatic speech recognition (ASR) and large language models (LLMs). Automatic speaking assessment (ASA) is a rapidly growing field that answers the need for AI tools for self-practising second and foreign language skills. This is not limited to pronunciation assessment: the AI tools can also provide more complex feedback about the fluency, vocabulary, and grammar of the recorded speech. ASA is also very relevant for the detection and quantification of speech disorders and for developing speech exercises that can be performed independent of time and place. Important applications of non-standard speech also include interfaces for children and elderly speakers as an alternative to text input and output. The topic is timely because the latest large speech models now allow us to develop ASR and classification methods for low-resource data, such as atypical speech, where annotated training datasets are rarely available and are expensive and difficult to produce and share. The goal of this workshop is to present the latest results in ASA and to discuss future work and collaboration between researchers in the Nordic and Baltic countries.
== Topics of Interest ==
In particular, we would like to invite students, researchers, and other experts and stakeholders to contribute papers and/or join the discussion on the following (and related) topics:
Automatic speaking assessment (ASA) for L2 (second or foreign language) pronunciation
ASA for spoken L2 proficiency
ASA for speech sound disorders (SSD)
Automatic speech recognition (ASR) for L2 learners
ASR for children and young L2 learners
ASA and ASR for Nordic and other low-resource languages and tasks
Spoken L2 learning and speech therapy using games
Automatic generation of verbal feedback for spoken L2 learners using LLMs
== Submission Details ==
We accept both short and long papers, as well as demo papers. The submissions must describe original and unpublished work.
Paper length:
Short and demo papers up to 4 pages.
Long papers up to 8 pages.
References are not included in the page count, and the camera-ready versions of accepted papers will be allowed an additional page to address reviewer comments.
Papers should describe original unpublished work or work-in-progress and will be peer-reviewed by at least two members of the program committee in a double-blind fashion. All accepted papers will be collected into a proceedings volume to be published in the ACL Anthology. All submissions must follow the NoDaLiDa template, available in both LaTeX and MS Word. The links to the templates can be found here:
https://drive.google.com/file/d/1osWGzuRnYRQGRS70Lx_pdQKrIT-NefKS/view
https://www.overleaf.com/latex/templates/instructions-for-nodalida-baltic-h…
The submission will be through EasyChair:
https://easychair.org/conferences/?conf=aaas2025
We also invite non-anonymous extended abstracts of at most 2 pages, plus any number of pages for references, describing work in progress, negative results, and opinion pieces. The abstracts, which should follow the same formatting templates as the peer-reviewed papers, will be considered for presentation by the workshop organisers, and the accepted ones will be posted on the workshop website. The abstracts can be based on results related to our theme that have already been published elsewhere. The abstracts will not be published in the proceedings, but only in the workshop program.
Please also consider volunteering to review 2-3 papers.
== Invited Speakers ==
We have the pleasure to announce two invited speakers:
1. Nina R. Benway: What is so hard about AI Speech Therapy? Evidence from Efficacy Trials.
Nina R Benway, PhD CCC-SLP, is a Postdoctoral Fellow in Electrical and Computer Engineering with Dr. Carol Espy-Wilson. Nina completed her doctoral training in speech-language pathology (concentration: neuroscience) with Dr. Jonathan Preston at Syracuse University, focusing on clinical trials in children with chronic rhotic speech sound disorders. The three studies of her dissertation resulted in the curation of an open-access 175,000-utterance speech corpus, the engineering of audio classification algorithms predicting speech-language pathologist perception of rhotic speech errors, and the clinical trial validation of an artificial intelligence tool that fully automates a speech sound treatment session. Nina’s doctoral training builds upon her undergraduate training in linguistics (acoustic phonetics) at Cornell University, graduate clinical training at The College of Saint Rose, and six years of clinical practice. Through these experiences Nina has refined a multidisciplinary skill set in speech science, speech signal processing, natural language processing, corpus phonetics, machine learning/artificial intelligence (AI), user interface development, cognitive frameworks of learning, and neurocomputational frameworks of speech production.
2. Ari Huhta: Automatic assessment of second/foreign language speaking: Review of developments for examination and teaching/learning purposes.
Ari Huhta is a Professor of Language Assessment at the Centre for Applied Language Studies, University of Jyväskylä, Finland. His research interests include diagnostic foreign/second language (L2) assessment, computerised assessment, self-assessment, as well as the development of reading, writing and vocabulary knowledge in L2. He was involved in developing the large-scale multilingual DIALANG online assessment and feedback system in the early 2000s and since then he has specialised in assessments that support language learning. Although his research has focused on learning and assessing reading and writing, he has been involved in designing several rating scales for speaking and in evaluating rating quality and studying rater behavior. Recently, he has participated in research projects that are developing ASR and automated assessment of L2 speaking, as well as using LLMs to evaluate Finnish L2 learners’ proficiency level.
== Organizers ==
Mikko Kurimo (chair), Aalto University, mikko.kurimo(a)aalto.fi
Giampiero Salvi, NTNU
Sofia Strömbergsson, Karolinska Institutet
Sari Ylinen, Tampere University
Minna Lehtonen, University of Turku
Tamas Grosz, Aalto University
Ekaterina Voskoboinik, Aalto University
Yaroslav Getman, Aalto University
Nhan Phan, Aalto University
This workshop is supported by “Technology-enhanced foreign and second-language learning of Nordic languages (TEFLON)” https://teflon.aalto.fi/ NordForsk project nr. 103893.
== Contact Information ==
For questions and comments, please email mikko.kurimo(a)aalto.fi
The Images and Contents team [1] of the L3i lab at La Rochelle
University invites researchers who have or will have a PhD degree at
the call deadline in September 2025 and fulfill the eligibility criteria
to prepare a joint application for an EU-funded Marie Skłodowska-Curie
Fellowship [2,3].
Potential topics can include current areas of NLP research such as:
- Multimodal Large Language Models
- Text processing of news archives
- Using NLP for Cultural Heritage, Social Sciences, Humanities and
Literature
- Extraction, summarisation, and analysis of financial text
- Information extraction for knowledge graph population
- Using additional knowledge sources for information extraction from text
- Reasoning with knowledge graphs and language
Potential candidates should have published at top NLP or ML conferences
(EMNLP, ACL, NAACL, EACL, NeurIPS, ICML, ICLR). Female candidates and
candidates from other underrepresented groups in Computer Science are
especially encouraged to apply.
If you want to learn more, please send an email to Georgeta Bordea
(name.surname [at] univ-lr.fr) with your CV attached and the subject
line "MSCA NLP".
[1] https://l3i.univ-larochelle.fr/Images-et-contenus
[2]
https://marie-sklodowska-curie-actions.ec.europa.eu/actions/postdoctoral-fe…
[3]
https://marie-sklodowska-curie-actions.ec.europa.eu/calls/msca-postdoctoral…
The Images and Contents team [1] of the L3i lab [2] at La Rochelle University invites applications for a fully-funded PhD POSITION.
Application deadline: 21/04/2025
Detection, Representation and Visualisation of Event-Event Relations
Understanding how events relate to each other is at the heart of narrative comprehension, whether in literature, news discourse, historical analysis, or AI-driven storytelling. The ability to detect, represent, and visualize explicit and implicit event-event relations is crucial for making sense of complex narratives, identifying causal structures, and uncovering hidden patterns in temporal or logical sequences. This research will explore machine learning approaches, structured knowledge graphs, and multimodal visual representations to model event-event relations effectively. The project will leverage AI to tackle four core challenges: (1) accurate detection of implicit and explicit event-event relations across diverse textual sources, (2) structured representation of these relationships using knowledge graphs and semantic models, (3) linking of event sequences with relevant background knowledge, and (4) intuitive visualization of event interactions to enhance interpretability and facilitate narrative analysis.
What we offer: A yearly scholarship of €40,000 is provided for the entire duration of enrollment, up to a maximum of 36 months. Funded within the framework of a Junior Professorship (CPJ) on Federating Digital Humanities, this PhD offers strong interdisciplinary and collaborative research opportunities with Digital Humanities and Computer Science researchers.
Eligibility: Master’s degree in Computer Science, Computational Linguistics, Information Retrieval, Data Science or a related field. Strong interest or prior experience in NLP, knowledge graphs, semantic technologies, or interactive data visualization.
Mode of study: Full time
Year of entry: 2025
Duration: 3 years
Additional criteria: Applicants must not already (i) hold a doctoral degree; or (ii) be matriculated for a doctoral degree at La Rochelle University, or another institution.
Expression of Interest (EOI): Students are to submit their EOIs to Georgeta Bordea (name.surname(a)univ-lr.fr) including the following documents:
* CV including information about publications.
* Transcripts of most relevant/recent degrees.
* A statement of under 800 words detailing their suitability for the project, addressing the required skills and key responsibilities.
[1] https://l3i.univ-larochelle.fr/Images-et-contenus
[2] https://l3i.univ-larochelle.fr/