*CALL FOR PAPERS*
*Natural Language Processing, Text Mining and Applications (PLN-TEMA’25)
Track of EPIA’25*
PLN-TEMA’25 will be held at the 24th Portuguese Conference on Artificial
Intelligence (EPIA 2025), taking place at Universidade do Algarve, Faro,
Portugal, on October 1-3, 2025. This track is organized under the
auspices of the Portuguese Association for Artificial Intelligence (APPIA).
EPIA 2025 URL: https://epia2025.ualg.pt/
This announcement contains the following: [1] Track description; [2] Topics
of interest; [3] Important dates; [4] Paper submission; [5] Track fees; [6]
Organizing Committee; [7] Contacts.
[1] *Track Description*
The Track of Natural Language, Text Mining and Applications (NLP-TeMA 2025)
is a forum for researchers working in Human Language Technologies, i.e.
Natural Language Processing (NLP), Computational Linguistics (CL), Natural
Language Engineering (NLE), Text Mining (TM), Information Retrieval (IR),
and related areas.
The most natural form of sharing knowledge is indeed through textual
documents. Especially on the Web, a huge amount of textual information is
openly published every day, on many different topics and written in natural
language, thus offering new insights and many opportunities for innovative
applications of Human Language Technologies.
Following advances in AI sub-fields such as NLP, Machine Learning
(ML) and Deep Learning (DL), text mining is now even more valuable as a tool
for bridging the gap between language theories and effective use of natural
language contents, for harnessing the power of semi-structured and
unstructured data, and for enabling important applications in real-world
heterogeneous environments. Both hidden and new knowledge can be discovered
by using NLP and Text Mining methods, at multiple levels and in multiple
dimensions, and often with high commercial value.
Authors are invited to submit papers on any of the topics listed
in section [2]. Papers will undergo double-blind review by at least
three members of the Program Committee. All accepted papers will be
published by Springer in a volume of Springer’s Lecture Notes in Artificial
Intelligence (LNAI) corresponding to the proceedings of the 24th EPIA
Conference on Artificial Intelligence, provided that at least one author is
registered for EPIA 2025 by the early registration deadline.
[2] *Topics of Interest*
Theories, Algorithms and Models
- Language and Cognitive Modeling
- Tagging, Chunking and Parsing
- Morphology and Word Segmentation
- Natural Language Generation
- Discourse and Pragmatics
- Semantics and Text Inference
- Language Resources: Acquisition and Usage. Lexical Knowledge Acquisition
- Entailment and Paraphrases
- Entity Recognition and Word Sense Disambiguation
- Natural Language Understanding
- Language Modeling
- Mathematical Properties of Language
- NLP for Low-Resource Languages
Text Mining and NLP Applications
- Text Clustering, Classification and Summarization
- Sentiment Analysis and Argument Mining
- Computational Social Science
- Multi-Word Units
- Machine Learning for NLP and Text Mining
- Spatio-Temporal and Big Text Mining
- Machine Translation and Cross-Lingual Approaches
- Algorithms and Data Structures for Text Mining
- Information Retrieval and Information Extraction
- Question-Answering and Dialogue Systems
- Text-Based Prediction and Forecasting
- Web Content Annotation
- Health/Biomedical/Legal and other Text Mining Applications
- Offensive Speech Detection and Analysis
[3] *Important dates*
- Paper submission deadline: May 23, 2025 (AoE)
- Notification of paper acceptance: July 4, 2025
- Camera-ready papers: July 14, 2025 (AoE)
- Conference dates: October 1-3, 2025
[4] *Paper submission*
Submissions must be full technical papers on substantial, original, and
previously unpublished research. Papers should be prepared according to the
Springer LNAI format, using either a LaTeX or Word template, with a maximum
of 12 pages, including references. EPIA 2025 will not accept any paper
that, at the time of submission, is under review for, has already been
published in or has already been accepted for publication in a journal or
another venue with formally published proceedings. Authors of EPIA 2025
submissions are not permitted to submit their paper to a journal or another
venue during the EPIA 2025 review period.
It is the responsibility of the authors to remove names and affiliations
from the submitted papers, and to take reasonable care to assure anonymity
during the review process. Authors should also follow the standards as set
out in the Springer Nature code of conduct.
[5] *Track Fees:*
Track participants must register at the main EPIA 2025 conference.
[6] *Organizing Committee:*
Joaquim Silva, DI – FCT/UNL
Pablo Gamallo, CiTIUS, Universidade de Santiago de Compostela
Paulo Quaresma, DI – Universidade de Évora
Irene Rodrigues, DI – Universidade de Évora
Alípio Jorge, Dep. Ciência de Computadores, Fac. Ciências, Universidade do
Porto
[7] *Contacts:*
Joaquim Francisco Ferreira da Silva, DI/FCT/UNL, Quinta da Torre, 2829‐516,
Caparica, Portugal. Tel: +351 21 294 8536 (ext. 10732) ‐ Fax: +351 21 294
8541 ‐ E‐mail: jfs [at]fct [dot] unl [dot] pt
[Apologies for cross-posting]
The 5th iteration of the NALOMA (Natural Logic Meets Machine Learning)
workshop invites submissions on any (theoretical or computational) aspect
of hybrid methods concerning Natural Language Understanding and Reasoning
(NLU&R). The topics include but are not limited to:
- Hybrid NLU&R systems that integrate logic-based/symbolic methods with
neural networks
- Explainable NLU&R (with structured explanations)
- Opening the black-box of deep learning in NLU&R
- Downstream applications of hybrid NLU&R systems
- Probabilistic semantics for NLU&R
- Comparison and contrast between symbolic and deep learning work on
NLU&R
- Creation, criticism, refinement, and augmentation of NLU&R datasets
- (Dis)Alignment of humans and machines on NLU&R tasks
- Addressing inherent human disagreements in NLU&R tasks
- Generalization of NLU&R systems
- Fine-grained evaluation of NLU&R systems
NALOMA accepts archival papers (to appear in the ACL Anthology proceedings)
and (non-archival) extended abstracts.
The workshop is co-located with ESSLLI (https://2025.esslli.eu),
28 July-8 August 2025, Bochum (Germany).
The submission deadline is 25 April 2025.
Visit https://naloma.github.io for more details.
-
The NALOMA chairs,
Lasha Abzianidze and Valeria de Paiva
--
Lasha Abzianidze
Assistant professor at Utrecht University
Institute for Language Sciences
An exciting job opportunity at the University of Manchester for an outstanding researcher in NLP/LLMs.
The fellowship is for 5 years and associated with the British Heart Foundation Centre of Research Excellence https://www.bhf-cre.manchester.ac.uk/ and the Department of Computer Science, Faculty of Science and Engineering.
Areas of interest for this post:
Development of a Foundation Clinical LLM for the Cardiovascular Domain. Trustworthy, explainable multimodal LLMs for Medicine.
Further details about the fellowship and how to apply here: https://www.jobs.manchester.ac.uk/Job/JobDetail?JobId=31702
Deadline of applications: 7th April 2025.
----------
Professor Sophia Ananiadou
Department of Computer Science
Director, National Centre for Text Mining
Deputy Director, Institute for Data Science and Artificial Intelligence
ELLIS Fellow
sophia.ananiadou(a)manchester.ac.uk
The University of Manchester
----------------------------
HealTAC 2025
June 16-18th, 2025, Glasgow (UK)
https://healtac2025.github.io/
----------------------------
1) Call for contributions – deadline extended to 4 April
2) Keynotes, panels and workshop
3) Registration fees
4) Key dates
----------------------------
----------------------------------------
Call for contributions - reminder
----------------------------------------
The 8th Healthcare Text Analytics Conference (HealTAC 2025) invites contributions that address any aspect of healthcare text analytics. We invite submissions in the form of extended abstracts that describe either methodological or application work that has not been previously presented in a conference. Submissions (up to 2 pages) should be prepared based on a template that is available at the conference web site.
We also invite PhD and fellowship project submissions that describe ongoing PhD research (any stage) or a planned fellowship application. The conference will provide an opportunity to receive constructive feedback from a panel of experts.
Deadline for all submissions is now April 4th, 2025.
As in previous years, there will be a post-conference call to submit a journal-length paper for further peer review and publication in Frontiers in Digital Health.
----------------------------
Programme
----------------------------
We are delighted to announce keynotes by Dr Jason Fries from Stanford University and Dr Alison O'Neil from Canon Medical Research, and panels on "Opportunities and challenges in LLMs for health research: social inequalities, bias detection, and mitigation" and "Challenges in AI deployment within NHS" (industry forum).
A pre-conference workshop on June 16th will focus on "NLP in mental healthcare and research" (https://healtac2025.github.io/workshop/).
----------------------------
Registration fees
----------------------------
Thanks to generous support from Health Data Research UK, CogStack, Frontiers, University of Glasgow, Research Data Scotland and Healtex, we will keep the registration fee low as before: the early registration fee is £100 for students and £200 for others, and it includes the full 3-day programme, lunches and the conference dinner.
----------------------------
Key dates
----------------------------
Deadline for all contributions: April 4th 2025
Notification of acceptance: April 18th 2025
Early-bird registration: by May 16th 2025
Pre-conference workshop: June 16th 2025
Conference: June 17-18th 2025
Follow the conference announcements on social media at #HEALTAC2025
We are looking forward to welcoming you to HealTAC 2025.
The 23rd International Workshop on Treebanks and Linguistic Theories (TLT 2025) will bring together developers and users of linguistically annotated natural language corpora. The workshop is part of SyntaxFest 2025 and will be hosted by the University of Ljubljana in Slovenia on August 26-29, 2025.
Link to TLT 2025: https://www.korpuslab.uni-hamburg.de/en/tlt2025.html
Link to SyntaxFest 2025: https://syntaxfest.github.io/
-----------------------------
INVITED TALK
-----------------------------
Amir Zeldes (Georgetown University)
-----------------------------
SUBMISSION INFORMATION
-----------------------------
TLT addresses all aspects of treebank design, development, and use. As ‘treebanks’ we consider any pairing of natural language data (spoken, signed, or written) with annotations of linguistic structure at various levels of analysis, including, e.g., morpho-phonology, syntax, semantics, and discourse. Annotations can take any form (including trees or general graphs), but they should be encoded in a way that enables computational processing. Reflections on the design of linguistic annotations, methodology studies, resource announcements or updates, annotation or conversion tool development, or reports on treebank usage including probing the leakage of treebanks into large language models are but some examples of the types of papers we anticipate for TLT.
SyntaxFest joint submission link: https://openreview.net/group?id=SyntaxFest/2025
-----------------------------
IMPORTANT DATES
-----------------------------
* April 15, 2025: Paper submission deadline
* June 2, 2025: Notification of acceptance
* June 16, 2025: Camera-ready papers due
* August 26-29, 2025: SyntaxFest conference (about two workshop days for TLT; attendees are encouraged but not obliged to participate in the whole SyntaxFest.)
All deadlines are 11.59 pm UTC -12h (“anywhere on Earth”).
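For readers who want to check a deadline against their own clock, the following is a minimal sketch of the conversion, assuming only that "anywhere on Earth" means the fixed offset UTC-12 as stated above. The helper name `aoe_deadline_utc` is hypothetical and is not part of any conference tooling.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" (AoE) treated as the fixed offset UTC-12 (assumption
# taken from the sentence above; no daylight-saving rules apply).
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_utc(year: int, month: int, day: int) -> datetime:
    """Return 11:59 pm AoE on the given calendar day, expressed in UTC."""
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)


# Paper submission deadline listed above: April 15, 2025.
deadline = aoe_deadline_utc(2025, 4, 15)
print(deadline.isoformat())  # 2025-04-16T11:59:00+00:00
```

In other words, a deadline of April 15 AoE closes at 11:59 UTC on April 16.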
-----------------------------
TLT2025 WORKSHOP CHAIRS
-----------------------------
* Sarah Jablotschkin, University of Hamburg
* Sandra Kübler, Indiana University
* Heike Zinsmeister, University of Hamburg
Contact: tlt2025.gw(a)uni-hamburg.de
Website: https://www.korpuslab.uni-hamburg.de/en/tlt2025.html
=============================================================
DiSS 2025 - 12th Workshop on Disfluency in Spontaneous Speech
https://diss2025.inesc-id.pt/
=============================================================
We are pleased to announce the 12th edition of the DiSS workshop – Disfluency in Spontaneous Speech, which will take place in Lisbon, Portugal, on September 4-5, 2025. This year’s theme is “Disfluencies in the Age of AI: A Multidisciplinary View”. The workshop is organized as a satellite event of INTERSPEECH 2025 and is proudly sponsored by ISCA.
We invite submissions from all fields addressing disfluency, paralinguistics, and related phenomena, including (but not limited to): psychology, neuropsychology and neurocognition, psycholinguistics, linguistics, speech production and perception, conversational AI, gesture analysis, computational linguistics, speech technology, dialogue systems, human-centered AI, brain-computer interfaces, healthcare, and generative AI.
IMPORTANT DATES
- Paper submission deadline: April 19, 2025
- Notification of acceptance: May 24, 2025
- Camera-ready submission deadline: June 16, 2025
- Author registration deadline: June 23, 2025
- DiSS Workshop: September 4–5, 2025
SUBMISSION GUIDELINES
Please prepare your manuscript using the official Interspeech 2025 template <https://www.interspeech2025.org/author-resources> (LaTeX or Word) and submit a single PDF file. Submissions will be managed through the Microsoft CMT system <https://cmt3.research.microsoft.com/diss2025>. Authors must create a free account to submit their papers.
COMMITTEES
Organizers
- Helena Moniz, University of Lisbon, Portugal
- Elizabeth Shriberg, Ellipsis Health, USA
- Julia Hirschberg, Columbia University, USA
- Robert Eklund, Linköping University, Sweden
- Fernando Batista, ISCTE and INESC-ID Lisboa, Portugal
Publicity Chair
- Isabel Trancoso, University of Lisbon and INESC-ID Lisboa, Portugal
Local Organisation
- Ana Isabel Mata, University of Lisbon, Portugal
- Anna Havras, University of Lisbon and VoiceInteraction, Portugal
- Anna Maria Pompili, INESC-ID Lisboa, Portugal
- Miguel Menezes, University of Lisbon and Unbabel, Portugal
- Rubén Solera Ureña, INESC-ID Lisboa, Portugal
- Sérgio Paulo, INESC-ID Lisboa, Portugal
Scientific Committee
- Alexandra Markó, SSNS Institute for Expert Services, Hungary
- Ana Isabel Mata, University of Lisbon, Portugal
- Anna Havras, VoiceInteraction and University of Lisbon
- Anna Maria Pompili, INESC-ID Lisbon, Portugal
- Antonio Bonafonte, SANAS AI, Barcelona, Spain
- Catarina Botelho, INESC-ID Lisbon
- Chiara Mazzocconi, Aix Marseille Université, France
- Clara Niza, University of Lisbon and INESC-ID Lisbon, Portugal
- Daniela Braga, Defined.ai, USA
- David Escudero, Universidad de Valladolid, Spain
- David Matos, University of Lisbon and INESC-ID Lisbon, Portugal
- Elizabeth Shriberg, Ellipsis Health, USA
- Eugénio Ribeiro, ISCTE and INESC-ID Lisboa
- Francesco Cutugno, Universita’ Degli Studi di Napoli, Italy
- Francisco Teixeira, INESC-ID Lisbon
- Gueorgui Nenov Hristovky, University of Lisbon, Portugal
- George Georgiou, University of Nicosia, Cyprus
- Hermann Ney, RWTH Aachen University, Germany
- Ivana Didirková, Université Paul Valéry – Montpellier 3, France
- Jens Allwood, University of Gothenburg, Sweden
- Jessica di Napoli, Aachen University, Germany
- Joakim Gustafson, KTH, Sweden
- João Graça, Unbabel and Widn.AI, USA
- Judit Bóna, Eötvös Loránd University, Hungary
- Julia Hirschberg, Columbia University, USA
- Jürgen Trouvain, Saarland University, Saarbrücken, Germany
- Keikichi Hirose, University of Tokyo, Japan
- Khiet Truong, University of Twente, The Netherlands
- Kikuo Maekawa, National Institute for Japanese Language and Linguistics, Japan
- Loulou Kosmala, Université Paris-Est Créteil, France
- Mária Gosy, Eötvös Loránd University, Hungary
- Mariana Julião, INESC-ID Lisbon
- Malte Belz, Humboldt-Universität, Germany
- Martin Corley, University of Edinburgh, Scotland
- Miguel Menezes, University of Lisbon and INESC-ID Lisbon, Portugal
- Paulina Peltone, University of Turku, Finland
- Petra Wagner, University of Bielefeld, Germany
- Plínio Barbosa, University of Campinas (UNICAMP), Brazil
- Ralph Rose, Waseda University, Japan
- Robert Hartsuiker, Ghent University, Belgium
- Rubén Solera Ureña, INESC-ID Lisboa, Portugal
- Sérgio Paulo, INESC-ID Lisboa, Portugal
- Simon Betz, University of Bielefeld, Germany
- Vera Cabarrão, Unbabel, Portugal
- Vered Silber Varod, Tel Aviv University, Israel
Please visit our webpage for up-to-date information: https://diss2025.inesc-id.pt/
Any questions should be directed to: diss2025(a)googlegroups.com
We look forward to welcoming you in Lisbon for an engaging and collaborative event!
— The DiSS 2025 Organizing Committee
Dear colleagues,
(Apologies if you receive multiple copies of this email from different mailing lists)
We are delighted to share the 2nd call for task proposals for NTCIR-19.
NTCIR (NII Testbeds and Community for Information Access Research) is a
series of evaluation conferences that mainly focus on information access
with East Asian languages and English. The first NTCIR conference (NTCIR-1)
took place in August/September 1999, and the latest NTCIR-18 conference
will be held on June 10-13, 2025. Research teams from all over the world
participate in one or more NTCIR tasks to advance the state of the art and
to learn from one another's experiences.
It is time to call for task proposals for the next NTCIR (NTCIR-19), which
will start in September 2025 and conclude in December 2026. Task proposals
will be reviewed by the NTCIR Program Committee, and organizers of accepted
tasks will have a chance to present their proposed tasks at the NTCIR-18
Conference held in NII, Tokyo, Japan, from June 10-13, 2025.
* IMPORTANT DATES:
*March 31, 2025: Task Proposal Submission Due (Anywhere on Earth)*
May 15, 2025: Acceptance Notification of Task Proposals
June 10-13, 2025: NTCIR-18 Conference (Organizers of accepted tasks have a
chance to present their proposed tasks)
* SUBMISSION LINK:
https://easychair.org/conferences/?conf=ntcir19proposal
* NTCIR-19 TENTATIVE SCHEDULE:
January 2026: Dataset release*
January-June 2026: Dry run*
March-July 2026: Formal run*
August 1, 2026: Evaluation results return
August 1, 2026: Task overview release (draft)
September 1, 2026: Submission due of participant papers (draft)
November 1, 2026: Camera-ready participant paper due
December 2026: NTCIR-19 Conference at NII, Tokyo, Japan
(* indicates that the schedule can be different for different tasks)
* WHO SHOULD SUBMIT NTCIR-19 TASK PROPOSALS?
We invite new task proposals within the expansive field of information
access. Organizing an evaluation task entails pinpointing significant
research challenges, strategically addressing them through collaboration
with fellow researchers (including co-organizers and participants),
developing the requisite evaluation framework to propel advancements in the
state of the art, and generating a meaningful impact on both the research
community and future developments.
Prospective applicants are urged to underscore the real-world applicability
of their proposed tasks by utilizing authentic data, focusing on practical
tasks, and solving tangible problems. Additionally, they should confront
challenges in evaluating information access technology, such as the
extensive number of assessments needed for evaluation, ensuring privacy
while using proprietary data, and conducting live tests with actual users.
In the era of large language models (LLMs), these models are anticipated to
significantly influence daily human activities. Nonetheless, the content
produced by LLMs often exhibits issues, such as hallucinations. NTCIR-19
encourages tasks that focus on evaluating the quality of content
generated by LLMs, continuing from NTCIR-18, as well as on information access
that exploits LLMs, including generative information retrieval (IR), IR using
generative queries, conversational search using generated utterances,
evaluation using LLMs (relevance judgements or language annotation by
LLMs), and retrieval-augmented generation (RAG).
* PROPOSAL TYPES:
We will accept two types of task proposals:
- Proposal of a Core task:
This is for fostering research on a particular information access problem
by providing researchers with a common ground for evaluation. New test
collections and evaluation methods may be developed through the
collaboration between task organizers (proposers) and task participants. At
NTCIR-18, the core tasks are AEOLLM, FairWeb-2, FinArg-2, Lifelog-6,
MedNLP-CHAT, RadNLP, and Transfer-2. Details can be found at
http://research.nii.ac.jp/ntcir/NTCIR-18/tasks.html.
- Proposal of a Pilot task:
This is recommended for organizers who propose to focus on a novel
information access problem, and there are uncertainties either in task
design or organization. It may focus on a sub-problem of an information
access problem and attract a smaller group of participating teams than core
tasks. However, it may grow into a core challenging task in the next round
of NTCIR. At NTCIR-18, the pilot tasks are HIDDEN-RAD, SUSHI, and U4.
Details can be found at http://research.nii.ac.jp/ntcir/NTCIR-18/tasks.html.
Organizers are expected to run their tasks mainly with their own funding
and to make the task as self-sustaining as possible. Part of the funding can
be provided by NTCIR as "seed funding." It is usually used
for limited purposes such as hiring relevance assessors. The seed
funding allocated to each task varies depending on requirements and the
number of accepted tasks. Typical cases would be around 1M JPY for a core
task and around 0.5M JPY for a pilot task (note that the amount is subject
to change).
Please submit your task proposal as a PDF file via EasyChair by March 31,
2025 (Anywhere on Earth).
https://easychair.org/conferences/?conf=ntcir19proposal
* TASK PROPOSAL FORMAT:
The proposal should not exceed four pages in A4 single-column format. The
first three pages should contain the main part and appendix, and the last
page should contain only a description of the data to be used in the task.
Please describe the data in as much detail as possible so that we can help
your data release process after the proposal is accepted. In past
NTCIR rounds, creating memorandums for data release took considerable time,
which sometimes slowed down the task organization.
Main part
- Task name and short name
- Task type (core or pilot)
- Abstract
- Motivation
- Methodology
- Expected results
Appendix
- Names and contact information of the organizers
- Prospective participants
- Data to be used and/or constructed
- Budget planning
- Schedule
- Other notes
Data (to be used in your task)
- Details
(Please describe the details of the data, which should include the source
of the data, methods to collect the data, range of the data, etc.)
- License
(Please make sure that you have a license to distribute the data, and
details of the license should be provided. If you do not have permission to
release the data yet, please describe your plan to get the permission.)
- Distribution
(Please describe how you plan to distribute the data to participants. There
are mainly three choices: distributed by the data provider, distributed by
organizers, and distributed by NII.)
- Legal / Ethical issues
(If the data can cause legal or ethical problems, please describe how you
propose to address them. For example, some medical data may need approval
from an ethics committee, and some Web data may need filtering to exclude
discriminatory messages.)
If you want NII to distribute your data to task participants on your
behalf, please email ntc-admin(a)nii.ac.jp before your task proposal
submission attaching the task proposal.
* REVIEW CRITERIA:
- Importance of the task to the information access community and
society
- Timeliness of the task
- Organizers’ commitment to ensuring a successful task
- Financial sustainability (self-sustainable tasks are encouraged)
- Soundness of the evaluation methodology
- Detailed description of the data to be used
- Language scope
* NTCIR-19 PROGRAM CO-CHAIRS:
Qingyao Ai (Tsinghua University, China)
Chung-Chi Chen (National Institute of Advanced Industrial Science and
Technology (AIST), Japan)
Shoko Wakamiya (Nara Institute of Science and Technology (NAIST), Japan)
* NTCIR-19 GENERAL CHAIRS:
Charles Clarke (University of Waterloo, Canada)
Noriko Kando (National Institute of Informatics, Japan)
Makoto P. Kato (University of Tsukuba, Japan)
Yiqun Liu (Tsinghua University, China)
Dear colleagues,
applications are invited for a Research Assistant position on research infrastructure for computational literary studies in the Institute for Classical Philology at Humboldt University Berlin (Germany).
Contract Type: Fixed-term until 31 October 2026
Applications: https://haushalt-und-personal.hu-berlin.de/de/personal/stellenausschreibung…
Salary: €57,708 to €82,023 per annum, depending on qualification and experience
We are seeking a Research Assistant to work in Computational Literary Studies, as part of the Daidalos project (funded by the German Research Foundation), which is hosted at Humboldt University Berlin. The Daidalos project (https://daidalos-projekt.de) is building a research infrastructure that enables low-threshold access to computational literary studies methods in classical philology. The aim is to enable anyone who wants to conduct digitally supported research on Latin or Ancient Greek texts to carry out this research on their own corpus using the software. You will be expected to coordinate the project, organize workshops throughout the country, work with the community and help with publications. For these tasks, you will familiarize yourself with Digital Humanities in general, and natural language processing in particular.
This is an opportunity to work on an aspiring national research infrastructure for Classics in Germany. The project group is well-known and held in great esteem by the relevant community. The advertised position offers excellent opportunities for publications and conference trips. Candidates must have a master's degree, solid knowledge of Latin or Ancient Greek, and excellent German language proficiency. Experience with digital humanities, user experience design or scientific editorship is a plus.
For questions, please contact Dr. Andrea Beyer (daidalos-projekt(a)hu-berlin.de).
Kind regards,
Konstantin Schulz
BioCreative IX Challenge and Workshop CFP
Large Language Models for Clinical and Biomedical NLP at IJCAI
Where, When:
The BioCreative IX workshop <https://www.ncbi.nlm.nih.gov/research/bionlp/biocreative9> will run with IJCAI 2025 <https://2025.ijcai.org/>, August 16-22, 2025, in Montreal, Canada.
BioCreative IX:
The 9th BioCreative workshop seeks to attract researchers interested in developing and evaluating automatic methods of extracting medically relevant information from clinical data and aims to bring together the medical NLP community and the healthcare researchers and practitioners. The challenge tracks explore MedHopQA, a dataset for benchmarking LLM-based reasoning systems with disease-centered question answers, ToxHabits, a task exploring the information extraction related to substance use and abuse in Spanish clinical content, and Sentence segmentation of real clinical notes using MIMIC-III clinical notes. We will also feature paper submissions on relevant topics and poster/tool demonstrations.
Important Dates
March - April: Team Registration
May 12, 2025: Testing predictions, Evaluation results
May 19, 2025: Submission of participants papers deadline
Jun 06, 2025: Notification of accepted papers deadline
Aug 16-22, 2025: IJCAI 2025
Workshop Proceedings and Special Issue:
The BioCreative IX Proceedings will host all the submissions from participating teams, and they will be freely available by the time of the workshop.
In addition, select papers will be invited for a journal BioCreative IX special issue for work that passes their peer-review process. More details and information to submit will be posted in June.
Participation:
Teams can participate in one or more of these tracks. Team registration will continue until April 30th, when final commitment is requested.
To register a team, go to the Registration Form <https://forms.gle/xbQp158cn5pgJ1oj9>. If you have restrictions accessing Google Forms, please send an e-mail to BiocreativeChallenge(a)gmail.com.
Call for Papers
We welcome submissions on work that describes research on similar topics to the three challenges, as well as:
* Development of benchmarking datasets for clinical NLP
* Creating and evaluating synthetic data using LLMs and its impact for downstream tasks
* Creative use of data augmentation for increasing tool accuracy and trustworthiness
* Use of LLMs to streamline annotation tasks
* NLP-systems capable of identifying entities in multilingual corpora
* NLP-systems capable of semantic interoperability across different terminologies/ ontologies for efficient data curation
* Integrating ontologies and knowledge bases for factual LLM production
* Annotated corpora and other resources for health care and biomedical data modelling
All submissions will be considered for poster presentations and tool demonstrations at the workshop.
BioCreative IX Tracks:
Track 1: MedHopQA
Large language models (LLMs) are commonly evaluated on their capabilities to answer questions in various domains, and it has become clear that robust QA datasets are critical to ensure proper evaluation of LLMs prior to their deployment in real-world biomedical or healthcare related applications. This track aims to advance the development of LLM-based systems that are capable of answering questions that involve multi-step reasoning. We have created a resource consisting of 1,000 question-answer pairs - focusing on diseases, genes and chemicals, mostly pertaining to rare diseases - based on public information in Wikipedia. The participants are encouraged to use any training data they wish to design and develop their NLP system agents that understand asserted information on genes, diseases, chemicals etc. and are able to answer multi-step reasoning questions involving such information. This track builds on the previous success in biomedical QA benchmarking (e.g., PubMedQA and BioASQ, MedQA) but differs from them in the fact that for MedHopQA it is necessary to employ a multi-step reasoning process to find the correct answer.
Track 3: ToxHabits
There is a pressing need to extract information related to substance use and abuse more systematically, including not only smoking and alcohol abuse but also other harmful drugs and substances from clinical content. These toxic habits have a considerable health impact on a variety of medical conditions and also affect the action of prescribed medications. To make such information actionable, it is critical to not only detect instances of consumption, but also to characterize certain aspects related to it, such as duration or mode of administration. Some initial efforts have been made to automatically detect social determinants of health, including smoking status, for content in English, but very limited efforts have been made for content in other languages. Therefore, we propose the ToxHabits track to address the automatic extraction of substance use and abuse information from clinical cases in Spanish. This task will consist of three subtasks: (a) toxic habit mention recognition, (b) detection of relevant clinical modifiers related to substance abuse, as well as (c) toxic habit condition QA challenge.
Track 2: Sentence segmentation of real-life clinical notes
Sentence segmentation is a fundamental linguistic task and is widely used as a pre-processing step in many NLP tasks. Although the development of LLMs and the sparse attention mechanism in transformer networks have reduced the necessity of sentence-level inputs in some NLP tasks, many models are designed and tested only for shorter sequences. The need for sentence segmentation is particularly pronounced in clinical notes, as most clinical NLP tasks depend on this information for annotation and model training. In this shared task, we challenge participants to detect sentence boundaries (spans) for MIMIC-III clinical notes, where fragmented and incomplete sentences, complex graphemic devices (e.g. abbreviations and acronyms), and markups are common. To encourage generalizability to multi-domain texts, participants will receive annotated texts from newswire articles and biomedical literature, in addition to clinical notes, for model development and evaluation.
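To make "sentence boundaries (spans)" concrete, here is a minimal rule-based sketch that returns character-offset spans. It is only an illustration and not the track's baseline or official output format: the abbreviation list, the boundary pattern, and the function name `sentence_spans` are all assumptions made for this example.

```python
import re

# Hypothetical abbreviation list for illustration; not taken from the task data.
ABBREVIATIONS = {"dr.", "pt.", "e.g.", "i.e.", "vs."}

# Candidate boundary: sentence-final punctuation followed by whitespace or end of text.
BOUNDARY = re.compile(r"[.!?]+(?=\s|$)")


def sentence_spans(text):
    """Return a list of (start, end) character offsets, one per sentence."""
    spans = []
    start = 0
    for match in BOUNDARY.finditer(text):
        end = match.end()
        words = text[start:end].split()
        if words and words[-1].lower() in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation, not a boundary
        # Skip leading whitespace so the span starts on the first character.
        while start < end and text[start].isspace():
            start += 1
        if start < end:
            spans.append((start, end))
        start = end
    # Keep a trailing fragment that has no final punctuation.
    while start < len(text) and text[start].isspace():
        start += 1
    end = len(text.rstrip())
    if start < end:
        spans.append((start, end))
    return spans


note = "Pt. denies chest pain. Seen by Dr. Smith today.\nNo acute distress"
for begin, stop in sentence_spans(note):
    print((begin, stop), repr(note[begin:stop]))
```

A rule set this small will fail on many of the phenomena named above (markups, unlisted acronyms, fragments split by line breaks), which is the gap the shared task asks participants to close.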
Organizing Committee
* Dr. Rezarta Islamaj, National Library of Medicine
* Dr. Graciela Gonzalez-Hernandez, Cedars-Sinai Medical Center
* Dr. Martin Krallinger, Barcelona Supercomputing Center
* Dr. Zhiyong Lu, National Library of Medicine
----------------------------------------------------------
Rezarta Islamaj
National Library of Medicine
Rezarta.Islamaj@nih.gov
LLMSEC 2025
URL: https://sig.llmsecurity.net/workshop/
Direct submission deadline: April 15, 2025
LLMSEC is an academic event publishing & presenting work on
adversarially-induced failure modes of large language models, the
conditions that lead to them, and their mitigations.
Date: Aug 1, 2025
Location: Vienna, Austria
Co-located with ACL 2025 as a workshop
Scope
Large Language Models accept a variety of inputs and produce a variety of
outputs. It is possible to find inputs that lead to LLM outputs that model
creators, owners, or users do not want. Defining and enumerating this space
is an open task. We describe LLM security as the field of investigating how
models that process text can, by an adversary, be made to behave in
unintended and harmful ways. The field covers both weaknesses and
vulnerabilities.
Research at LLMSEC covers the entire life cycle of LLMs, from training
data through fine-tuning and alignment to inference time. It also
covers the deployment context of LLMs, including risk assessment, release
decisions, and use of LLMs in agent-based systems.
Event scope is LLM attacks, LLM defence, and the contextualisation of LLM
security. LLM attacks are anything that causes LLMs to behave in an
unexpected/unintended manner usable by an adversary. In the LLM life cycle,
this includes techniques like data poisoning and other model supply chain
attacks, as well as the adversarial inputs that yield insecure outputs.
Topics include:
Adversarial attacks on LLMs
Automated and adaptive LLM attacks
Data poisoning
Data extraction from trained models
Defining LLM vulnerabilities
Detection of adversarial LLM inputs
Ethical aspects of LLM security
Legal impacts and debates related to model security
LLM Denial-of-service
LLM security measurement
LLM supply chain attacks
Model input/output guardrails
Model inversion
Model policy
Multi-modal and cross-modal models (e.g. vision&text-to-text,
text-to-speech, speech-to-text)
Organising model exploits
Organising model failure modes
Practical tools for exploiting LLMs
Privacy breaches mediated by LLMs
Privilege escalation and lateral movement mediated by LLMs
Prompt injection
Proofs-of-concept of LLM exploits
Red teaming of LLMs
Retrieval Augmented Generation security
Secure LLM use and deployment
Keynotes
1. Johannes Bjerva, Aalborg University (Denmark). Prof. Bjerva’s research
is characterised by an interdisciplinary perspective on NLP, with a focus
on the potential for impact in society. His main contributions to the field
are to incorporate linguistic information into NLP, including large
language models (LLMs), and to improve the state of resource-poor
languages. Recent research focuses on embedding inversion and attacks on
multi-modal models.
2. Erick Galinkin, NVIDIA Corporation (USA). Erick Galinkin is a Research
Scientist at NVIDIA working on the security assessment and protection of
large language models. Previously, he led the AI research team at Rapid7
and has extensive experience working in the cybersecurity space. He is an
alumnus of Johns Hopkins University and holds degrees in applied
mathematics and computer science. Outside of his work, Erick is a lifelong
student, currently at Drexel University and is renowned for his ability to
be around equestrians.
3. TBA
Submission formats
Submissions must be anonymised & de-identified following ACL policy, and in
the ACL template.
Long & Short papers
We invite both short and long papers; short papers with a 4 page limit,
long papers with an 8 page limit, with references, ethics statements, &
other compulsory sections not counted towards this limit.
Qualitative work
As a relatively new field, still engaged in sense-making of the context of
this research, we particularly welcome rigorous qualitative work, and work
that provides novel information about LLMSEC practice and context.
War stories
Following cybersecurity tradition, LLMSEC also welcomes “war stories”, that
is, accounts of security investigations or operations that are informative
to broader audiences. These are intended to connect researchers and
practitioners; LLM security is highly interdisciplinary and we have a lot
to share with each other.
War story submissions need not provide novel quantitative empirical
results, but should be illuminating and helpful to the workshop audience.
They may be up to four pages, with references, appendices, and compulsory
sections excluded from the limit.
Submission link
Submit via softconf: https://softconf.com/acl2025/llmsec2025/
Important Dates
Direct submission deadline: April 15, 2025
Notification of acceptance: May 17, 2025
Camera-ready paper deadline: June 16, 2025
Pre-recorded video due: July 5, 2025
Workshop dates: July 31st / August 1st 2025
TZ: Anywhere on earth
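"Anywhere on earth" (AoE) corresponds to UTC-12, so a deadline has not passed until the date has ended in that zone. The sketch below converts an AoE date to UTC. It assumes the deadline falls at the end of the day (23:59:59), which this call does not state explicitly; the helper name `aoe_deadline_in_utc` is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Anywhere on Earth is a fixed offset of UTC-12.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_in_utc(year, month, day):
    """Return the last second of the given AoE date, expressed in UTC.

    Assumption: the deadline is end of day (23:59:59) AoE.
    """
    end_of_day = datetime(year, month, day, 23, 59, 59, tzinfo=AOE)
    return end_of_day.astimezone(timezone.utc)


# Direct submission deadline listed above: April 15, 2025.
print(aoe_deadline_in_utc(2025, 4, 15).isoformat())
```

Under that assumption, the April 15 deadline corresponds to 11:59:59 UTC on April 16, 2025.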
Organisation
Leon Derczynski. Principal Scientist in LLM Security at NVIDIA Corporation,
Associate Professor in NLP at ITU University of Copenhagen, President of
ACL SIGSEC. https://www.linkedin.com/in/leon-derczynski/
Jekaterina Novikova. Science Lead at the AI Risk and Vulnerability Alliance
(ARVA), Expert Advisor of ACL SIGSEC. https://jeknov.github.io/
Muhao Chen. Assistant Professor of Computer Science at University of
California, Davis, Secretary of ACL SIGSEC. Prof Chen has considerable
organisational and service experience, including SAC and AC at NAACL, ACL,
EMNLP, and AAAI, and co-chairing workshops at NAACL 2022 and AKBC 2022.
https://muhaochen.github.io/