[CfP] 2nd CfP - LLMs Beyond the Cutoff workshop @ CIKM 2024 - Corpora

18 Jul 2024


      *Apologies for crossposting*
LLMs Beyond the Cutoff: 1st International Workshop on Computational Methods
Beyond the Temporal Borders of Training Data
https://llmsbeyondthecutoff2024.wordpress.com
Collocated with CIKM 2024
October 25, 2024 — Boise (Idaho), USA
* July 29, 2024: Paper submission deadline
* August 30, 2024: Paper acceptance notification
* September 15, 2024: Camera-ready version submission
* October 25, 2024: Workshop date
=== NEWS ===
* Accepted papers of LLMs Beyond the Cutoff will be published as a volume of
Springer Nature’s LNCS post-proceedings
* Submission via EasyChair:
https://easychair.org/conferences/?conf=llmsbeyondthecut0ff
* Springer guidelines for authors:
https://www.springer.com/gp/computer-science/lncs/conference-proceedings-gui...
SUMMARY
LLMs are trained on large amounts of web data that extend up to a specific
moment in time. For instance, ChatGPT’s LLM “knows” the world before May
2023 and has no real-time access to information beyond this limit, other
than a browsing tool, similar to a search engine, that enables simple
lookup. However, in many scenarios, the ability to analyze and reason about
novel emerging events and topics is crucial for meeting the challenges of
rapidly evolving information landscapes.
The workshop provides an interdisciplinary forum for discussing the
temporal limitations of LLMs and proposing technical solutions for
applying and developing LLMs beyond their cutoff dates. We explore two
prominent scenarios in which contexts tend to evolve faster than the LLMs
used to analyze them: (1) journalism and (2) industry. For (1), the goal
is to propose methods for detecting, classifying and reasoning about
emerging topics that permeate public discourse on social or mainstream
media. An example of such a topic is COVID-19 at the onset of the pandemic.
Downstream tasks of interest are fake news detection and fact-checking on
novel topics, including claim analysis, opinion mining and narrative
extraction. For (2), the goal is to shed light on the limits of
LLMs for companies in sectors such as international geopolitical monitoring
and corporate intelligence, finance and stock market trading or insurance,
where companies need to track their interests and products in real time.
The focus is not on incorporating corporate data into LLMs, but rather on
solutions that use publicly available and constantly growing data. An
overarching problem to be studied is that of the cross-language and
cross-country specificities of emerging data, where novel information in
underrepresented languages or contexts may be more challenging to analyze.
We welcome insights and parallels from the field of knowledge
representation, where a similar problem with the cutoff dates of knowledge
graphs (dynamics and regular updates) is well understood.
The expected outcomes are: 1) insights into the temporal limitations of
LLMs, with the workshop outlining concrete challenges and bottlenecks in
the identified scenarios; 2) novel methodological and technical solutions,
such as (incremental) machine learning models, for reasoning about,
extracting and classifying information beyond the cutoff dates of current
LLMs.
TOPICS OF INTEREST
* Analysis of emerging topics and events, including counterfactual/what-if
reasoning
* Methods for few-shot or zero-shot learning
* Large language models for online discourse
* Large language models for corporate near real-time data analysis
* Large language models for multimodal understanding and generation
* Multilingual and cross-country emerging information extraction
* Computational journalism, disinformation spread, fact-checking and fake
news detection
* Stance and viewpoint discovery for novel information
* Detection and classification of claims within emerging narratives
* Social, ethical and legal aspects of LLM up-to-dateness
* Interpretability / explainability of computational methods beyond the
cutoff
* Linking and enrichment of data beyond the LLM cutoff
* Foundational models for knowledge graph building and entity alignment
* Recommender systems for novel information
* Quality, provenance, uncertainty and trust of emerging information and
data
* Use-cases, applications and cross-community interfaces
* Evaluation frameworks and benchmarks
SUBMISSION
We welcome the following types of contributions:
* Full papers (12-15 pages including references): contain original research.
* Short papers (up to 11 pages including references): contain original
research in progress.
* Demo papers (up to 11 pages including references): contain descriptions
of prototypes, demos or software systems.
* Data papers (up to 11 pages including references): contain descriptions
of resources related to the workshop topics, such as datasets, knowledge
graphs, corpora, annotation protocols, etc.
* Position papers (up to 11 pages including references): discuss vision
statements or research directions.
Workshop papers must be self-contained and in English. They should not have
been previously published, and should not be under consideration or review
at another workshop, conference, or journal.
Manuscripts should be submitted via EasyChair (
https://easychair.org/conferences/?conf=llmsbeyondthecut0ff) in PDF format,
using the Springer LNCS format. For full author instructions, please check
Springer’s website:
https://www.springer.com/gp/computer-science/lncs/conference-proceedings-gui....
The review of manuscripts will be double-blind. Papers will be evaluated
according to their significance, originality, technical content, style,
clarity, and relevance to the workshop. At least one author of each
accepted contribution must register for the workshop and present the paper.
Pre-prints of all contributions will be made available during the
conference. The accepted papers will appear as a volume of Springer
Nature’s LNCS post-proceedings.
For any enquiries, please contact the workshop organizers:
todorov@lirmm.fr, rettinger@uni-trier.de, jmgomez@expert.ai,
croitoru@lirmm.fr
IMPORTANT DATES
* July 29, 2024: Paper submission deadline
* August 30, 2024: Paper acceptance notification
* September 15, 2024: Camera-ready version submission
* October 25, 2024: Workshop date
All submission deadlines are end-of-day in the Anywhere on Earth (AoE) time
zone.
KEYNOTES
* TBA
AWARD
* All contributions are eligible for the "Best Paper" award
ORGANIZING COMMITTEE
* Konstantin Todorov (University of Montpellier, CNRS, LIRMM, France)
* José Manuel Gómez Pérez (Expert.ai, Spain)
* Madalina Croitoru (University of Montpellier, CNRS, LIRMM, France)
* Achim Rettinger (University of Trier, Germany)
PROGRAM COMMITTEE
* Preslav Nakov, MBZUAI, United Arab Emirates
* Serena Villata, I3S, CNRS, France
* Ronald Denaux, Amazon, USA
* Filip Ilievski, Vrije Universiteit Amsterdam, The Netherlands
* Elena Montiel, Universidad Politécnica de Madrid, Spain
* Sandra Bringay, University Paul Valéry, France
* Carlos Badenes, Universidad Politécnica de Madrid, Spain
* Ioana Manolescu, Inria Saclay, France
* Dino Ienco, INRAE, France
* Colin Porlezza, Univ. della Svizzera Italiana, Switzerland
* Katarina Boland, Heinrich Heine Universität, Germany
* Gabriella Lapesa, GESIS, Germany
* Jonas Fegert, FZI, Germany
* Michael Färber, TU-Dresden, Germany
* Salim Hafid, University of Montpellier, France
* Pavlos Fafalios, FORTH, Greece
* Andrés García Silva, Expert.ai, Spain
* Sarah Labelle, University Paul Valéry, France
* Pablo Calleja, Universidad Politécnica de Madrid, Spain
*Patricia Martín Chozas*
*Assistant Professor* at the Applied Linguistics Department
*Postdoctoral Researcher* at the Ontology Engineering Group
(Artificial Intelligence Department)
ETSI Informáticos - Universidad Politécnica de Madrid
Phone: (+34) 910673091