DLnLD: Deep Learning and Linked Data — Last Call for Papers
Workshop co-located with LREC-COLING 2024
Date: May 21, 2024
Submissions due: 9th March 2024
Venue: Torino, Italy, and online
For up-to-date information, check: https://dl-n-ld.github.io/
Call for Papers
----------------------------------------------------------------------------------------
What does Linguistic Linked Data bring to Deep Learning, and vice versa? Let’s bring together these two complementary approaches in NLP.
----------------------------------------------------------------------------------------
Motivations for the Workshop
Since the appearance of transformers (Vaswani et al., 2017), Deep Learning (DL) and neural approaches have made a huge contribution to Natural Language Processing (NLP), either through highly specialized models for specific applications or via Large Language Models (LLMs) (Devlin et al., 2019; Brown et al., 2020; Touvron et al., 2023) that are efficient few-shot learners for many NLP tasks. Such models usually build on huge web-scale data (raw multilingual corpora and annotated, specialized, task-related corpora) that are now widely available on the Web. This approach has clearly shown many successes, but it still suffers from several weaknesses, such as the cost and impact of training on raw data, biases, hallucinations, and limited explainability, among others (Nah et al., 2023).
The Linguistic Linked Open Data (LLOD) (Chiarcos et al., 2013) community aims at creating/distributing explicitly structured data (modelled as RDF graphs) and interlinking such data across languages. This collection of datasets, gathered inside the LLOD Cloud (Chiarcos et al., 2020), contains a huge amount of multilingual ontological (e.g. DBpedia (Lehmann et al., 2015)); lexical (e.g., DBnary (Sérasset, 2015), Wordnet (McCrae et al., 2014), Wikidata (Vrandečić and Krötzsch, 2014)); or linguistic (e.g., Universal Dependencies Treebank (Nivre et al., 2020; Chiarcos et al., 2021), DBpedia Abstract Corpus (Brümmer et al., 2016)) information, structured using common metadata (e.g., OntoLex (McCrae et al., 2017), NIF (Hellmann et al., 2013), etc.) and standardised data categories (e.g., lexinfo (Cimiano et al., 2011), OliA (Chiarcos and Sukhareva, 2015)).
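As a purely illustrative aside (not part of this call), the sketch below shows how such structured lexical data can be retrieved with SPARQL over OntoLex entries, assuming DBnary's public endpoint at http://kaiko.getalp.org/sparql and the SPARQLWrapper Python library; endpoint, query, and language filter are assumptions chosen for the example.

    # Minimal sketch: querying OntoLex lexical entries from DBnary (assumed endpoint).
    # Requires `pip install SPARQLWrapper`.
    from SPARQLWrapper import SPARQLWrapper, JSON

    ENDPOINT = "http://kaiko.getalp.org/sparql"  # assumed public DBnary endpoint

    QUERY = """
    PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
    SELECT ?entry ?form WHERE {
      ?entry a ontolex:LexicalEntry ;
             ontolex:canonicalForm/ontolex:writtenRep ?form .
      FILTER(lang(?form) = "fr")
    } LIMIT 10
    """

    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)
    for binding in sparql.query().convert()["results"]["bindings"]:
        print(binding["entry"]["value"], binding["form"]["value"])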
Both communities bring striking contributions that appear highly complementary. However, while knowledge (ontological) graphs are now routinely used in DL, there is still very little research studying the value of linguistic/lexical knowledge in the context of DL. We think that, today, there is a real opportunity to bring both communities together to take the best of both worlds. Indeed, with more and more work on Graph Neural Networks (Wu et al., 2023) and embeddings on RDF graphs (Ristoski et al., 2019), there are more and more opportunities to apply DL techniques to build, interlink, or enhance Linguistic Linked Open Datasets, to borrow data from the LLOD Cloud to enhance neural models on NLP tasks, or to take the best of both worlds for specific NLP use cases.
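To make the last point more tangible, here is a minimal sketch of training knowledge-graph embeddings over a handful of toy lexical triples with the pykeen library. This is a generic illustration rather than the RDF2Vec approach of Ristoski et al. (2019); the triples, prefixes, and hyper-parameters are invented, and a real experiment would use a proper train/test split over an actual LLOD extract.

    # Illustrative only: knowledge-graph embeddings over toy "lexical" triples.
    # Requires `pip install pykeen` (a recent pykeen API is assumed).
    import numpy as np
    from pykeen.triples import TriplesFactory
    from pykeen.pipeline import pipeline

    # Invented triples standing in for a tiny LLOD extract.
    triples = np.array([
        ("dbnary:cat_en", "ontolex:translation", "dbnary:chat_fr"),
        ("dbnary:dog_en", "ontolex:translation", "dbnary:chien_fr"),
        ("dbnary:cat_en", "lexinfo:partOfSpeech", "lexinfo:noun"),
        ("dbnary:dog_en", "lexinfo:partOfSpeech", "lexinfo:noun"),
    ], dtype=str)

    tf = TriplesFactory.from_labeled_triples(triples)

    # Train a TransE model; for a real study, split the triples properly.
    result = pipeline(
        training=tf, testing=tf, model="TransE",
        model_kwargs=dict(embedding_dim=16),
        training_kwargs=dict(num_epochs=20),
        random_seed=0,
    )
    entity_vectors = result.model.entity_representations[0]().detach()
    print(entity_vectors.shape)  # one 16-dimensional vector per entity

Such entity vectors could then be fed, for instance, to a downstream neural NLP model; this is exactly the kind of bridge between the two communities that the workshop wishes to discuss.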
Submission Topics
This workshop aims at gathering researchers who work on the interaction between DL and LLOD in order to discuss what each approach can bring to the other. To this end, we welcome contributions on original work involving some of the following (non-exhaustive) topics:
• Deep Learning for Linguistic Linked Data, among which (but not exclusively):
  • Modelling, Resources & Interlinking
  • Relation Extraction
  • Corpus annotation
  • Ontology localization
  • Knowledge/Linguistic Graphs creation or expansion
• Linguistic Linked Data for Deep Learning, among which (but not exclusively):
  • Linguistic/Knowledge Graphs as training data
  • Fine tuning LLMs using Linguistic Linked (meta)Data
  • Graph Neural Networks
  • Knowledge/Linguistic Graphs embeddings
  • LLOD for model explainability/sourcing
  • Neural models for under-resourced languages
• Joint Deep Learning and Linguistic Data applications
  • Use cases combining Language Models and Structured Linguistic Data
  • LLOD and DL for Digital Humanities
  • Question-Answering on graph data

All application domains (Digital Humanities, FinTech, Education, Linguistics, Cybersecurity…) as well as approaches (NLG, NLU, Data Extraction…) are welcome, provided that the work is based on the use of BOTH Deep Learning techniques and Linguistic Linked (meta)Data.
Important Dates
All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”)
• Submissions due: 9th March 2024 (hard deadline: there will be no extension)
• Notification of acceptance: 2nd April 2024
• Camera-ready due: 12th April 2024
Authors kit
All papers must follow the LREC-COLING 2024 two-column format, using the supplied official style files. The templates can be downloaded from the Style Files and Formatting page provided on the website. Please do not modify these style files, nor should you use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review.
LREC-COLING 2024 Author’s Kit Page: https://lrec-coling-2024.org/authors-kit/
Paper submission
Submission is electronic at https://softconf.com/lrec-coling2024/dlnld2024/
Workshop Chairs
• Gilles Sérasset, Université Grenoble Alpes, France
• Hugo Gonçalo Oliveira, University of Coimbra, Portugal
• Giedre Valunaite Oleskeviciene, Mykolas Romeris University, Lithuania
Program Committee
• Mehwish Alam, Télécom Paris, Institut Polytechnique de Paris, France
• Russa Biswas, Hasso Plattner Institute, Potsdam, Germany
• Milana Bolatbek, Al-Farabi Kazakh National University, Kazakhstan
• Michael Cochez, Vrije Universiteit Amsterdam, Netherlands
• Milan Dojchinovski, Czech Technical University in Prague, Czech Republic
• Basil Ell, University of Oslo, Norway
• Robert Fuchs, University of Hamburg, Germany
• Radovan Garabík, L’. Štúr Institute of Linguistics, Slovak Academy of Sciences, Slovakia
• Daniela Gifu, Romanian Academy, Iasi branch & Alexandru Ioan Cuza University of Iasi, Romania
• Katerina Gkirtzou, Athena Research Center, Maroussi, Greece
• Jorge Gracia del Río, University of Zaragoza, Spain
• Dagmar Gromann, University of Vienna, Austria
• Dangis Gudelis, Mykolas Romeris University, Lithuania
• Ilan Kernerman, Lexicala by K Dictionaries, Israel
• Chaya Liebeskind, Jerusalem College of Technology, Israel
• Marco C. Passarotti, Università Cattolica del Sacro Cuore, Milan, Italy
• Heiko Paulheim, University of Mannheim, Germany
• Alexandre Rademaker, IBM Research Brazil and EMAp/FGV, Brazil
• Georg Rehm, DFKI GmbH, Berlin, Germany
• Harald Sack, Karlsruhe Institute of Technology, Karlsruhe, Germany
• Didier Schwab, Université Grenoble Alpes, France
• Ranka Stanković, University of Belgrade, Serbia
• Andon Tchechmedjiev, IMT Mines Alès, France
• Dimitar Trajanov, Ss. Cyril and Methodius University – Skopje, Macedonia
• Ciprian-Octavian Truică, POLITEHNICA Bucharest, Romania
• Nicolas Turenne, Guangdong University of Foreign Studies, China
• Slavko Žitnik, University of Ljubljana, Slovenia
Conspiracy theories are complex narratives that attempt to explain the ultimate causes of significant events as covert plots orchestrated by secret, powerful, and malicious groups. A challenging aspect of identifying conspiracy theories stems from the difficulty of distinguishing critical thinking from conspiratorial thinking. This distinction is vital because labeling a message as conspiratorial when it is only oppositional could drive those who were simply asking questions into the arms of the conspiracy communities.
At PAN 2024 we aim at analyzing texts that reflect oppositional thinking and contain either conspiracy or critical narratives: https://pan.webis.de/clef24/pan24-web/oppositional-thinking-analysis.html
The task will address two new challenges for the research community: (1) to distinguish the conspiracy narrative from other oppositional narratives that do not express a conspiracy mentality (i.e., critical thinking); and (2) to identify in online messages the key elements of a narrative that fuels the intergroup conflict in oppositional thinking. To this end we provide two Telegram text datasets, one English and one Spanish, and we propose two sub-tasks:
1. Distinguishing between critical and conspiracy texts: a binary classification task aimed at differentiating between critical messages that question major decisions in the public health domain but do not promote a conspiracist mentality, and messages that view the pandemic or public health decisions as the result of a malevolent conspiracy by secret, influential groups.
2. Detecting elements of the oppositional narratives: a token-level classification task aimed at recognizing text spans corresponding to the key elements of oppositional narratives. Each annotation corresponds to a narrative element and is described by its span and its category (there are six distinct span categories: AGENT, FACILITATOR, VICTIM, CAMPAIGNER, OBJECTIVE, NEGATIVE_EFFECT).
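For readers unfamiliar with span-level annotation, the snippet below is a purely hypothetical illustration of what an annotated message for sub-task 2 might look like; the actual distribution format, field names, offsets, and category assignments are defined by the organisers and the dataset documentation on Zenodo, not here.

    # Hypothetical illustration of a span-annotated message (sub-task 2).
    # The real dataset format is defined by the PAN organisers; the field names,
    # text, offsets, and category assignments below are invented.
    example = {
        "id": "msg-0001",
        "text": "They are hiding the truth to protect the big labs.",
        "annotations": [
            {"category": "AGENT", "start": 0, "end": 4},          # "They"
            {"category": "FACILITATOR", "start": 41, "end": 49},  # "big labs"
        ],
    }

    # Sanity check: the character offsets should recover the annotated spans.
    for ann in example["annotations"]:
        print(ann["category"], "->", example["text"][ann["start"]:ann["end"]])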
We propose the task in both English and Spanish. Although we recommend participating in both languages, it is possible to address the problem in just one language.
The training dataset (where the authors have been anonymised and neutral labels have been used) can be requested via Zenodo: https://zenodo.org/records/10680586
IMPORTANT DATES:
- February 20, 2024: Train data release
- May 30, 2024: Software submission deadline
- June 15, 2024: Participant paper submission (midnight CEST)
- July 1, 2024: Peer review notification
- July 8, 2024: Camera-ready participant paper submission
- September 9-12, 2024: CLEF conference, https://clef2024.imag.fr/
Paolo Rosso, co-organiser of the PAN task on Oppositional thinking analysis
Apologies: the PAN 2024 task on disinformation detection is on conspiracy theories and not on profiling fake news spreaders, as wrongly announced a month ago (that was the task at PAN 2020).
Paolo Rosso