We are delighted to invite you to an afternoon of public lectures on language technology and its interaction with society. The lectures will take place in Auditorium 4 at the IT University of Copenhagen on 13 June 2023, from 14:30 to 16:45 (UTC+2). Attendance is free for anyone interested.
The lectures will also be streamed on Zoom. For remote participation, please register here:
https://itucph.zoom.us/meeting/register/u5Utf-Chrz0sHNYzd76DD9WjJpOU0CckVsMb
Speakers and titles:
- Luca Maria Aiello, IT University of Copenhagen: The Language of Coordination
- Shashi Narayan, Google London: Introducing Text-blueprint: Conditional generation with question-answering plans
- Anne Lauscher, University of Hamburg: Ethical conversational AI - Searching for the truth?
Abstracts and further information can be found here:
https://christianhardmeier.rax.ch/workshop/langtech-society-2023/
--
Christian Hardmeier, Associate Professor – https://christianhardmeier.rax.ch/
IT University of Copenhagen, Department of Computer Science
CoCo4MT Shared Task: First Call for Participation
We are excited to introduce a new shared task for this year’s CoCo4MT
workshop! Our aim is to encourage and facilitate research on corpus
construction for low-resource machine translation.
Corpus creation for machine translation is typically constrained by the
cost and availability of human translators. When a new dataset needs to be
created for a low-resource language or a specialized domain, the annotation
budget should be used efficiently and any sentences chosen for translation
should be of high quality and as useful for machine translation system
training as possible.
In this shared task, we ask participants to come up with ways in which such
examples can be identified for a target language without any existing data.
Specifically, given a parallel corpus between high-resource languages, the
goal is to choose a good subset of the high-resource corpus to be
translated into the low-resource language, in order to obtain a good
training set for a machine translation system. The shared task winner will
be the team whose instances result in the best final system after training.
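As a rough illustration of what a selection strategy could look like, the sketch below greedily picks source sentences that add the most unseen word types until a translation budget is used up. It is not one of the official baselines; the function name `select_by_coverage`, the whitespace tokenisation, and the sentence-count budget are all assumptions made for this example.

```python
def select_by_coverage(sentences, budget):
    """Greedily choose up to `budget` sentence indices that maximise
    the number of new (lower-cased, whitespace-split) word types.

    Hypothetical helper for illustration only.
    """
    # Represent each sentence as its set of word types.
    tokenised = [set(s.lower().split()) for s in sentences]
    covered = set()                       # word types already selected
    remaining = list(range(len(sentences)))
    chosen = []
    while remaining and len(chosen) < budget:
        # Pick the sentence contributing the most unseen word types;
        # ties go to the earliest sentence in the corpus.
        best = max(remaining, key=lambda i: len(tokenised[i] - covered))
        chosen.append(best)
        covered |= tokenised[best]
        remaining.remove(best)
    return chosen


corpus = ["the cat sat", "the cat", "a dog barked loudly", "the cat sat"]
print(select_by_coverage(corpus, budget=2))
```

A real submission would likely replace the coverage score with something better informed, for example model uncertainty or domain similarity, but the budget-constrained loop stays the same.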
Detailed information: https://sites.google.com/view/coco4mt/shared-task
Registration: https://forms.gle/jfKSPQMKEmaaXFHy5
Important Dates
- May 19, 2023: Release of train, dev and test data
- May 30, 2023: Release of baselines
- July 12, 2023: Deadline to submit results
- July 20, 2023: System description papers due
Organizers (listed alphabetically)
- Ananya Ganesh, University of Colorado Boulder
- Constantine Lignos, Brandeis University
- John E. Ortega, Northeastern University
- Jonne Sälevä, Brandeis University
- Katharina Kann, University of Colorado Boulder
- Marine Carpuat, University of Maryland
- Rodolfo Zevallos, Universitat Pompeu Fabra
- Shabnam Tafreshi, University of Maryland
- William Chen, Carnegie Mellon University
--
Dr. Katharina Kann
Assistant Professor of Computer Science
University of Colorado Boulder
Personal page: https://kelina.github.io
Group page: https://nala-cub.github.io
==================================================
*CFP: ML/NLP Competition on Automatic Classification of Literary Epochs
(CoLiE)*
To advance the field of implicit temporal information retrieval from a
text, this competition aims to challenge participants to develop automatic
methods to identify the literary epochs of a given text, which is
considered here as an implicit temporal context of a book.
The task on Automatic Classification of Literary Epochs (CoLiE) aims at
automatic identification of the literary epoch of a given text from its
writing style: (1) Romanticism (1798-1837), (2) Victorian Literature
(1837-1901), (3) Modernism (1900-1945), (4) Postmodernism (1945-2000), and
(5) the present day (from 2000).
The competition is held as a part of the IACT’23
<https://en.sce.ac.il/news/iact23> workshop, held on July 27, 2023, in
conjunction with the 46th International ACM SIGIR Conference on Research
and Development in Information Retrieval.
This competition is open to anyone with a passion for information
retrieval, machine learning, and natural language processing. Whether you
are a seasoned expert or a newcomer to the field, we welcome you to
participate and extend the boundaries of automated text analysis!
Competition site: http://www.kaggle.com/competitions/colie
Competition Timeline
- May 28, 2023: The competition is open to participants. Training and
validation sets together with their labels are available.
- July 10, 2023: Test dataset available.
- July 17, 2023, 23:59 UTC: Final submission deadline.
- July 27, 2023: The winners are announced at the special session at the
IACT'23 <https://en.sce.ac.il/news/iact23> workshop.
*The organizing team*
- Dr. Marina Litvak (marinal(a)ac.sce.ac.il),
Software Engineering Department,
Shamoon College of Engineering, Beer Sheva,
84100, Israel
- Dr. Irina Rabaev (irinar(a)ac.sce.ac.il),
Software Engineering Department,
Shamoon College of Engineering, Beer Sheva,
84100, Israel
- Prof. Ricardo Campos (ricardo.campos(a)ipt.pt),
Ci2 - Smart Cities Research Center, Polytechnic Institute of Tomar
INESC TEC, Porto
Porto, Portugal
- Prof. Alípio Mário Jorge (amjorge(a)fc.up.pt)
University of Porto
Porto, Portugal
- Prof. Adam Jatowt (adam.jatowt(a)uibk.ac.at)
University of Innsbruck,
Innsbruck, Austria
- Mr. Vladimir Younkin (vladiyo(a)ac.sce.ac.il),
Software Engineering Department,
Shamoon College of Engineering, Beer Sheva,
84100, Israel
--
Best regards,
Marina Litvak
On behalf of Dr. Elliot Crowley from the School of Engineering at the University of Edinburgh (queries at elliot.crowley(a)ed.ac.uk):
Application link: Affordable Training of Large Language Models<https://www.eng.ed.ac.uk/studying/postgraduate/research/phd/affordable-trai…>
Recent developments in large language models (LLMs) have caught the attention of the public. LLMs such as OpenAI's GPT-4 and Google's Bard are able to generate remarkably realistic, coherent text based on a user's input and have the potential to be general-purpose tools used throughout society e.g. for customer service, summarising text, answering questions, writing contracts or translating between languages.
However, LLMs are prohibitively expensive to train. GPT-3 (which is significantly smaller than its successor, GPT-4) has an estimated training time of 355 GPU-years and an estimated training cost of $4.6M [1]. Only large, wealthy institutions can train these models and thereby control how they are trained and who gets access to them. This is undemocratic.
Very recent work provides hope, however. In [2] the authors explore the promising idea of “cramming”: training an LLM on a single GPU in a day. In [3] the authors use synthetic data to train “small” language models that can produce consistent stories at little cost. There is still a huge discrepancy in quality between these models and their expensive counterparts.
In this PhD, the student will investigate affordable LLM training, i.e. training with limited compute and/or data, inspired by [2,3]. Avenues of research could include (i) generating training data that facilitates fast training, e.g. through dataset distillation [4]; (ii) exploring neural architecture search to develop models that are "aware" of being resource-constrained while being trained; (iii) developing novel cost-effective training algorithms; (iv) leveraging and tuning open-source LLMs.
The successful student will have opportunities for collaboration within and outside Edinburgh’s School of Engineering e.g. with colleagues in the Institute for Digital Communications<https://www.eng.ed.ac.uk/research/institutes/idcom/>, The Bayesian and Neural Systems Group<https://www.bayeswatch.com/>, and Edinburgh NLP<https://edinburghnlp.inf.ed.ac.uk/>.
[1] https://lambdalabs.com/blog/demystifying-gpt-3
[2] https://arxiv.org/abs/2212.14034
[3] https://arxiv.org/abs/2305.07759
[4] https://arxiv.org/abs/1811.10959
*** Final Call for Submissions ***
10th European Conference On Service-Oriented And Cloud Computing (ESOCC 2023)
October 24-26, 2023, Golden Bay Beach Hotel, Larnaca, Cyprus
https://cyprusconferences.org/esocc2023/
(Proceedings to be published in Springer LNCS;
Journal Special Issue with Springer Computing)
Submission Deadline: Abstracts by June 25, 2023; Full Papers by July 2, 2023
AIM AND SCOPE
Nowadays, Service-Oriented and Cloud Computing are the primary approaches to build
large-scale distributed systems and deliver software services to end users. Cloud-native
software is pervading the delivery of enterprise applications, as they are composed of
(micro)services that can be independently developed and deployed by exploiting multiple
heterogeneous technologies. Resulting applications are polyglot service compositions that can
then be shipped in serverful or serverless platforms (e.g., using virtualization technologies).
These characteristics make Service-Oriented and Cloud Computing the natural answers for
fulfilling the industry’s need for flexibly scalable and maintainable enterprise applications, to
be delivered through state-of-the-art methodologies, like DevOps. To further support this,
researchers and practitioners need to create methods, tools and techniques to support
cost-effective and secure development as well as use of dependable devices, platforms,
services and service-oriented applications in the Cloud, now also considering the Cloud-IoT
computing continuum to exploit widespread adoption of smart connected things and the
increasing growth of their computing capabilities.
The European Conference on Service-Oriented and Cloud Computing (ESOCC) is the premier
conference on advances in the state-of-the-art and practice of Service-Oriented Computing
and Cloud Computing in Europe. ESOCC aims to facilitate the exchange between researchers
and practitioners in the areas of Service-Oriented Computing and Cloud Computing, as well as
to explore the new trends in those areas and foster future collaborations in Europe and beyond.
TOPICS OF INTEREST
ESOCC 2023 seeks original, high-quality contributions related to all aspects of Service-Oriented
and Cloud computing. Specific topics of interest include, but are not limited to:
• Applications for Service-Oriented and Cloud Computing, e.g., big data, commerce, energy, finance, health, scientific computing, smart cities
• Blockchains for Service-Oriented and Cloud Computing
• Business aspects of Service-Oriented and Cloud Computing, e.g., business models, brokerage, marketplaces, costs, pricing
• Business processes, e.g., service-based workflow deployment and management
• Cloud interoperability, service and Cloud standards
• Cloud-IoT computing continuum, e.g., edge computing, fog computing, mobility computing, next generation services/IoT
• Cloud-native architectures and paradigms, e.g., microservices and DevOps
• Cloud service models, e.g., IaaS, PaaS, SaaS, DBaaS, FaaS, etc.
• Deployment, composition, and management of applications in Service-Oriented and Cloud Computing
• Foundations and formal methods for Service-Oriented and Cloud Computing
• Enablers for Service-Oriented and Cloud Computing, e.g., service discovery, orchestration, matchmaking, monitoring, and analytics
• Model-Driven Engineering for Service-Oriented and Cloud Computing
• Multi-Cloud, cross-Cloud, and federated Cloud solutions
• Requirements engineering, design, development, and testing of applications in Service-Oriented and Cloud Computing
• Semantic services and service mining
• Service and Cloud middlewares and platforms
• Software/service adaptation and evolution in Service-Oriented and Cloud Computing
• Storage, computation and network Clouds
• Sustainability and energy issues in Service-Oriented and Cloud Computing
• Quality aspects (e.g., governance, privacy, security, and trust) of Service-Oriented and Cloud Computing
• Quality of Service (QoS) and Service-Level Agreement (SLA) for Service-Oriented and Cloud Computing
• Social aspects of Service-Oriented and Cloud Computing, e.g., crowdsourcing services, social and crowd-based Clouds
• Virtualization for Service-Oriented and Cloud Computing, e.g., serverless, container-based virtualization, VMs
IMPORTANT DATES
• Submission of abstracts: June 25th, 2023 (AoE)
• Submission of full papers: July 2nd, 2023 (AoE)
• Notification to authors: August 4th, 2023 (AoE)
• Camera-ready versions due: August 21st, 2023 (AoE)
• Author registration due: August 21st, 2023 (AoE)
TYPES OF SUBMISSIONS
ESOCC 2023 invites submissions of the following kinds:
• Regular Research Papers (15 pages including references)
• PhD Symposium (12 pages including references)
• Projects and Industry Reports (1 to 6 pages including references, describing an ongoing EU or national project, or providing industrial perspectives on innovative applications, technologies, or methods in ESOCC’s scope)
We only accept original papers, not submitted for publication elsewhere. The papers must be
formatted according to the proceedings guidelines of Springer’s Lecture Notes in Computer
Science (LNCS) series (http://www.springer.com/lncs).
They must be submitted to the EasyChair site at: https://easychair.org/conferences/?conf=esocc2023 by selecting the right track.
Accepted papers from all tracks will be published in the main conference proceedings by
Springer in the LNCS series. For publication to happen, at least one author of each accepted
paper is expected to register and present the work at the conference.
The best papers accepted will be invited to submit extended versions for a Journal Special
Issue to be published by Springer Computing.
ORGANISATION
General Chair
• George A. Papadopoulos, University of Cyprus, CY
(george at ucy.ac.cy)
Program Chairs
• Florian Rademacher, University of Applied Sciences and Arts Dortmund, DE
(florian.rademacher at fh-dortmund.de)
• Jacopo Soldani, University of Pisa, IT
(jacopo.soldani at unipi.it)
Steering and Program Committee
https://cyprusconferences.org/esocc2023/committees/
The University of Bologna is offering *a funded residential bootcamp*:
*"Theories and methods for the corpus-assisted analysis of discourse: from
language that denotes to language that expresses phenomena". *
20 participants
*27-30 September (Bertinoro - Italy)*
*Deadline for application 19/06/2023*
#cadscamp
All info:
https://centri.unibo.it/colitec/en/events/bootcamp-28-30-september-2023
Best,
Anna
Call for Participation:
*FinCausal-2023 Shared Task: “Financial Document Causality Detection” *is
organised within the *5th Financial Narrative Processing Workshop (FNP
2023)* taking place in the 2023 IEEE International Conference on Big Data
(IEEE BigData 2023) <http://bigdataieee.org/BigData2023/>, Sorrento, Italy,
15-18 December 2023. It is a *one-day event*. The exact date is to be
announced.
Important Dates:
- Call for participation and registration: 3rd June 2023
- Registration deadline: 28 June
- Training set release: 29 June 2023
- Test set release: 5 September 2023
- Systems submission deadline: 15 September 2023
- Release of results: 20 September 2023
- Paper submission deadline: 20 October 2023
- Notification of acceptance: November 12, 2023
- Camera-ready of accepted papers: November 20, 2023
- FNP Workshop: December 2023
Workshop URL: https://wp.lancs.ac.uk/cfie/fincausal2023/
Registration Form: https://forms.gle/29E161a8RmMosBLU8. After completing
the registration form, the practice set will be sent to participants.
*Shared Task Description:*
Financial analysis needs factual data and an explanation of the variability
of these data. Data state facts but need more knowledge regarding how these
facts materialised. Furthermore, understanding causality is crucial in
studying decision-making processes.
The *Financial Document Causality Detection Task* (FinCausal) aims at
identifying elements of cause and effect in causal sentences extracted from
financial documents. Its goal is to evaluate which events or chain of
events can cause a financial object to be modified or an event to occur,
regarding a given context. In the financial landscape, identifying cause
and effect from external documents and sources is crucial to explain why a
transformation occurs.
Two subtasks are organised this year: the *English FinCausal subtask* and the
*Spanish FinCausal subtask*. This is the first year in which we introduce a
subtask in Spanish.
*Objective*: For both tasks, participants are asked to identify, given a
causal sentence, which elements of the sentence relate to the cause, and
which relate to the effect. Participants can use any method they see fit
(regex, corpus linguistics, entity relationship models, deep learning
methods) to identify the causes and effects.
*English FinCausal subtask*
- *Data Description: *The dataset has been sourced from various 2019
financial news articles provided by Qwam, along with additional SEC data
from the Edgar Database. Additionally, we have augmented the dataset from
FinCausal 2022, adding 500 new segments. Participants will be provided with
a sample of text blocks extracted from financial news and already labelled.
- *Scope: *The* English FinCausal subtask* focuses on detecting causes
and effects when the effects are quantified. The aim is to identify, in
a causal sentence or text block, the causal elements and the consequential
ones. Only one causal element and one effect are expected in each segment.
- *Length of Data fragments: *The* English FinCausal subtask* segments
are made up of up to three sentences.
- *Data format: *CSV files. Datasets for both the English and the
Spanish subtasks will be presented in the same format.
This shared task focuses on determining causality associated with a
quantified fact. An event is defined as the arising or emergence of a new
object or context regarding a previous situation. So, the task will
emphasise the detection of causality associated with the transformation of
financial objects embedded in quantified facts.
*Spanish FinCausal subtask*
- *Data Description: *The dataset has been sourced from a corpus of
Spanish financial annual reports from 2014 to 2018. Participants will be
provided with a sample of text blocks extracted from financial news,
labelled through inter-annotator agreement.
- *Scope: *The *Spanish FinCausal subtask* aims to detect all types of
causes and effects, not necessarily limited to quantified effects. The
aim is to identify, in a paragraph, the causal elements and the
consequential ones. Only one causal element and one effect are expected in
each paragraph.
- *Length of Data fragments: *The *Spanish FinCausal subtask* involves
complete paragraphs.
- *Data format: *CSV files. Datasets for both the English and the
Spanish subtasks will be presented in the same format.
This shared task focuses on determining causality associated with either
events or quantified facts. For this task, a cause can be the justification
for a statement or the reason that explains a result. This task is also a
relation detection task.
*FinCausal Shared Task Organisers:*
- Antonio Moreno-Sandoval (UAM, Spain)
- Blanca Carbajo Coronado (UAM, Spain)
- Doaa Samy (UCM, Spain)
- Jordi Porta (UAM, Spain)
- Dominique Mariko (Yseop, France)
For any questions, please contact the organisers at *fincausal.2023(a)gmail.com*
8th Symposium on Corpus Approaches to Lexicogrammar (LxGr2023)
The symposium will take place online on 6-8 July 2023.
The programme and registration details are here: https://sites.edgehill.ac.uk/lxgr/lxgr2023
For more information, contact lxgr(a)edgehill.ac.uk.
Dear colleagues,
We are happy to invite you to join the *Arabic NER SharedTask 2023*
<https://dlnlp.ai/st/wojood/> which will be organized as part of the WANLP
2023. We will provide you with a large corpus and Google Colab notebooks to
help you reproduce the baseline results.
Invitation to participate in the competition on extracting named entities
from Arabic texts. We will provide participants with a corpus and software
for obtaining baseline results that they can build on.
*INTRODUCTION*
Named Entity Recognition (NER) is integral to many NLP applications. It is
the task of identifying named entity mentions in unstructured text and
classifying them to predefined classes such as person, organization,
location, or date. Due to the scarcity of Arabic resources, most of the
research on Arabic NER focuses on flat entities and addresses a limited
number of entity types (person, organization, and location). The goal of
this shared task is to alleviate this bottleneck by providing Wojood, a
large and rich Arabic NER corpus. Wojood consists of about 550K tokens (MSA
and dialect, in multiple domains) that are manually annotated with 21
entity types.
*REGISTRATION*
Participants need to register via this form (
*https://forms.gle/UCCrVNZ2LaPviCZS6* <https://forms.gle/UCCrVNZ2LaPviCZS6>).
Participating teams will be provided with common training and development
datasets. No external manually labelled datasets are allowed. A blind test
set will be used to evaluate the output of the participating teams.
Each team is allowed a maximum of 3 submissions. All teams are required to
report on the development and test sets (after results are announced) in
their write-ups.
*FAQ*
For any questions related to this task, please check our *Frequently Asked
Questions*
<https://docs.google.com/document/d/1XE2n89mFLic2P9DO_sAD51vy734BOt0kgtZ6bFf…>
*IMPORTANT DATES*
- March 03, 2023: Registration available
- May 25, 2023: Data-sharing and evaluation on development set available
- June 10, 2023: Registration deadline
- July 20, 2023: Test set made available
- July 30, 2023: Evaluation on test set (TEST) deadline
- August 29, 2023: Shared task system paper submissions due
- October 12, 2023: Notification of acceptance
- October 30, 2023: Camera-ready version
- TBA: WANLP 2023 Conference.
** All deadlines are 11:59 PM UTC-12:00 (Anywhere On Earth).*
*CONTACT*
For any questions related to this task, please contact the organizers
directly using the following email address: *NERShare...(a)gmail.com
<https://groups.google.com/>* or join the google group:
*https://groups.google.com/g/ner_sharedtask2023*
<https://groups.google.com/g/ner_sharedtask2023>.
*SHARED TASK*
As described, this shared task targets both flat and nested Arabic NER. The
subtasks are:
*Subtask 1:* *Flat NER*
In this subtask, we provide the Wojood-Flat train (70%) and development
(10%) datasets. The final evaluation will be on the test set (20%). The
flat NER dataset is the same as the nested NER dataset in terms of
train/test/dev split, and each split contains the same content. The only
difference is that in flat NER each token is assigned a single tag, which is
the first high-level tag assigned to that token in the nested NER dataset.
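The relationship between the two label sets can be sketched as follows. The in-memory layout used here (one list of tags per token, with the high-level tag first) is an assumption for illustration and may not match the actual Wojood file format; `flatten_tags` is a hypothetical helper name.

```python
def flatten_tags(nested_tags):
    """Derive flat NER tags by keeping only the first tag of each token.

    `nested_tags` is assumed to be a list with one entry per token,
    each entry being a list of that token's tags, high-level tag first.
    Tokens with no tags are labelled "O".
    """
    return [tags[0] if tags else "O" for tags in nested_tags]


# Toy example with made-up tags for a three-token sentence.
nested = [["B-ORG", "B-GPE"], ["I-ORG"], ["O"]]
print(flatten_tags(nested))
```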
*Subtask 2:* *Nested NER*
In this subtask, we provide the Wojood-Nested train (70%) and development
(10%) datasets. The final evaluation will be on the test set (20%).
*METRICS*
The evaluation metrics will include precision, recall, and F1-score. However,
our official metric will be the micro F1-score.
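For orientation, a micro-averaged F1 pools true positives, false positives, and false negatives over all entity types before computing precision and recall. The sketch below is a minimal version under the assumption that entities are represented as (start, end, type) spans and matched exactly; the official CODALAB scorer may differ in detail, and `micro_prf` is a hypothetical name.

```python
def micro_prf(gold, pred):
    """Micro-averaged precision, recall and F1 over entity spans.

    `gold` and `pred` are lists with one set per sentence; each set
    holds (start, end, type) tuples. A predicted span counts as correct
    only if it matches a gold span exactly (an assumption of this sketch).
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # spans found in both gold and prediction
        fp += len(p - g)   # predicted spans not in gold
        fn += len(g - p)   # gold spans that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy example: one entity is given the wrong type in the first sentence.
gold = [{(0, 1, "PERS"), (3, 4, "ORG")}, {(0, 2, "GPE")}]
pred = [{(0, 1, "PERS"), (3, 4, "GPE")}, {(0, 2, "GPE")}]
print(micro_prf(gold, pred))
```

Because counts are pooled before averaging, frequent entity types weigh more heavily in the final score than rare ones.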
The evaluation of shared tasks will be hosted through CODALAB. Teams will
be provided with a CODALAB link for each shared task.
-*CODALAB link for NER Shared Task Subtask 1 (Flat NER)*
<https://codalab.lisn.upsaclay.fr/competitions/11594>
-*CODALAB link for NER Shared Task Subtask 2 (Nested NER)*
<https://dlnlp.ai/st/wojood/>
*BASELINES*
Two baseline models trained on Wojood (flat and nested) are provided:
*Nested NER baseline:* is presented in this *article*
<https://aclanthology.org/2022.lrec-1.387/>, and code is available in
*GitHub* <https://github.com/SinaLab/ArabicNER>. The model achieves a micro
F1-score of 0.9059 (note that this baseline does not handle nested entities
of the same type).
*Flat NER baseline:* the same code repository as for nested NER (*GitHub*
<https://github.com/SinaLab/ArabicNER>) can also be used to train a model for
the flat NER task. Our flat NER baseline achieved a micro F1-score of 0.8785.
*GOOGLE COLAB NOTEBOOKS*
To allow you to experiment with the baseline, we authored four Google Colab
notebooks that demonstrate how to train and evaluate our baseline models.
[1] *Train Flat NER*
<https://gist.github.com/mohammedkhalilia/72c3261734d7715094089bdf4de74b4a>:
This notebook can be used to train our ArabicNER model on the flat NER task
using the sample Wojood data found in our repository.
[2] *Evaluate Flat NER*
<https://gist.github.com/mohammedkhalilia/c807eb1ccb15416b187c32a362001665>:
this notebook will use the trained model saved from the notebook above to
perform evaluation on an unseen dataset.
[3] *Train Nested NER*
<https://gist.github.com/mohammedkhalilia/a4d83d4e43682d1efcdf299d41beb3da>:
This notebook can be used to train our ArabicNER model on the nested NER
task using the sample Wojood data found in our repository.
[4] *Evaluate Nested NER*
<https://gist.github.com/mohammedkhalilia/9134510aa2684464f57de7934c97138b>:
this notebook will use the trained model saved from the notebook above to
perform evaluation on an unseen dataset.
*ORGANIZERS*
- Mustafa Jarrar, Birzeit University
- Muhammad Abdul-Mageed, University of British Columbia & MBZUAI
- Mohammed Khalilia, Birzeit University
- Bashar Talafha, University of British Columbia
- AbdelRahim Elmadany, University of British Columbia
- Nagham Hamad, Birzeit University
- Alaa Omer, Birzeit University
*The 5th Financial Narrative Processing Workshop (FNP 2023)*
To be held at the 2023 IEEE International Conference on Big Data (IEEE
BigData 2023), Sorrento, Italy, from 15 to 18 December 2023.
FNP 2023: http://wp.lancs.ac.uk/cfie/fnp2023/
*Submission page:*
https://wi-lab.com/cyberchair/2023/bigdata23/scripts/submit.php?subarea=S14…
*Important Dates:*
1st Call for workshop papers: June 1, 2023
2nd Call for workshop papers: August 15, 2023
Final Call for workshop papers: October 1, 2023
Due date for workshop papers submission: October 30, 2023 (anywhere in the
world)
Notification of paper acceptance to authors: November 12, 2023
Camera-ready of accepted papers: November 20, 2023
Workshop date: one-day event during December 15-18, 2023 (exact date to be
announced)
Other dates for shared tasks will be advertised separately
*Workshop Description:* Financial narrative processing is an emerging field
that combines natural language processing (NLP) and machine learning (ML)
techniques to extract, summarise, and analyse both qualitative and
quantitative financial data.
As the amount of financial data continues to grow exponentially, this data
is increasingly considered as big data, which presents challenges and
opportunities for data scientists.
The 5th Financial Narrative Processing Workshop (FNP 2023) aims to bring
together researchers and industry practitioners to share their latest
research results and practical experiences in financial narrative
processing, which is a key aspect of big data.
In particular, the workshop will focus on three shared tasks: Financial
Narrative Summarization, Financial Table of Content Extraction, and
Financial Causality Detection.
These tasks will challenge participants to apply state-of-the-art
techniques in NLP and ML to extract meaningful insights from financial
documents.
The workshop will provide an informal and vibrant forum for discussion and
collaboration, with the goal of advancing the field of financial narrative
processing within the context of big data.
We welcome submissions from researchers and practitioners in academia and
industry.
The FNP 2023 workshop is organised by a team of experts who have been at the
forefront of financial NLP research for the past five years.
We have organised more than 7 international events, introduced NLP and AI
shared tasks, and provided big datasets and methodologies needed to push
forward the emerging field of financial NLP.
Our workshop series has contributed significantly to the field of financial
NLP, as evidenced by our proceedings on ACL anthology and citations in
Google Scholar.
*Previous Proceedings:* All FNP proceedings across the years are on ACL
Anthology: https://aclanthology.org/venues/fnp/.
The 1st FNP was associated with LREC 2018
http://lrec-conf.org/workshops/lrec2018/W27/pdf/book_of_proceedings.pdf
FNP Google Scholar:
https://scholar.google.com/citations?hl=en&user=8Qn7yJ8AAAAJ
*Motivation:* Financial narrative disclosures represent a significant
portion of firms’ overall financial communications with investors.
Textual commentaries help to clarify issues that may be obscured by complex
accounting methods and footnote disclosures.
In addition, narratives summarise corporate strategy, contextualise
results, explain governance arrangements, describe corporate social
responsibility policy, and provide forward-looking information for
investors.
However, financial narratives may also provide management with an
opportunity to obfuscate accounting results and manipulate readers’
perceptions of underlying economic performance.
In a previous FNP workshop, we organised a panel of experts and data leaders
from AI firms in France and London to discuss the future of Financial NLP.
The consensus was that financial data has increased exponentially in recent
years due to the increase in regulations.
This has led to an increase in the volume of financial news surrounding the
events of releasing such disclosures.
Therefore, state-of-the-art methodologies are necessary to understand and
analyse huge and sensitive financial data in a short amount of time.
We believe that the FNP 2023 workshop will continue to contribute to the
field of Financial NLP by providing a platform for researchers and industry
practitioners to share their research results and practical development
experiences in Big Data research, development, and practice.
In addition, our workshop will help participants gain a better
understanding of the challenges posed by big data and its 5 V’s (velocity,
volume, value, variety, and veracity) in financial text analysis.
*Topics of Interest in relation to Financial NLP:* We encourage research on
topics related to analysing financial narratives using state-of-the-art NLP
techniques, including but not limited to morphological analysis,
disambiguation, tokenization,
part-of-speech tagging, named entity recognition, chunking, parsing,
semantic role labelling, sentiment analysis, document quality, and advanced
readability metrics.
The use of NLP and machine learning in the financial domain has encouraged
studies around gender and ethnicity imbalance, as well as mental health
and well-being research.
Given the focus of the IEEE Big Data 2023 conference, we also encourage
research on under-resourced languages and under-represented financial
markets.
In recent years, FNP has included research on Arabic, Spanish, and
Portuguese financial markets.
Our collaboration with the MultiLing workshop (
http://multiling.iit.demokritos.gr) has highlighted the importance of
summarization across domains and sources that are related to finance (e.g.,
company blogs, product reviews, market briefs, etc.).
This includes financial multilingual and cross-lingual summarization using
single-document summarization, multi-document summarization, summarization
evaluation, headline generation, and cross-domain/cross-topic summarization.
Given the international nature of the event, we particularly welcome FNP
papers reporting non-English and multilingual research, describing the
different regulatory regimes within which companies operate internationally.
*The FNP2023 shared tasks* will be announced separately and are expected to
be:
Financial Narrative Summarisation (FNS 2023)
Financial Table of Content Extraction (FinTOC 2023)
Financial Causality Detection (FinCausal 2023)
For the latest details about the shared tasks please visit:
http://wp.lancs.ac.uk/cfie/shared-tasks/
*Call For Papers for the Main Workshop:*
We invite papers describing original, completed or ongoing, unpublished
research in Financial Natural Language Processing and Financial Text
Analysis.
As financial data is increasingly considered big data, we encourage
submissions that address the five main and innate characteristics of big
data (velocity, volume, value, variety, and veracity) in the context of
financial narrative processing.
We encourage submissions on topics that include, but are not limited to,
the following:
- Applying core technologies to financial narratives within the context
of big data: morphological analysis, disambiguation, tokenization,
part-of-speech tagging, named entity recognition, chunking, parsing,
semantic role labelling, sentiment analysis, document quality and advanced
readability metrics, etc.
- Using NLP to detect misreporting in relation to diversity and
well-being on issues related to gender, ethnicity, and women at work, as well
as employee mental health and stability, in the context of big data.
- Financial narrative resources and tools for managing and analysing
large-scale financial data.
- Summarization techniques across domains and sources that are related
to finance (e.g. company blogs, product reviews, market briefs, etc.). This
includes financial multilingual and cross-lingual summarization using
single-document summarization, multi-document summarization, summarization
evaluation, headline generation, and cross-domain/cross-topic summarization.
- Analysis of Online Social Networks for detection of public opinions
towards financial events.
- Multilingual analysis, describing the different regulatory regimes
within which companies operate internationally.
- Ongoing research and preliminary results that explore the intersection
of financial narrative processing and big data.
- Negative results, for example techniques and methodologies that work
for certain languages but not for others. Other avenues could include showing
that state-of-the-art technologies such as BERT can fail on certain tasks or
languages.
All accepted papers will be included in the conference proceedings
published by the IEEE Computer Society Press. We follow the IEEE submission
format.
Please submit a full paper (up to 10 pages, IEEE 2-column format) or short
paper (up to 4 pages, IEEE 2-column format) through the online submission
system.
*Organising Committee:*
Dr Mo El-Haj, Lancaster University, UK (General Chair)
Dr Houda Bouamor, CMU, Qatar (FNP Program Chair)
Prof Paul Rayson, Lancaster University, UK (FNP Program Chair)
Blanca Carbajo Coronado, UAM, Madrid, Spain (FNP coordinator, Publication
Chair)
Nikiforos Pittaras, NCSR Demokritos (Publicity Chair)
Dr George Giannakopoulos, NCSR Demokritos (FNS Shared Task Organiser)
Dr Marina Litvak, Shamoon Academic College of Engineering (FNS Shared Task
Organiser)
Prof Antonio Moreno Sandoval, UAM, Madrid, Spain (FinCausal Shared Task
Organiser)
Dr Doaa Samy, UAM, Madrid, Spain (FinCausal Shared Task Organiser)
Dr Juyeon KANG, Fortia Financial Solution (FinTOC Shared Task Organiser)
Dr Ismail El Maarouf, Imprevisible (FinTOC Shared Task Organiser)
--
Best regards,
Marina Litvak