On behalf of Dr. Elliot Crowley from the School of Engineering at the University of Edinburgh (queries at elliot.crowley(a)ed.ac.uk):
Application link: Affordable Training of Large Language Models<https://www.eng.ed.ac.uk/studying/postgraduate/research/phd/affordable-trai…>
Recent developments in large language models (LLMs) have caught the attention of the public. LLMs such as OpenAI's GPT-4 and Google's Bard are able to generate remarkably realistic, coherent text based on a user's input and have the potential to be general-purpose tools used throughout society e.g. for customer service, summarising text, answering questions, writing contracts or translating between languages.
However, LLMs are prohibitively expensive to train. GPT-3 (which is significantly smaller than its successor, GPT-4) has an estimated training time of 355 GPU-years and an estimated training cost of $4.6M [1]. Only large, wealthy institutions can train these models and thereby control how they are trained and who gets access to them. This is undemocratic.
Very recent work provides hope, however. In [2] the authors explore the promising idea of “cramming”: the training of an LLM on a single GPU in a day. In [3] the authors use synthetic data to train “small” language models that can produce consistent stories at little cost. There is still a huge discrepancy in quality between these models and their expensive counterparts.
In this PhD, the student will investigate affordable LLM training, i.e. with limited compute and/or data, inspired by [2,3]. Avenues of research could include (i) generating training data that facilitates fast training, e.g. through dataset distillation [4]; (ii) exploring neural architecture search to develop models that are "aware" of being resource-constrained while being trained; (iii) developing novel cost-effective training algorithms; (iv) leveraging and tuning open-source LLMs.
The successful student will have opportunities for collaboration within and outside Edinburgh’s School of Engineering e.g. with colleagues in the Institute for Digital Communications<https://www.eng.ed.ac.uk/research/institutes/idcom/>, The Bayesian and Neural Systems Group<https://www.bayeswatch.com/>, and Edinburgh NLP<https://edinburghnlp.inf.ed.ac.uk/>.
[1] https://lambdalabs.com/blog/demystifying-gpt-3
[2] https://arxiv.org/abs/2212.14034
[3] https://arxiv.org/abs/2305.07759
[4] https://arxiv.org/abs/1811.10959
*** Final Call for Submissions ***
10th European Conference On Service-Oriented And Cloud Computing (ESOCC 2023)
October 24-26, 2023, Golden Bay Beach Hotel, Larnaca, Cyprus
https://cyprusconferences.org/esocc2023/
(Proceedings to be published in Springer LNCS;
Journal Special Issue with Springer Computing)
Submission Deadline: Abstracts by June 25, 2023; Full Papers by July 2, 2023
AIM AND SCOPE
Nowadays, Service-Oriented and Cloud Computing are the primary approaches to build
large-scale distributed systems and deliver software services to end users. Cloud-native
software is pervading the delivery of enterprise applications, as they are composed of
(micro)services that can be independently developed and deployed by exploiting multiple
heterogeneous technologies. Resulting applications are polyglot service compositions that can
then be shipped in serverful or serverless platforms (e.g., using virtualization technologies).
These characteristics make Service-Oriented and Cloud Computing the natural answers for
fulfilling the industry’s need for flexibly scalable and maintainable enterprise applications, to
be delivered through state-of-the-art methodologies, like DevOps. To further support this,
researchers and practitioners need to create methods, tools and techniques to support
cost-effective and secure development as well as use of dependable devices, platforms,
services and service-oriented applications in the Cloud, now also considering the Cloud-IoT
computing continuum to exploit widespread adoption of smart connected things and the
increasing growth of their computing capabilities.
The European Conference on Service-Oriented and Cloud Computing (ESOCC) is the premier
conference on advances in the state-of-the-art and practice of Service-Oriented Computing
and Cloud Computing in Europe. ESOCC aims to facilitate the exchange between researchers
and practitioners in the areas of Service-Oriented Computing and Cloud Computing, as well as
to explore the new trends in those areas and foster future collaborations in Europe and beyond.
TOPICS OF INTEREST
ESOCC 2023 seeks original, high-quality contributions related to all aspects of Service-Oriented
and Cloud computing. Specific topics of interest include, but are not limited to:
• Applications for Service-Oriented and Cloud Computing, e.g., big data, commerce, energy, finance, health, scientific computing, smart cities
• Blockchains for Service-Oriented and Cloud Computing
• Business aspects of Service-Oriented and Cloud Computing, e.g., business models, brokerage, marketplaces, costs, pricing
• Business processes, e.g., service-based workflow deployment and management
• Cloud interoperability, service and Cloud standards
• Cloud-IoT computing continuum, e.g., edge computing, fog computing, mobility computing, next generation services/IoT
• Cloud-native architectures and paradigms, e.g., microservices and DevOps
• Cloud service models, e.g., IaaS, PaaS, SaaS, DBaaS, FaaS, etc.
• Deployment, composition, and management of applications in Service-Oriented and Cloud Computing
• Foundations and formal methods for Service-Oriented and Cloud Computing
• Enablers for Service-Oriented and Cloud Computing, e.g., service discovery, orchestration, matchmaking, monitoring, and analytics
• Model-Driven Engineering for Service-Oriented and Cloud Computing
• Multi-Cloud, cross-Cloud, and federated Cloud solutions
• Requirements engineering, design, development, and testing of applications in Service-Oriented and Cloud Computing
• Semantic services and service mining
• Service and Cloud middlewares and platforms
• Software/service adaptation and evolution in Service-Oriented and Cloud Computing
• Storage, computation and network Clouds
• Sustainability and energy issues in Service-Oriented and Cloud Computing
• Quality aspects (e.g., governance, privacy, security, and trust) of Service-Oriented and Cloud Computing
• Quality of Service (QoS) and Service-Level Agreement (SLA) for Service-Oriented and Cloud Computing
• Social aspects of Service-Oriented and Cloud Computing, e.g., crowdsourcing services, social and crowd-based Clouds
• Virtualization for Service-Oriented and Cloud Computing, e.g., serverless, container-based virtualization, VMs
IMPORTANT DATES
• Submission of abstracts: June 25th, 2023 (AoE)
• Submission of full papers: July 2nd, 2023 (AoE)
• Notification to authors: August 4th, 2023 (AoE)
• Camera-ready versions due: August 21st, 2023 (AoE)
• Author registration due: August 21st, 2023 (AoE)
TYPES OF SUBMISSIONS
ESOCC 2023 invites submissions of the following kinds:
• Regular Research Papers (15 pages including references)
• PhD Symposium (12 pages including references)
• Projects and Industry Reports (1 to 6 pages including references, describing an ongoing EU or national project, or providing industrial perspectives on innovative applications, technologies, or methods in ESOCC’s scope)
We only accept original papers, not submitted for publication elsewhere. The papers must be
formatted according to the proceedings guidelines of Springer’s Lecture Notes in Computer
Science (LNCS) series (http://www.springer.com/lncs).
They must be submitted to the EasyChair site at: https://easychair.org/conferences/?conf=esocc2023 by selecting the right track.
Accepted papers from all tracks will be published in the main conference proceedings by
Springer in the LNCS series. For publication to happen, at least one author of each accepted
paper is expected to register and present the work at the conference.
The best papers accepted will be invited to submit extended versions for a Journal Special
Issue to be published by Springer Computing.
ORGANISATION
General Chair
• George A. Papadopoulos, University of Cyprus, CY
(george at ucy.ac.cy)
Program Chairs
• Florian Rademacher, University of Applied Sciences and Arts Dortmund, DE
(florian.rademacher at fh-dortmund.de)
• Jacopo Soldani, University of Pisa, IT
(jacopo.soldani at unipi.it)
Steering and Program Committee
https://cyprusconferences.org/esocc2023/committees/
The University of Bologna is offering *a funded residential bootcamp*:
*"Theories and methods for the corpus-assisted analysis of discourse: from
language that denotes to language that expresses phenomena".*
20 participants
*27-30 September (Bertinoro - Italy)*
*Deadline for application 19/06/2023*
#cadscamp
All info:
https://centri.unibo.it/colitec/en/events/bootcamp-28-30-september-2023
Best,
Anna
Call for Participation:
*FinCausal-2023 Shared Task: “Financial Document Causality Detection”* is
organised within the *5th Financial Narrative Processing Workshop (FNP
2023)* taking place in the 2023 IEEE International Conference on Big Data
(IEEE BigData 2023) <http://bigdataieee.org/BigData2023/>, Sorrento, Italy,
15-18 December 2023. It is a *one-day event*. The exact date is to be
announced.
Important Dates:
- Call for participation and registration: 3rd June 2023
- Registration deadline: 28 June 2023
- Training set release: 29 June 2023
- Test set release: 5 September 2023
- Systems submission deadline: 15 September 2023
- Release of results: 20 September 2023
- Paper submission deadline: 20 October 2023
- Notification of acceptance: 12 November 2023
- Camera-ready of accepted papers: 20 November 2023
- FNP Workshop: December 2023
Workshop URL: https://wp.lancs.ac.uk/cfie/fincausal2023/
Registration Form: https://forms.gle/29E161a8RmMosBLU8. Once participants
have completed the registration form, the practice set will be sent to them.
*Shared Task Description:*
Financial analysis needs factual data and an explanation of the variability
of these data. Data state facts but do not explain how those facts
materialised. Furthermore, understanding causality is crucial in
studying decision-making processes.
The *Financial Document Causality Detection Task* (FinCausal) aims at
identifying elements of cause and effect in causal sentences extracted from
financial documents. Its goal is to evaluate which events or chains of
events can cause a financial object to be modified or an event to occur,
regarding a given context. In the financial landscape, identifying cause
and effect from external documents and sources is crucial to explain why a
transformation occurs.
Two subtasks are organised this year: the *English FinCausal subtask* and
the *Spanish FinCausal subtask*. This is the first year in which we
introduce a subtask in Spanish.
*Objective*: For both tasks, participants are asked to identify, given a
causal sentence, which elements of the sentence relate to the cause, and
which relate to the effect. Participants can use any method they see fit
(regex, corpus linguistics, entity relationship models, deep learning
methods) to identify the causes and effects.
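As a rough illustration of the regex option mentioned above, the sketch below splits a causal sentence around a few hand-picked connectives. The connective list, the function name `split_cause_effect` and the example sentences are assumptions made for illustration only; they are not part of the task data or any official baseline, and real financial text will need far more robust handling.

```python
import re

# Hypothetical connective patterns, chosen only for illustration.
# Pattern 1: "<effect> due to / because of / because <cause>"
_EFFECT_THEN_CAUSE = re.compile(
    r"^(?P<effect>.+?)\s+(?:because of|due to|owing to|because)\s+(?P<cause>.+?)\.?$",
    re.IGNORECASE,
)
# Pattern 2: "<cause>, leading to / resulting in <effect>"
_CAUSE_THEN_EFFECT = re.compile(
    r"^(?P<cause>.+?),?\s+(?:which led to|leading to|resulting in)\s+(?P<effect>.+?)\.?$",
    re.IGNORECASE,
)


def split_cause_effect(sentence):
    """Return {'cause': ..., 'effect': ...} or None if no connective is found."""
    text = sentence.strip()
    for pattern in (_EFFECT_THEN_CAUSE, _CAUSE_THEN_EFFECT):
        match = pattern.match(text)
        if match:
            return {"cause": match.group("cause"), "effect": match.group("effect")}
    return None


if __name__ == "__main__":
    print(split_cause_effect("Revenue fell 5% due to weaker demand."))
    print(split_cause_effect("Oil prices surged, leading to a 3% rise in transport costs."))
```

A cue-based splitter like this only covers explicit connectives, so it is best read as a starting point for comparison with learned models rather than a competitive system.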
*English FinCausal subtask*
- *Data Description:* The dataset has been sourced from various 2019
financial news articles provided by Qwam, along with additional SEC data
from the Edgar Database. Additionally, we have augmented the dataset from
FinCausal 2022, adding 500 new segments. Participants will be provided with
a sample of text blocks extracted from financial news and already labelled.
- *Scope:* The *English FinCausal subtask* focuses on detecting causes
and effects when the effects are quantified. The aim is to identify, in
a causal sentence or text block, the causal elements and the consequential
ones. Only one causal element and one effect are expected in each segment.
- *Length of Data fragments:* The *English FinCausal subtask* segments
are made up of up to three sentences.
- *Data format:* CSV files. Datasets for both the English and the
Spanish subtasks will be presented in the same format.
This shared task focuses on determining causality associated with a
quantified fact. An event is defined as the arising or emergence of a new
object or context regarding a previous situation. So, the task will
emphasise the detection of causality associated with the transformation of
financial objects embedded in quantified facts.
*Spanish FinCausal subtask*
- *Data Description:* The dataset has been sourced from a corpus of
Spanish financial annual reports from 2014 to 2018. Participants will be
provided with a sample of text blocks extracted from financial news,
labelled through inter-annotator agreement.
- *Scope:* The *Spanish FinCausal subtask* aims to detect all types of
causes and effects, not necessarily limited to quantified effects. The
aim is to identify, in a paragraph, the causal elements and the
consequential ones. Only one causal element and one effect are expected in
each paragraph.
- *Length of Data fragments:* The *Spanish FinCausal subtask* involves
complete paragraphs.
- *Data format:* CSV files. Datasets for both the English and the
Spanish subtasks will be presented in the same format.
This shared task focuses on determining causality associated with both
events and quantified facts. For this task, a cause can be the justification
for a statement or the reason that explains a result. This task is also a
relation detection task.
*FinCausal Shared Task Organisers:*
- Antonio Moreno-Sandoval (UAM, Spain)
- Blanca Carbajo Coronado (UAM, Spain)
- Doaa Samy (UCM, Spain)
- Jordi Porta (UAM, Spain)
- Dominique Mariko (Yseop, France)
For any questions, please contact the organisers at fincausal.2023(a)gmail.com
8th Symposium on Corpus Approaches to Lexicogrammar (LxGr2023)
The symposium will take place online on 6-8 July 2023.
The programme and registration details are here: https://sites.edgehill.ac.uk/lxgr/lxgr2023
For more information, contact lxgr(a)edgehill.ac.uk.
Dear colleagues,
We are happy to invite you to join the *Arabic NER Shared Task 2023*
<https://dlnlp.ai/st/wojood/> which will be organized as part of the WANLP
2023. We will provide you with a large corpus and Google Colab notebooks to
help you reproduce the baseline results.
Invitation to participate in the competition on extracting named entities from
Arabic texts. We will provide participants with a corpus and software for
obtaining baseline results that they can build on.
*INTRODUCTION*
Named Entity Recognition (NER) is integral to many NLP applications. It is
the task of identifying named entity mentions in unstructured text and
classifying them into predefined classes such as person, organization,
location, or date. Due to the scarcity of Arabic resources, most of the
research on Arabic NER focuses on flat entities and addresses a limited
number of entity types (person, organization, and location). The goal of
this shared task is to alleviate this bottleneck by providing Wojood, a
large and rich Arabic NER corpus. Wojood consists of about 550K tokens (MSA
and dialect, in multiple domains) that are manually annotated with 21
entity types.
*REGISTRATION*
Participants need to register via this form (
*https://forms.gle/UCCrVNZ2LaPviCZS6* <https://forms.gle/UCCrVNZ2LaPviCZS6>).
Participating teams will be provided with common training and development
datasets. No external manually labelled datasets are allowed. A blind test
set will be used to evaluate the output of the participating teams.
Each team is allowed a maximum of 3 submissions. All teams are required to
report on the development and test sets (after results are announced) in
their write-ups.
*FAQ*
For any questions related to this task, please check our *Frequently Asked
Questions*
<https://docs.google.com/document/d/1XE2n89mFLic2P9DO_sAD51vy734BOt0kgtZ6bFf…>
*IMPORTANT DATES*
- March 03, 2023: Registration available
- May 25, 2023: Data-sharing and evaluation on development set available
- June 10, 2023: Registration deadline
- July 20, 2023: Test set made available
- July 30, 2023: Evaluation on test set (TEST) deadline
- August 29, 2023: Shared task system paper submissions due
- October 12, 2023: Notification of acceptance
- October 30, 2023: Camera-ready version
- TBA: WANLP 2023 Conference.
*All deadlines are 11:59 PM UTC-12:00 (Anywhere On Earth).*
*CONTACT*
For any questions related to this task, please contact the organizers
directly using the following email address: *NERShare...(a)gmail.com
<https://groups.google.com/>* or join the google group:
https://groups.google.com/g/ner_sharedtask2023.
*SHARED TASK*
As described, this shared task targets both flat and nested Arabic NER. The
subtasks are:
*Subtask 1:* *Flat NER*
In this subtask, we provide the Wojood-Flat train (70%) and development
(10%) datasets. The final evaluation will be on the test set (20%). The
flat NER dataset uses the same train/dev/test split as the nested NER
dataset, and each split contains the same content. The only difference is
that, in the flat dataset, each token is assigned a single tag: the first
high-level tag assigned to that token in the nested NER dataset.
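The relationship between the two datasets can be sketched as follows. This is a minimal illustration that assumes each token's nested annotation is available as a list of tags with the high-level tag first; that in-memory layout, the function name `flatten_tags` and the example tokens are assumptions, not a description of the released Wojood files.

```python
def flatten_tags(nested_tags):
    """Derive flat tags by keeping only the first tag listed for each token.

    nested_tags: one list of tags per token, high-level tag first
                 (an assumed layout, used here for illustration).
    Tokens with no tags at all are treated as outside any entity ("O").
    """
    return [tags[0] if tags else "O" for tags in nested_tags]


if __name__ == "__main__":
    # Hypothetical example: an organisation name that contains a location.
    tokens = ["Birzeit", "University", "reopened"]
    nested = [["B-ORG", "B-GPE"], ["I-ORG"], ["O"]]
    for token, tag in zip(tokens, flatten_tags(nested)):
        print(token, tag)
```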
*Subtask 2:* *Nested NER*
In this subtask, we provide the Wojood-Nested train (70%) and development
(10%) datasets. The final evaluation will be on the test set (20%).
*METRICS*
The evaluation metrics will include precision, recall, and F1-score. Our
official metric, however, will be the micro F1-score.
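For reference, micro-averaging pools true positives, false positives and false negatives over all entity types before computing precision, recall and F1. The sketch below assumes entities are compared as exact `(start, end, type)` spans per sentence; that matching rule and the function name `micro_prf` are assumptions for illustration, and the official CODALAB scorer may differ in its details.

```python
def micro_prf(gold, pred):
    """Micro-averaged precision, recall and F1 over all sentences and types.

    gold, pred: one collection of (start, end, entity_type) tuples per
                sentence. Exact span-and-type matching is assumed here.
    """
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, pred):
        gold_set, pred_set = set(gold_spans), set(pred_spans)
        tp += len(gold_set & pred_set)   # predicted and correct
        fp += len(pred_set - gold_set)   # predicted but not in gold
        fn += len(gold_set - pred_set)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical spans; nested entities simply appear as extra tuples.
    gold = [{(0, 2, "ORG"), (0, 1, "GPE")}, {(3, 4, "PERS")}]
    pred = [{(0, 2, "ORG")}, {(3, 4, "PERS"), (5, 6, "DATE")}]
    print(micro_prf(gold, pred))
```

Because counts are pooled before the ratios are taken, frequent entity types weigh more heavily in the micro F1-score than rare ones.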
The evaluation of shared tasks will be hosted through CODALAB. Teams will
be provided with a CODALAB link for each shared task.
-*CODALAB link for NER Shared Task Subtask 1 (Flat NER)*
<https://codalab.lisn.upsaclay.fr/competitions/11594>
-*CODALAB link for NER Shared Task Subtask 2 (Nested NER)*
<https://dlnlp.ai/st/wojood/>
*BASELINES*
Two baseline models trained on Wojood (flat and nested) are provided:
*Nested NER baseline:* is presented in this *article*
<https://aclanthology.org/2022.lrec-1.387/>, and code is available in
*GitHub* <https://github.com/SinaLab/ArabicNER>. The model achieves a micro
F1-score of 0.9059 (note that this baseline does not handle nested entities
of the same type).
*Flat NER baseline:* The same code repository used for nested NER (*GitHub*
<https://github.com/SinaLab/ArabicNER>) can also be used to train a model
for the flat NER task. Our flat NER baseline achieved a micro F1-score of 0.8785.
*GOOGLE COLAB NOTEBOOKS*
To allow you to experiment with the baseline, we authored four Google Colab
notebooks that demonstrate how to train and evaluate our baseline models.
[1] *Train Flat NER*
<https://gist.github.com/mohammedkhalilia/72c3261734d7715094089bdf4de74b4a>:
This notebook can be used to train our ArabicNER model on the flat NER task
using the sample Wojood data found in our repository.
[2] *Evaluate Flat NER*
<https://gist.github.com/mohammedkhalilia/c807eb1ccb15416b187c32a362001665>:
This notebook uses the trained model saved from the notebook above to
perform evaluation on an unseen dataset.
[3] *Train Nested NER*
<https://gist.github.com/mohammedkhalilia/a4d83d4e43682d1efcdf299d41beb3da>:
This notebook can be used to train our ArabicNER model on the nested NER
task using the sample Wojood data found in our repository.
[4] *Evaluate Nested NER*
<https://gist.github.com/mohammedkhalilia/9134510aa2684464f57de7934c97138b>:
This notebook uses the trained model saved from the notebook above to
perform evaluation on an unseen dataset.
*ORGANIZERS*
- Mustafa Jarrar, Birzeit University
- Muhammad Abdul-Mageed, University of British Columbia & MBZUAI
- Mohammed Khalilia, Birzeit University
- Bashar Talafha, University of British Columbia
- AbdelRahim Elmadany, University of British Columbia
- Nagham Hamad, Birzeit University
- Alaa Omer, Birzeit University
*The 5th Financial Narrative Processing Workshop (FNP 2023)*
To be held at the 2023 IEEE International Conference on Big Data (IEEE
BigData 2023), Sorrento, Italy, from 15 to 18 December 2023.
FNP 2023: http://wp.lancs.ac.uk/cfie/fnp2023/
*Submission page:*
https://wi-lab.com/cyberchair/2023/bigdata23/scripts/submit.php?subarea=S14…
*Important Dates:*
1st Call for workshop papers: June 1, 2023
2nd Call for workshop papers: August 15, 2023
Final Call for workshop papers: October 1, 2023
Due date for workshop papers submission: October 30, 2023 (anywhere in the
world)
Notification of paper acceptance to authors: November 12, 2023
Camera-ready of accepted papers: November 20, 2023
Workshop date: 1 day event: December 15-18, 2023 (exact date to be
announced)
Other dates for shared tasks will be advertised separately
*Workshop Description:* Financial narrative processing is an emerging field
that combines natural language processing (NLP) and machine learning (ML)
techniques to extract, summarise, and analyse both qualitative and
quantitative financial data.
As the amount of financial data continues to grow exponentially, this data
is increasingly considered as big data, which presents challenges and
opportunities for data scientists.
The 5th Financial Narrative Processing Workshop (FNP 2023) aims to bring
together researchers and industry practitioners to share their latest
research results and practical experiences in financial narrative
processing, which is a key aspect of big data.
In particular, the workshop will focus on three shared tasks: Financial
Narrative Summarization, Financial Table of Content Extraction, and
Financial Causality Detection.
These tasks will challenge participants to apply state-of-the-art
techniques in NLP and ML to extract meaningful insights from financial
documents.
The workshop will provide an informal and vibrant forum for discussion and
collaboration, with the goal of advancing the field of financial narrative
processing within the context of big data.
We welcome submissions from researchers and practitioners in academia and
industry.
FNP 2023 workshop is organised by a team of experts who have been at the
forefront of financial NLP research for the past five years.
We have organised more than 7 international events, introduced NLP and AI
shared tasks, and provided big datasets and methodologies needed to push
forward the emerging field of financial NLP.
Our workshop series has contributed significantly to the field of financial
NLP, as evidenced by our proceedings on ACL anthology and citations in
Google Scholar.
*Previous Proceedings:* All FNP proceedings across the years are on ACL
Anthology: https://aclanthology.org/venues/fnp/.
The 1st FNP was associated with LREC 2018
http://lrec-conf.org/workshops/lrec2018/W27/pdf/book_of_proceedings.pdf
FNP Google Scholar:
https://scholar.google.com/citations?hl=en&user=8Qn7yJ8AAAAJ
*Motivation:* Financial narrative disclosures represent a significant
portion of firms’ overall financial communications with investors.
Textual commentaries help to clarify issues that may be obscured by complex
accounting methods and footnote disclosures.
In addition, narratives summarise corporate strategy, contextualise
results, explain governance arrangements, describe corporate social
responsibility policy, and provide forward-looking information for
investors.
However, financial narratives may also provide management with an
opportunity to obfuscate accounting results and manipulate readers’
perceptions of underlying economic performance.
In a previous FNP workshop, we organised a panel of experts and data leaders
from AI firms in France and London to discuss the future of Financial NLP.
The consensus was that financial data has increased exponentially in recent
years due to the increase in regulations.
This has led to an increase in the volume of financial news surrounding the
release of such disclosures.
Therefore, state-of-the-art methodologies are necessary to understand and
analyse huge and sensitive financial data in a short amount of time.
We believe that the FNP 2023 workshop will continue to contribute to the
field of Financial NLP by providing a platform for researchers and industry
practitioners to share their research results and practical development
experiences in Big Data research, development, and practice.
In addition, our workshop will help participants gain a better
understanding of the challenges posed by big data and its 5 V’s (velocity,
volume, value, variety, and veracity) in financial text analysis.
*Topics of Interest in relation to Financial NLP:* We encourage research on
topics related to analysing financial narratives using state-of-the-art NLP
techniques, including but not limited to morphological analysis,
disambiguation, tokenization,
part-of-speech tagging, named entity recognition, chunking, parsing,
semantic role labelling, sentiment analysis, document quality, and advanced
readability metrics.
The use of NLP and machine learning in the financial domain has encouraged
studies around gender and ethnicity imbalance, as well as mental health
and well-being research.
Given the focus of the IEEE Big Data 2023 conference, we also encourage
research on under-resourced languages and under-represented financial
markets.
In recent years, FNP has included research on Arabic, Spanish, and
Portuguese financial markets.
Our collaboration with the MultiLing workshop (
http://multiling.iit.demokritos.gr) has highlighted the importance of
summarization across domains and sources that are related to finance (e.g.,
company blogs, product reviews, market briefs, etc.).
This includes financial multilingual and cross-lingual summarization using
single-document summarization, multi-document summarization, summarization
evaluation, headline generation, and cross-domain/cross-topic summarization.
Given the international nature of the event, we particularly welcome FNP
papers reporting non-English and multilingual research, describing the
different regulatory regimes within which companies operate internationally.
*The FNP2023 shared tasks* will be announced separately and are expected to
be:
Financial Narrative Summarisation (FNS 2023)
Financial Table of Content Extraction (FinTOC 2023)
Financial Causality Detection (FinCausal 2023)
For the latest details about the shared tasks please visit:
http://wp.lancs.ac.uk/cfie/shared-tasks/
*Call For Papers for the Main Workshop:*
We invite papers describing original, completed or ongoing, unpublished
research in Financial Natural Language Processing and Financial Text
Analysis.
As financial data is increasingly considered as big data, we encourage
submissions that address the five main and innate characteristics of big
data (velocity, volume, value, variety, and veracity) in the context of
financial narrative processing.
We encourage submissions on topics that include, but are not limited to,
the following:
- Applying core technologies on financial narratives within the context
of big data: morphological analysis, disambiguation, tokenization,
part-of-speech tagging, named entity recognition, chunking, parsing,
semantic role labelling, sentiment analysis, document quality and advanced
readability metrics, etc.
- Using NLP to detect misreporting in relation to diversity and
wellbeing on issues related to gender, ethnicity, women at work as well as
employee mental health and stability, in the context of big data.
- Financial narrative resources and tools for managing and analysing
large-scale financial data.
- Summarization techniques across domains and sources that are related
to finance (e.g. company blogs, product reviews, market briefs, etc.), this
includes financial multilingual and cross-lingual summarization using
single-document summarization, multi-document summarization, summarization
evaluation, headline generation, cross-domain/cross-topic summarization.
- Analysis of Online Social Networks for detection of public opinions
towards financial events.
- Multilingual analysis, describing the different regulatory regimes
within which companies operate internationally.
- Ongoing research and preliminary results that explore the intersection
of financial narrative processing and big data.
- Negative results, for example techniques and methodologies that work
for certain languages but not for others. Other avenues could include showing
that state-of-the-art technologies such as BERT can fail on certain tasks or
languages.
All papers accepted will be included in the conference proceedings
published by the IEEE Computer Society Press. We follow IEEE submission
format.
Please submit a full paper (up to 10 pages in IEEE 2-column format) or short
paper (up to 4 pages in IEEE 2-column format) through the online submission
system.
*Organising Committee:*
Dr Mo El-Haj, Lancaster University, UK (General Chair)
Dr Houda Bouamor, CMU, Qatar (FNP Program Chair)
Prof Paul Rayson, Lancaster University, UK (FNP Program Chair)
Blanca Carbajo Coronado, UAM, Madrid, Spain (FNP coordinator, Publication
Chair)
Nikiforos Pittaras, NCSR Demokritos (Publicity Chair)
Dr George Giannakopoulos, NCSR Demokritos (FNS Shared Task Organiser)
Dr Marina Litvak, Shamoon Academic College of Engineering (FNS Shared Task
Organiser)
Prof Antonio Moreno Sandoval, UAM, Madrid, Spain (FinCausal Shared Task
Organiser)
Dr Doaa Samy, UAM, Madrid, Spain (FinCausal Shared Task Organiser)
Dr Juyeon KANG, Fortia Financial Solution (FinTOC Shared Task Organiser)
Dr Ismail El Maarouf, Imprevicible (FinTOC Shared Task Organiser)
--
Best regards,
Marina Litvak
2nd Call for Papers
The 1st Workshop on Counter Speech for Online Abuse:
A workshop for creating, investigating and improving tools for producing and evaluating counter speech.
Hate speech and abusive and toxic language are prevalent in online spaces. For example, a 2019 survey shows that in the UK 30-40% of people have experienced online abuse, and platforms like Facebook take down millions of harmful posts every year with the help of AI tools. While removal of such content can immediately reduce the quantity of harmful messages, it can bring about accusations of censorship and may not be effective at curbing hate in the long term. An alternative approach is to reply with counter speech, i.e. targeted responses aimed at refuting the hateful language using thoughtful and cogent reasons, and fact-bound arguments. This has been shown to be effective in influencing the behaviour of both the perpetrators of abuse and bystanders who witness the interactions, as well as providing support to victims.
The sheer amount of social media data shared online on a daily basis means that hate mitigation, using counter speech, requires reliable, efficient and scalable tools. Recently, efforts have been made to curate hate countering datasets and automate the production of counter speech. However, this research field is still in its infancy, and many questions remain open regarding the most effective approaches and methods to take, as well as how to evaluate them.
This first multidisciplinary workshop aims to bring together researchers from diverse backgrounds such as computer science and the social sciences, as well as policy makers and other stakeholders to attempt to understand how counter speech is currently used to tackle abuse by individuals, activists and organisations, how Natural Language Processing (NLP) and Generation (NLG) can be applied to produce counter narratives, and the implications of using large language models for this task. It will also address, but not be limited to, the questions of how to evaluate and measure the impacts of counter speech, the importance of expert knowledge from civil society in the development of counter speech datasets and taxonomies, and how to ensure fairness and mitigate the biases present in language models when generating counter speech.
Topics
We invite papers (long and short) on a wide range of topics, including but not limited to:
• Models and methods for generating counter speech;
• Dialogue agents employing counter speech to address hateful inputs, directed towards other people or the AI itself;
• Human and automatic evaluation methods of counter speech tools;
• Multidisciplinary studies including different perspectives on the topic such as from computer science, social science, NGOs and stakeholders;
• Development of datasets and taxonomy for counter speech;
• Potentials and limitations (e.g., fairness, biases) of using large language models for generating counter speech;
• Social impact and empirical studies of counter speech on social media, including investigating the effectiveness and consequences on users of employing counter speech to fight online hate;
• Proposals for future research on counter speech, and/or preliminary results of studies in this field.
We accept three types of submissions:
* Regular research papers – long (8 pages) or short (4 pages);
* Non-archival submissions: like research papers, but will not be included in the proceedings;
* Research communications: 2-4 page abstracts summarising relevant research published elsewhere.
Submission link: https://softconf.com/n/cs4oa2023
Location: co-located with SIGdialxINLG, Prague, Czechia
Important dates
All deadlines are Anywhere on Earth (UTC-12)
* Submission deadline: Jun 26, 2023
* Notification of acceptance: Jul 17, 2023
* Camera-ready deadline: Aug 11, 2023
* Workshop date: September 11/12 2023
Format and Styling
Submissions should follow the ACL Author Guidelines<https://www.aclweb.org/adminwiki/index.php?title=ACL_Author_Guidelines> and policies for submission, review and citation, and be anonymised for double-blind reviewing. Please use the ACL 2023 style files; LaTeX style files and Microsoft Word templates are available at https://2023.aclweb.org/calls/style_and_formatting/<https://2021.aclweb.org/downloads/acl-ijcnlp2021-templates.zip>.
Organising Committee:
* Yi-Ling Chung, The Alan Turing Institute
* Gavin Abercrombie, Heriot-Watt University
* Helena Bonaldi, Fondazione Bruno Kessler
* Marco Guerini, Fondazione Bruno Kessler
Contact
If you have any questions, please let us know at cs4oa(a)googlegroups.com
Website: https://sites.google.com/view/cs4oa
Twitter: @cs4oa_workshop<https://twitter.com/cs4oa_workshop>
Dear Sir/Ma'am,
I hope you are well and in good health. We are excited to announce a
call for book chapters for an upcoming book titled "*Empowering
Low-Resource Languages With NLP Solutions.*"
Link: https://www.igi-global.com/publish/call-for-papers/call-details/6596
The objective of this book is to provide an in-depth understanding of
Natural Language Processing (NLP) techniques and applications specifically
tailored for low-resource languages. We believe that your valuable insights
and research in this domain would greatly enrich the content of this book.
To ensure a comprehensive and high-quality book, all submitted chapters
will undergo a rigorous peer-review process. The accepted book will be *indexed
in Scopus and Web of Science*, thereby enhancing the visibility and impact
of your work.
The book aims to cover a wide range of topics related to NLP in
low-resource languages. Suggested topics include, but are not limited
to:
· Introduction to Low-Resource Languages in NLP
· Language Resource Acquisition for Low-Resource Languages
· Morphological Analysis and Morpho-Syntactic Processing
· Named Entity Recognition and Entity Linking for Low-Resource
Languages
· Part-of-Speech Tagging and Syntactic Parsing
· Machine Translation for Low-Resource Languages
· Sentiment Analysis and Opinion Mining for Low-Resource Languages
· Speech and Audio Processing for Low-Resource Languages
· Text Summarization and Information Retrieval for Low-Resource
Languages
· Multimodal NLP for Low-Resource Languages
· Code-switching and Language Identification for Low-Resource
Languages
· Evaluation and Benchmarking for NLP in Low-Resource Languages
· Applications of NLP in Low-Resource Language Settings
· Future Directions and Challenges in NLP
We encourage you to contribute a book chapter focusing on any of the
above-mentioned topics or related areas within the scope of NLP in
low-resource languages. The submission guidelines are as follows:
1. Please submit a chapter proposal (maximum 500 words) outlining the
objective, methodology, and expected outcomes of your proposed chapter by
July 3, 2023, to the submission portal:
https://www.igi-global.com/publish/call-for-papers/call-details/6596
2. Chapter proposals should include the title of the chapter and the
names and affiliations of the author(s).
3. All submissions should be original and should not have been
previously published or be currently under review elsewhere.
4. The chapters should be written in English and adhere to the
formatting guidelines provided after the acceptance of the proposal.
*Important Dates:*
July 3, 2023: Proposal Submission Deadline
July 17, 2023: Notification of Acceptance
September 17, 2023: Full Chapter Submission
October 31, 2023: Review Results Returned
December 12, 2023: Final Acceptance Notification
December 26, 2023: Final Chapter Submission
Thank you for considering this invitation, and we look forward to receiving
your valuable contribution to this book. If you have any further questions
or require additional information, please do not hesitate to contact us.
Best regards,
Editorial Team
Dr. Partha Pakray
National Institute of Technology Silchar
Email: partha(a)cse.nits.ac.in
Dr. Pankaj Dadure
University of Petroleum and Energy Studies Dehradun
Email: pankajk.dadure(a)ddn.upes.ac.in
Prof. Sivaji Bandyopadhyay
Jadavpur University, Kolkata
Email: sivaji.cse.ju(a)gmail.com