CALL FOR PARTICIPATION
IberLEF 2023 Task - FinancES: Financial Targeted Sentiment Analysis in
Spanish
Training set released!
Held as part of IberLEF 2023 <https://sites.google.com/view/iberlef-2023>,
a shared evaluation campaign for Natural Language Processing (NLP) systems
in Spanish and other Iberian languages
September 26th, 2023, Jaén
CodaLab link: https://codalab.lisn.upsaclay.fr/competitions/10052
Dear All,
We invite researchers and students to participate in the shared task
FinancES: Financial Targeted Sentiment Analysis in Spanish, held as part of
IberLEF 2023, a shared evaluation campaign for Natural Language Processing
(NLP) systems in Spanish and other Iberian languages.
This shared task aims to explore targeted sentiment analysis in the
financial domain. Specifically, the approach adopted here is grounded in
the field of microeconomics. In this regard, Bowles (2004) explains the
role of economic agents, that is to say, individuals or organizations
impacting the economy. The author states that the main microeconomic agents
in the capital market are consumers (households/individuals), companies
(firms), governments, and central banks. Consequently, in order to develop
a sentiment analysis method where different viewpoints are considered,
three perspectives are included: (1) the economic target of the news item;
(2) an individual economic agent: companies; and (3) an individual economic
agent: consumers. Here, the target is the sector where the economic fact
applies, and companies produce the goods and services that
households/individuals consume. From these three viewpoints, the impact of
the news item on the target and on the economic agents is classified as
positive, negative, or neutral. Accordingly, two tasks are proposed. On the
one hand, a task combining the challenges of aspect-term extraction, for
identifying the target entity in the text, and aspect-based sentiment
classification, for determining the sentiment polarity towards the target.
On the other hand, a task devoted to assessing the impact of a news
headline on the two other economic agents, namely companies and consumers.
Participants will be provided with development, development_test, training,
and test datasets in Spanish. The dataset for this task is composed of news
headlines written in Spanish, collected from digital newspapers specialized
in economic, financial, and political news. The dataset is labeled with the
target entity and with the sentiment polarity along three dimensions:
target, companies, and consumers. That is, each headline has been manually
classified as positive, neutral, or negative for three specific entities:
(1) the target entity (i.e., the specific company or asset where the
economic fact applies), (2) companies (i.e., the entities producing the
goods and services that others consume), and (3) consumers (i.e.,
households/individuals). Each headline was annotated by three members of
the organizing committee. In case of disagreement, the annotators discussed
the case and, if no agreement was reached, the headline was discarded. In
this first step, we compiled about 14k headlines; headlines that were too
short or did not specify a target entity were then filtered out. The final
dataset is composed of 8k-10k news headlines.
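For illustration only, a single annotated instance might look like the
following Python dictionary. The headline, field names, and labels below
are our own assumptions for exposition; the official data release defines
the actual format.

    # Hypothetical FinancES-style instance; all field names and values are
    # illustrative assumptions, not the official release format.
    example = {
        "headline": "El precio de la electricidad baja un 20% en enero",
        "target": "sector eléctrico",        # target entity of the headline
        "sentiment_target": "positive",      # polarity towards the target
        "sentiment_companies": "negative",   # impact on companies
        "sentiment_consumers": "positive",   # impact on consumers
    }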
For the shared tasks, training and test sets will be released (80%-20%).
Today, we have released the training dataset, which can be found in the
"Files" subsection of the "Participate" tab. Note that this dataset
includes all the instances that were released during the Practice stage, so
there is no need to combine the two datasets.
Finally, remember that the CodaLab competition is open for submitting
results on the development dataset provided. This dataset is also available
in the same section as the training dataset.
Best regards,
The FinancES 2023 organizing committee
References
- Bowles, S. (2004). Microeconomics: Behavior, institutions, and evolution. Princeton University Press.
Important dates
- Release of development corpora: Feb 13, 2023
- Release of training corpora: Mar 13, 2023
- Release of test corpora and start of evaluation campaign: Apr 17, 2023
- End of evaluation campaign (deadline for runs submission): May 3, 2023
- Publication of official results: May 5, 2023
- Paper submission: May 28, 2023
- Review notification: Jun 16, 2023
- Camera ready submission: Jul 6, 2023
- IberLEF Workshop (SEPLN 2023): Sep 26, 2023
- Publication of proceedings: Sep ??, 2023
Organizing committee
- José Antonio García-Díaz (UMUTeam, Universidad de Murcia)
- Ángela Almela Sánchez-Lafuente (UMUTeam, Universidad de Murcia)
- Francisco García-Sánchez (UMUTeam, Universidad de Murcia)
- Gema Alcaraz-Mármol (UMUTeam, Universidad de Castilla-La Mancha)
- María José Marín (UMUTeam, Universidad de Murcia)
- Rafael Valencia-García (UMUTeam, Universidad de Murcia)
The Chair of Data Science and Natural Language Processing (Prof. Siegfried Handschuh) invites applications for a PhD position as part of the recently granted Swiss National Science Foundation (SNF) research project "Conversational AI: Dialogue-based Adaptive Argumentative Writing Support". You will contribute to its successful implementation, as well as to the chair's varied activities in teaching and outreach.
Our research at the Data Science and NLP Chair at ICS-HSG focuses on the cutting-edge field of Natural Language Processing (NLP) and Natural Language Understanding (NLU). Our group works on various topics, including conversational AI, Large Language Models (LLMs), intricate language analysis, Knowledge Graphs, and more. Utilising sophisticated methods and algorithms, we aim to extract deeper knowledge and understanding from vast amounts of textual and auditory data. Our research not only covers the core principles of NLP but also explores its practical applications across various industries.
Details of the post and how to apply: https://bit.ly/phd_conversational_ai
HSG: PhD in Conversational AI (m/f/d)<https://bit.ly/phd_conversational_ai>
You have a university-level master's degree in Computer Science, Computational Linguistics, Data Science, or a related discipline. You have a strong background in Machine Learning and Natural Language Processing; knowledge of Deep Learning architectures (e.g., Transformer, BERT) and frameworks (e.g., PyTorch, TensorFlow) is a plus. You have good programming skills in Python. Desirable qualifications include experience in one or more of the following areas: chatbots, argument mining, and text generation. You have excellent written and verbal communication skills in English. You have a strong analytical, structured, and independent working style, as well as initial experience in writing scientific papers.
--
Prof. Dr. Siegfried Handschuh
Full Professor of Data Science and Natural Language Processing
Director, Institute of Computer Science
University of St.Gallen
E-mail: siegfried.handschuh(a)unisg.ch
Hi,
This is an invitation to attend the NLP Challenge Day organized by CERIST,
Algeria, on March 29, 2023.
The link to attend the event is:
https://visioconf.cerist.dz/CERISTNLPChallenge
The program is available at:
http://www.nlpchallenge.cerist.dz/
Thank you.
Best regards.
Hassina Aliane.
Dr. Hassina Aliane, Director of Research
Head of the Natural Language Processing and Digital Content Team,
Director of the Digital Humanities R&D Laboratory,
Editor of the Information Processing at the Digital Age Review,
Research Center on Scientific and Technical Information.
The 6th ASAIL workshop, focused on Natural Language Processing for legal texts and co-located with ICAIL 2023 in Braga, Portugal, is coming up soon.
We would like to invite you to submit papers on, and demonstrations of, original work on automated detection, extraction and analysis of semantic information in legal texts.
Submission deadline: 26th April 2023
Workshop date: 23rd June 2023
We are accepting three tiers of papers (two-column format): long (10 pages), short (6 pages), and position (2 pages).
Since we are very interested in sparking discussion around ideas and work in their early stages, we welcome short and position papers as particularly suitable for this purpose.
You can find more information, including the full call for papers, at our website: https://sites.google.com/view/asail/asail-2023-call-for-papers
Best wishes,
Daphne Odekerken
On behalf of the ASAIL Organising Committee
Apologies for cross-posting.
----------------------------------------
We invite proposals for tasks to be run as part of SemEval-2024
<https://semeval.github.io/SemEval2024/>. SemEval (the International
Workshop on Semantic Evaluation) <https://semeval.github.io/> is an ongoing
series of evaluations of computational semantics systems, organized under
the umbrella of SIGLEX <https://siglex.org/>, the Special Interest Group on
the Lexicon of the Association for Computational Linguistics.
SemEval tasks explore the nature of meaning in natural languages: how to
characterize meaning and how to compute it. This is approached in practical
terms, using shared datasets and standardized evaluation metrics to
quantify the strengths and weaknesses of possible solutions. SemEval tasks
encompass a broad range of semantic topics from the lexical level to the
discourse level, including word sense identification, semantic parsing,
coreference resolution, and sentiment analysis, among others.
For SemEval-2024, we welcome any task that can test an automatic system for
the semantic analysis of text, which could be an intrinsic semantic
evaluation or an application-oriented evaluation. We especially encourage
tasks for languages other than English, cross-lingual tasks, and tasks that
develop novel applications of computational semantics. See the websites of
previous editions of SemEval to get an idea of the range of tasks
explored: SemEval-2022 <https://semeval.github.io/SemEval2022/> and
SemEval-2023 <https://semeval.github.io/SemEval2023/>.
We strongly encourage proposals based on pilot studies that have already
generated initial data, as this can provide concrete examples and can help
to foresee the challenges of preparing the full task. In the event of
receiving many proposals, preference will be given to proposals that have
already run a pilot study.
In case you are not sure whether a task is suitable for SemEval, please
feel free to get in touch with the SemEval organizers at
semevalorganizers(a)gmail.com to discuss your idea.
=== Task Selection ===
Task proposals will be reviewed by experts, and reviews will serve as the
basis for acceptance decisions. Everything else being equal, more
innovative new tasks will be given preference over task reruns. Task
proposals will be evaluated on:
- Novelty: Is the task on a compelling new problem that has not been
explored much in the community? Is the task a rerun, but covering
substantially new ground (new subtasks, new types of data, new languages,
etc.)?
- Interest: Is the proposed task likely to attract a sufficient number
of participants?
- Data: Are the plans for collecting data convincing? Will the resulting
data be of high quality? Will annotations have meaningfully high
inter-annotator agreements? Have all appropriate licenses for use and
re-use of the data after the evaluation been secured? Have all
international privacy concerns been addressed? Will the data annotation be
ready on time?
- Evaluation: Is the methodology for evaluation sound? Is the necessary
infrastructure available or can it be built in time for the shared task?
Will research inspired by this task be able to evaluate in the same manner
and on the same data after the initial task?
- Impact: What is the expected impact of the data in this task on future
research beyond the SemEval Workshop?
=== New Tasks vs. Task Reruns ===
We welcome both new tasks and task reruns. For a new task, the proposal
should address whether the task would be able to attract participants.
Preference will be given to novel tasks that have not received much
attention yet.
For reruns of previous shared tasks (whether or not the previous task was
part of SemEval), the proposal should address the need for another
iteration of the task. Valid reasons include: a new form of evaluation
(e.g. a new evaluation metric, a new application-oriented scenario), new
genres or domains (e.g. social media, domain-specific corpora), or a
significant expansion in scale. We further discourage carrying over a
previous task and just adding new subtasks, as this can lead to the
accumulation of too many subtasks. Evaluating on a different dataset with
the same task formulation, or evaluating on the same dataset with a
different evaluation metric, typically should not be considered a separate
subtask.
=== Task Organization ===
We welcome people who have never organized a SemEval task before, as well
as those who have. Apart from providing a dataset, task organizers are
expected to:
- Verify that the data annotations have sufficient inter-annotator agreement
- Verify licenses for the data to allow its use in the competition and
afterwards. In particular, text that is publicly available online is not
necessarily in the public domain; unless a license has been provided, the
author retains all rights associated with their work, including copying,
sharing and publishing. For more information, see:
https://creativecommons.org/faq/#what-is-copyright-and-why-does-it-matter
- Resolve any potential security, privacy, or ethical concerns about the
data
- Make the data available in a long-term repository under an appropriate
license, preferably using Zenodo: https://zenodo.org/communities/semeval/
- Provide task participants with format checkers and standard scorers.
- Provide task participants with baseline systems to use as a starting
point, in order to lower the obstacles to participation (see the sketch
after this list). A baseline system typically contains code that reads the
data, creates a baseline response (e.g. random guessing, majority class
prediction), and outputs the evaluation results. Whenever possible,
baseline systems should be written in widely used programming languages
and/or should be implemented as a component for standard NLP pipelines.
- Create a mailing list and website for the task and post all relevant
information there.
- Create a CodaLab or other similar competition for the task and upload the
evaluation script.
- Manage submissions on CodaLab or a similar competition site.
- Write a task description paper to be included in SemEval proceedings, and
present it at the workshop.
- Manage participants’ submissions of system description papers, manage
participants’ peer review of each others’ papers, and possibly shepherd
papers that need additional help in improving the writing.
- Review other task description papers.
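As a rough illustration of what such a baseline might look like, here is a
minimal majority-class sketch in Python; "train.csv", "test.csv", and the
"label" column are hypothetical assumptions, since each task defines its
own data format and official scorer.

    # Minimal majority-class baseline sketch. The file names and the
    # "label" column are illustrative assumptions, not a real task format.
    import csv
    from collections import Counter

    def read_labels(path):
        with open(path, newline="", encoding="utf-8") as f:
            return [row["label"] for row in csv.DictReader(f)]

    train_labels = read_labels("train.csv")  # hypothetical training file
    gold_labels = read_labels("test.csv")    # hypothetical test file

    # Predict the most frequent training label for every test instance.
    majority = Counter(train_labels).most_common(1)[0][0]
    predictions = [majority] * len(gold_labels)

    accuracy = sum(p == g for p, g in zip(predictions, gold_labels)) / len(gold_labels)
    print(f"Majority-class baseline ('{majority}'): accuracy = {accuracy:.3f}")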
=== Important dates ===
- Task proposals due April 17, 2023 (Anywhere on Earth)
- Task selection notification May 22, 2023
=== Preliminary timetable ===
- Sample data ready July 15, 2023
- Training data ready September 1, 2023
- Evaluation data ready December 1, 2023 (internal deadline; not for public
release)
- Evaluation starts January 10, 2024
- Evaluation end by January 31, 2024 (latest date; task organizers may
choose an earlier date)
- Paper submission due February 2024
- Notification to authors March 2024
- Camera-ready due April 2024
- SemEval workshop Summer 2024 (co-located with a major NLP conference)
Tasks that fail to keep up with crucial deadlines (such as the dates for
having the task and CodaLab website up and dates for uploading samples,
training, and evaluation data) may be cancelled at the discretion of
SemEval organizers. While consideration will be given to extenuating
circumstances, our goal is to provide sufficient time for the participants
to develop strong and well-thought-out systems. Cancelled tasks will be
encouraged to submit proposals for the subsequent year’s SemEval. To reduce
the risk of tasks failing to meet the deadlines, we are unlikely to accept
multiple tasks with overlap in the task organizers.
=== Submission Details ===
The task proposal should be a self-contained document of no longer than 3
pages (plus additional pages for references). All submissions must be in
PDF format, following the ACL template
<https://github.com/acl-org/acl-style-files>.
Each proposal should contain the following:
- Overview
  - Summary of the task
  - Why this task is needed and which communities would be interested in participating
  - Expected impact of the task
- Data & Resources
  - How the training/testing data will be produced. Please discuss whether existing corpora will be re-used.
  - Details of copyright, so that the data can be used by the research community both during the SemEval evaluation and afterwards
  - How much data will be produced
  - How data quality will be ensured and evaluated
  - An example of what the data would look like
  - Resources required to produce the data and prepare the task for participants (annotation cost, annotation time, computation time, etc.)
  - Assessment of any concerns with respect to ethics, privacy, or security (e.g. personally identifiable information of private individuals; potential for systems to cause harm)
- Pilot Task (strongly recommended)
  - Details of the pilot task
  - What lessons were learned and how these will impact the task design
- Evaluation
  - The evaluation methodology to be used, including clear evaluation criteria
- For Task Reruns
  - Justification for why a new iteration of the task is needed (see criteria above)
  - What will differ from the previous iteration
  - Expected impact of the rerun compared with the previous iteration
- Task organizers
  - Names, affiliations, email addresses
  - (optional) brief description of relevant experience or expertise
  - (if applicable) years and task numbers of any SemEval tasks you have run in the past
Proposals will be reviewed by an independent group of area experts who may
not have familiarity with recent SemEval tasks, and therefore all proposals
should be written in a self-explanatory manner and contain sufficient
examples.
The submission webpage is:
https://openreview.net/group?id=aclweb.org/ACL/2023/Workshop/SemEval
=== Chairs ===
Atul Kr. Ojha, SFI Insight Centre for Data Analytics, DSI, University of
Galway
A. Seza Doğruöz, Ghent University
Giovanni Da San Martino, University of Padua
Harish Tayyar Madabushi, The University of Bath
Ritesh Kumar, Dr. Bhimrao Ambedkar University
Contact: semevalorganizers(a)gmail.com
LREC-COLING 2024 Announcement
LREC-COLING 2024 - The 2024 Joint International Conference on
Computational Linguistics, Language Resources and Evaluation
Lingotto Conference Centre - Turin (Italy)
20-25 May, 2024
Conference website: https://lrec-coling-2024.lrec-conf.org/
Twitter: @LrecColing2024
Two major international key players in the area of computational
linguistics, the ELRA Language Resources Association (ELRA) and the
International Committee on Computational Linguistics (ICCL), are joining
forces to organize the 2024 Joint International Conference on
Computational Linguistics, Language Resources and Evaluation
(LREC-COLING 2024) to be held in Turin (Italy) on 20-25 May, 2024.
The hybrid conference will bring together researchers and practitioners
in computational linguistics, speech, multimodality, and natural
language processing, with special attention to evaluation and the
development of resources that support work in these areas. Following in
the tradition of the well-established parent conferences COLING and
LREC, the joint conference will feature grand challenges and provide
ample opportunity for attendees to exchange information and ideas
through both oral presentations and extensive poster sessions,
complemented by a friendly social program.
The three-day main conference will be accompanied by a total of three
days of workshops and tutorials held in the days immediately before and
after.
*General Chairs*
Nicoletta Calzolari, CNR-ILC, Pisa
Min-Yen Kan, National University of Singapore
*Advisors to General Chairs*
Chu-Ren Huang, The Hong Kong Polytechnic University
Joseph Mariani, LISN-CNRS, Paris-Saclay University
*Programme Chairs*
Veronique Hoste, Ghent University
Alessandro Lenci, University of Pisa
Sakriani Sakti, Japan Advanced Institute of Science and Technology
Nianwen Xue, Brandeis University
*Management Chair*
Khalid Choukri, ELDA/ELRA, Paris
*Local Chairs*
Valerio Basile, University of Turin
Cristina Bosco, University of Turin
Viviana Patti, University of Turin
Job advertisement!
TurkuNLP (Natural Language Processing) is a multidisciplinary research group combining NLP and digital linguistics. We develop machine learning methods and tools to automatically process and understand text data and apply these to explore human interaction, communication and language use in very large digital text datasets such as those automatically crawled from the internet and historical text collections.
We invite applications for postdoctoral researcher positions. The postdocs recruited will work within our research projects on web-as-corpus research, corpus linguistics, and NLP, on topics such as human diversity, multilingual modeling of web genres (registers), and semantic search.
For more details and to leave an application, please see job ID 14647 at https://www.utu.fi/en/university/come-work-with-us/open-vacancies and visit our websites at turkunlp.org and https://sites.utu.fi/humandiversity/. I am also happy to answer any questions you might have, please don't hesitate to contact me!
The postdocs are expected to begin their employment on 1 May 2023, or as soon as possible thereafter by agreement.
Best regards,
Veronika Laippala
Dear list members,
I am delighted to announce the latest publication in the Elements in Corpus Linguistics series, published by Cambridge University Press. The title is "Corpus-Assisted Discourse Studies", and the authors are Mathew Gillings, Gerlinde Mautner and Paul Baker. This Element is now available FREE until 4 April 2023 at the following URL:
https://www.cambridge.org/core/search?q=9781009168151
Here is a summary of the Element:
"The breadth and spread of corpus-assisted discourse studies (CADS) indicate its usefulness for exploring language use within a social context. However, its theoretical foundations, limitations, and epistemological implications must be considered so that we can adjust our research designs accordingly. This Element offers a compact guide to which corpus linguistic tools are available and how they can contribute to finding out more about discourse. It will appeal to researchers both new and experienced, within the CADS community and beyond."
Best wishes
Susan Hunston (Series Editor)
Professor Susan Hunston (she/her)
Department of English Language and Linguistics
University of Birmingham
Birmingham B15 2TT
UK
(+44) 0121 414 5675
s.e.hunston(a)bham.ac.uk
Hi Luis,
How are you?
I just saw on Corpora-List that you are fully immersed in chatbot topics.
You may already have received the info: we are organizing a task that might
interest you and your group.
I hope you will take part ;-)
Regards,
Paolo
-----
*Apologies for cross-posting*
Do you believe machine-generated text is becoming an issue? Are you
interested in boosting research to automatically detect machine-generated
text? 🤖👩🏻
We cordially invite all researchers and practitioners from all fields
to participate in the AuTexTification task. If interested, register
for the shared task through this link: https://lnkd.in/dzBZsYiD
Once you have registered and the training phase has started, the datasets
will be sent to your email along with a password. More information
regarding the task description, schedule, and submissions can be found on
the AuTexTification web page: https://sites.google.com/view/autextification
More information on the shared task
The new era of automatic content generation has been driven by powerful
causal language models like GPT, PaLM, or BLOOM, which can be used to
spread untruthful news, human-looking reviews, or opinions. Thus, it is
imperative to develop technology to automatically detect generated text for
content moderation, and to attribute generated text to specific models in
order to protect intellectual property or to assign responsibilities. In
this context, we propose the "Automatic Text Identification"
(AuTexTification) shared task, to boost research and development of
automatic systems for detecting automatically generated text, produced by
state-of-the-art language models, in English and Spanish.
We propose two subtasks: (i) Human or Generated, where, given a text,
participants will have to determine whether it has been automatically
generated or not; and (ii) Model Attribution, where participants will have
to determine which model generated a given text. The models used to
generate the texts have increasing numbers of neural parameters, ranging
from 2 billion to 175 billion, so participants' systems should be versatile
enough to detect a diverse set of text generation models and writing
styles.
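To make the setup concrete, here is a rough sketch of one possible starting
point for subtask 1 (this is not an official baseline of the task); the toy
texts, labels, and scikit-learn setup are our own assumptions.

    # Hypothetical starting point for human-vs-generated detection,
    # using scikit-learn; the toy data below is invented for illustration.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    train_texts = [
        "Texto de ejemplo escrito por una persona.",
        "Example text produced by a language model.",
    ]
    train_labels = ["human", "generated"]

    # Character n-grams often capture surface artifacts of generated text.
    detector = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
        LogisticRegression(max_iter=1000),
    )
    detector.fit(train_texts, train_labels)
    print(detector.predict(["Another unseen text to classify."]))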
In the training phase, participants will be provided with two
partitions for subtask 1, i.e., English and Spanish partitions, with
binary labels 👩🏻 and 🤖. Similarly, a partition per language will be
released for subtask 2. It will include six labels (A, B, C, D, E, and
F), each label representing a text generation model. Later, the
unlabeled test data will be released.
Important Dates
March 22, 2023: Release of training data
April 21, 2023: Release of test data
May 10, 2023: Participant system results submission
May 17, 2023: Results notification
June 3, 2023: Paper submission
June 16, 2023: Paper review notification
July 4, 2023: Camera-ready paper version
September 26, 2023: Conference
Task organizers
José Ángel González (Symanto) Contact Email: jose.gonzalez(a)symanto.com
Areg Sarvazyan (Symanto) Contact Email: areg.sarvazyan(a)symanto.com
Marc Franco-Salvador (Symanto)
Francisco Rangel (Symanto)
Berta Chulvi (Universitat Politècnica de València)
Paolo Rosso (Universitat Politècnica de València)
Please reach out to the organizers or join the Slack workspace to
connect with the other participants and organizers:
https://lnkd.in/di_zaMHf
The Digital Linguistics Lab at Bielefeld University (head: JProf. Dr.-Ing. Hendrik Buschmeier) is seeking to fill a research position (PhD student, E13 TV-L, 100%, fixed-term) in the area of multimodal human-robot interaction in the research project "Hybrid Living".
Join us to work in an interdisciplinary team on research questions at the intersection of human-robot interaction and computational linguistics. Specifically, you will work (1) on the use of multimodal communication (verbal and nonverbal) to situationally instruct a service robot, (2) on making the robot's behaviour transparent to its users, and (3) on models for solving human-robot interaction problems through communication.
The formal job advertisement, with information on how to apply, can be found here: https://uni-bielefeld.hr4you.org/job/view/2265/research-position-in-multimo…
Questions? Don't hesitate to get in touch: hbuschme(a)uni-bielefeld.de
Hendrik Buschmeier
--
JProf. Dr.-Ing. Hendrik Buschmeier
Digital Linguistics Lab
Faculty of Linguistics and Literary Studies, Bielefeld University
https://purl.org/net/hbuschme