[Apologies for multiple postings]
ImageCLEFmedicalGANs (1st edition)
Registration: https://www.imageclef.org/2023/medical/gans
Run submission: May 10, 2023
Working notes submission: June 5, 2023
CLEF 2023 conference: September 18-21, Thessaloniki, Greece
*** CALL FOR PARTICIPATION ***
The task focuses on examining the hypothesis that GANs generate medical
images containing the "fingerprints" of the real images used to train
the generative network. If the hypothesis is correct, artificial
biomedical images may be subject to the same sharing and usage
limitations as real sensitive medical data. If, on the other hand, the
hypothesis is wrong, GANs could potentially be used to create rich
biomedical image datasets free of ethical and privacy restrictions.
Participants will test the hypothesis by solving one or several tasks
related to detecting relations between real and artificial biomedical
image datasets.
*** TASK ***
Given a set of real-world medical images comprising 2D axial CT image
slices of the heart (including the middle sections and adjacent
slices) of patients afflicted with lung tuberculosis, the task
challenges participants to develop machine learning solutions that
automatically determine which real images were used to train the
generator of realistic synthetic examples.
*** DATA SET ***
The image datasets comprise 2D axial CT image slices of the heart,
including the middle sections of the heart and adjacent slices. The
images are obtained from patients afflicted with lung tuberculosis and
are stored as 8 bit/pixel PNG images of 256x256 pixels. The development
dataset comprises three distinct sets of images: one set of images
generated using a GAN, and two sets of real images. The first real set
contains images that were used during the generator's training; the
second consists of real images that were not used during training. The
test dataset is a collection of two image sets: the first contains
10,000 generated images, while the second is a mix of 200 real images
that were either used or unused during training.
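The membership-detection idea behind the task can be sketched with a simple nearest-neighbour baseline: score each real image by its distance to the closest generated image, on the assumption that memorised training images end up unusually close to some generated output. This is an illustrative sketch only, not an official baseline; the flattened image vectors, toy data, and threshold are assumptions.

```python
import numpy as np

def membership_scores(real_images, generated_images):
    """For each real image (a flat float vector), return the distance to
    its nearest neighbour in the generated set. Low scores hint that the
    generator may have memorised that image."""
    return np.array([
        np.linalg.norm(generated_images - r, axis=1).min()
        for r in real_images
    ])

# Toy stand-ins for 256x256 CT slices: 4 "real" and 3 "generated" vectors.
rng = np.random.default_rng(0)
real = rng.random((4, 8))
# Make the first generated image a near-copy of real[0], mimicking memorisation.
generated = np.vstack([real[0] + 0.01, rng.random((2, 8))])

scores = membership_scores(real, generated)
used = scores < 0.1  # threshold is an arbitrary illustrative choice
```

A real submission would of course compare learned representations (e.g. features from a pretrained network) rather than raw pixels, and would calibrate the threshold on the development sets.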
*** IMPORTANT DATES ***
- Run submission: May 10, 2023
- Working notes submission: June 5, 2023
- CLEF 2023 conference: September 18-21, Thessaloniki, Greece
(https://clef2023.clef-initiative.eu/)
*** OVERALL COORDINATION ***
Serge Kozlovski, Belarusian Academy of Sciences, Belarus
Vassili Kovalev, Belarusian Academy of Sciences, Belarus
Ihar Filipovich, Belarus State University, Belarus
Alexandra Andrei, Politehnica University of Bucharest, Romania
Ioan Coman, Politehnica University of Bucharest, Romania
Bogdan Ionescu, Politehnica University of Bucharest, Romania
Henning Müller, University of Applied Sciences Western Switzerland, Switzerland
*** ACKNOWLEDGEMENT ***
The contributions of Alexandra Andrei, Ioan Coman, Bogdan Ionescu, and
Henning Müller are supported under the H2020 AI4Media project, "A
European Excellence Centre for Media, Society and Democracy", contract
#951911, https://www.ai4media.eu/.
On behalf of the Organizers,
Bogdan Ionescu
https://www.AIMultimediaLab.ro/
3 July 2023 (online, via Zoom)
Organisers: Catherine Travis & Li Nguyen (ANU; Language Data Commons of Australia (LDaCA))
Over decades of work in Australia, significant collections of language data have been amassed, including varieties of Australian English, Australian migrant languages, Australian Indigenous languages, sign languages and others. These collections represent a trove of knowledge not only of language in Australia, but also of Australia's social and cultural history. And yet, not all are well known and many lack published descriptions. The purpose of this workshop is to provide an opportunity to share information about existing language corpora in Australia, with a view to producing a special issue of the Australian Journal of Linguistics that introduces a selection of these corpora, explores how they can contribute to our understanding of language, society, and history in Australia, and considers avenues that such corpora open up for future research.
This workshop is being run as part of the Language Data Commons of Australia (LDaCA), which is working to build national research infrastructure for the Humanities and Social Sciences, facilitating access to and use of digital language corpora for linguists, scholars across the Humanities and Social Sciences, and non-academics.
Abstract submission
For a 20 min presentation, please submit a 250-300 word abstract in English (excluding references). The presentation should include the following information:
· Speech community/fieldsite: Describe the location of the community and/or their brief history in Australia, the languages spoken and their current status.
· Corpus design principles: Specify the sample size, sociolinguistic background of the participants, method of data collection and/or genre (e.g. sociolinguistic interviews, natural conversations, oral histories, elicited data, etc.); data format (written/spoken/audio/video, etc.) and where it is stored.
· Corpus findings and implications: Summarise some key findings from the corpus and discuss other insights that might be obtained from the data in current or future work.
Important dates
22 May Abstracts due
5 June Notification of acceptance
3 July Workshop
How to Submit: Please submit your abstract by 22 May on https://forms.gle/1pwxVVmUV5hCCZ997
Inquiries: Please contact either Catherine Travis or Li Nguyen
CALL FOR PARTICIPATION
IBERLEF 2023 Task - FinancES. Financial Targeted Sentiment Analysis in
Spanish
Training set released!
Held as part of iberLEF 2023 <https://sites.google.com/view/iberlef-2023>,
a shared evaluation campaign for Natural Language Processing (NLP) systems
in Spanish and other Iberian languages
September 26th 2023, Jaén
Codalab link: https://codalab.lisn.upsaclay.fr/competitions/10052
Dear All,
We invite researchers and students to participate in the shared task
FinancES: Financial Targeted Sentiment Analysis in Spanish, held as
part of IberLEF 2023, a shared evaluation campaign for Natural
Language Processing (NLP) systems in Spanish and other Iberian
languages.
This shared task aims to explore targeted sentiment analysis in the
financial domain. Specifically, the approach adopted here is grounded in
the field of microeconomics. In this regard, Bowles (2004) explains the
role of economic agents, that is to say, individuals or organizations
impacting the economy. The author states that the main microeconomic agents
in the capital market are consumers (households/individuals), companies
(firms), governments, and central banks. Consequently, in order to develop
a sentiment analysis method where different viewpoints are considered,
three different perspectives are included: (1) the economic target of
the news item; (2) companies as an individual economic agent; and (3)
consumers as an individual economic agent. Here, the target is the
sector where the economic fact applies, and companies produce the goods
and services that households/individuals consume. From these three
viewpoints, the impact of a news item on the target and on the economic
agents is classified as positive, negative, or neutral. Accordingly,
two tasks are proposed. The first combines the challenges of
aspect-term extraction, for identifying the target entity in a text,
and aspect-based sentiment classification, for determining the
sentiment polarity towards that target. The second is devoted to
assessing the impact of a news headline on the other economic agents,
namely companies and consumers.
The participants will be provided development, development_test, training
and test datasets in Spanish. The dataset for this task is composed of news
headlines written in Spanish collected from digital newspapers specialized
in economic, financial and political news. The dataset is labeled with the
target entity and the sentiment polarity on three dimensions: target,
companies, and consumers. That is, given a headline, it has been manually
classified as positive, neutral, or negative for three specific entities:
(1) target entity (i.e., the specific company or asset where the economic
fact applies), (2) companies (i.e., the entities producing the goods and
services that others consume), and (3) consumers (i.e.,
households/individuals). Each headline was annotated by three members
of the organizing committee. In case of disagreement, the annotators
discussed the case and, if no agreement was reached, the headline was
discarded. In this first step we compiled about 14k headlines;
headlines that were too short or did not specify a target entity were
filtered out. The final dataset is composed of 8k-10k news headlines.
For the shared tasks, training and test sets will be released
(80%-20%). Today we have released the training dataset, which can be
found in the "Files" subsection of the "Participate" tab. Note that it
includes all the instances released during the Practice stage, so there
is no need to combine the two datasets.
Finally, remember that the CodaLab competition remains open for
submitting results on the provided development dataset, which is
available in the same section as the training dataset.
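As a concrete illustration of the three-dimensional labelling described above, here is a minimal sketch of a record layout together with a trivial all-neutral baseline. The field names, the example headline, and its labels are invented for illustration and do not reflect the official data schema.

```python
from dataclasses import dataclass

@dataclass
class Headline:
    """One annotated headline; the three sentiment fields mirror the
    target/companies/consumers dimensions described in the call."""
    text: str
    target_entity: str
    target_sentiment: str     # polarity for the target sector/asset
    companies_sentiment: str  # polarity for companies as an economic agent
    consumers_sentiment: str  # polarity for consumers (households/individuals)

example = Headline(
    text="El precio de la electricidad baja un 10% este mes",
    target_entity="electricidad",
    target_sentiment="negative",    # illustrative labels only
    companies_sentiment="positive",
    consumers_sentiment="positive",
)

def all_neutral_baseline(headlines):
    """Trivial baseline: predict 'neutral' on all three dimensions."""
    return [("neutral", "neutral", "neutral") for _ in headlines]

preds = all_neutral_baseline([example])
```

Such a baseline only serves to anchor the evaluation; the point of the task is precisely that the three dimensions can diverge, as in the example above, where the same headline carries different polarities for different agents.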
Best regards,
The FinancES 2023 organizing committee
References
- Bowles, S. (2004). Microeconomics: Behavior, institutions, and
evolution. Princeton University Press.
Important dates
- Release of development corpora: Feb 13, 2023
- Release of training corpora: Mar 13, 2023
- Release of test corpora and start of evaluation campaign: Apr 17, 2023
- End of evaluation campaign (deadline for runs submission): May 3, 2023
- Publication of official results: May 5, 2023
- Paper submission: May 28, 2023
- Review notification: Jun 16, 2023
- Camera ready submission: Jul 6, 2023
- IberLEF Workshop (SEPLN 2023): Sep 26, 2023
- Publication of proceedings: Sep ??, 2023
Organizing committee
- José Antonio García-Díaz (UMUTeam, Universidad de Murcia)
- Ángela Almela Sánchez-Lafuente (UMUTeam, Universidad de Murcia)
- Francisco García-Sánchez (UMUTeam, Universidad de Murcia)
- Gema Alcaraz-Mármol (UMUTeam, Universidad de Castilla-La Mancha)
- María José Marín (UMUTeam, Universidad de Murcia)
- Rafael Valencia-García (UMUTeam, Universidad de Murcia)
The Chair of Data Science and Natural Language Processing (Prof. Siegfried Handschuh) invites applications for a PhD position as part of the recently granted Swiss National Science Foundation (SNF) research project "Conversational AI: Dialogue-based Adaptive Argumentative Writing Support". You will contribute to its successful implementation, as well as the chair's varied activities in teaching and outreach.
Our research at the Data Science and NLP Chair at ICS-HSG focuses on the cutting-edge field of Natural Language Processing (NLP) and Natural Language Understanding (NLU). Our group delves into various aspects, including conversational AI, Large Language Models (LLM), intricate language analysis, Knowledge Graphs and more. Utilising sophisticated methods and algorithms, we aim to extract deeper knowledge and understanding from vast amounts of textual and auditory data. Our research not only delves into the core principles of NLP but also explores its practical applications across various industries.
Details of the post and how to apply: https://bit.ly/phd_conversational_ai
You have a university-level master's degree in Computer Science, Computational Linguistics, Data Science or a related discipline. You have a strong background in Machine Learning and Natural Language Processing; knowledge of Deep Learning architectures (e.g., Transformer, BERT) and frameworks (e.g., PyTorch, TensorFlow) is a plus. You have good programming skills in Python. Desirable qualifications include one or more of the following areas: Chatbots, Argument Mining and Text Generation. You have excellent written and verbal communication skills in English. You have an analytical, structured and independent working style, as well as initial experience in writing scientific papers.
--
Prof. Dr. Siegfried Handschuh
Full Professor of Data Science and Natural Language Processing
Director, Institute of Computer Science
University of St.Gallen
E-mail: siegfried.handschuh(a)unisg.ch
Hi,
This is an invitation to attend the NLP Challenge Day organized by
CERIST, Algeria, on March 29, 2023.
The link to attend the event is:
https://visioconf.cerist.dz/CERISTNLPChallenge
The program is available at:
http://www.nlpchallenge.cerist.dz/
Thank you.
Best regards.
Hassina Aliane.
Dr. Hassina Aliane, Director of Research
Head of the Natural Language Processing and Digital Content Team,
Director of the Digital Humanities R&D laboratory,
Editor of the Information Processing at The Digital Age Review,
Research Center on Scientific and Technical Information.
The 6th ASAIL workshop, focused on Natural Language Processing for legal texts and co-located with ICAIL 2023 in Braga, Portugal, is coming up soon.
We would like to invite you to submit papers on, and demonstrations of, original work on automated detection, extraction and analysis of semantic information in legal texts.
Submission deadline: 26th April 2023
Workshop date: 23rd June 2023
We are accepting three tiers of papers (two-column format): long (10 pages), short (6 pages), and position (2 pages).
Since we are very interested in sparking discussion around ideas and work in their early stages, short and position papers are particularly welcome.
You can find more information, including the full call for papers, at our website: https://sites.google.com/view/asail/asail-2023-call-for-papers
Best wishes,
Daphne Odekerken
On behalf of the ASAIL Organising Committee
Apologies for cross-posting.
----------------------------------------
We invite proposals for tasks to be run as part of SemEval-2024
<https://semeval.github.io/SemEval2024/>. SemEval (the International
Workshop on Semantic Evaluation) <https://semeval.github.io/> is an ongoing
series of evaluations of computational semantics systems, organized under
the umbrella of SIGLEX <https://siglex.org/>, the Special Interest Group on
the Lexicon of the Association for Computational Linguistics.
SemEval tasks explore the nature of meaning in natural languages: how to
characterize meaning and how to compute it. This is achieved in practical
terms, using shared datasets and standardized evaluation metrics to
quantify the strengths and weaknesses of possible solutions. SemEval tasks
encompass a broad range of semantic topics from the lexical level to the
discourse level, including word sense identification, semantic parsing,
coreference resolution, and sentiment analysis, among others.
For SemEval-2024, we welcome any task that can test an automatic system for
the semantic analysis of text, which could be an intrinsic semantic
evaluation or an application-oriented evaluation. We especially encourage
tasks for languages other than English, cross-lingual tasks, and tasks that
develop novel applications of computational semantics. See the websites of
previous editions of SemEval to get an idea about the range of tasks
explored, SemEval-2022 <https://semeval.github.io/SemEval2022/> and
SemEval-2023 <https://semeval.github.io/SemEval2023/>.
We strongly encourage proposals based on pilot studies that have already
generated initial data, as this can provide concrete examples and can help
to foresee the challenges of preparing the full task. In the event of
receiving many proposals, preference will be given to proposals that have
already run a pilot study.
In case you are not sure whether a task is suitable for SemEval, please
feel free to get in touch with the SemEval organizers at
semevalorganizers(a)gmail.com to discuss your idea.
=== Task Selection ===
Task proposals will be reviewed by experts, and reviews will serve as the
basis for acceptance decisions. Everything else being equal, more
innovative new tasks will be given preference over task reruns. Task
proposals will be evaluated on:
- Novelty: Is the task on a compelling new problem that has not been
explored much in the community? Is the task a rerun, but covering
substantially new ground (new subtasks, new types of data, new languages,
etc.)?
- Interest: Is the proposed task likely to attract a sufficient number
of participants?
- Data: Are the plans for collecting data convincing? Will the resulting
data be of high quality? Will annotations have meaningfully high
inter-annotator agreements? Have all appropriate licenses for use and
re-use of the data after the evaluation been secured? Have all
international privacy concerns been addressed? Will the data annotation be
ready on time?
- Evaluation: Is the methodology for evaluation sound? Is the necessary
infrastructure available or can it be built in time for the shared task?
Will research inspired by this task be able to evaluate in the same manner
and on the same data after the initial task?
- Impact: What is the expected impact of the data in this task on future
research beyond the SemEval Workshop?
=== New Tasks vs. Task Reruns ===
We welcome both new tasks and task reruns. For a new task, the proposal
should address whether the task would be able to attract participants.
Preference will be given to novel tasks that have not received much
attention yet.
For reruns of previous shared tasks (whether or not the previous task was
part of SemEval), the proposal should address the need for another
iteration of the task. Valid reasons include: a new form of evaluation
(e.g. a new evaluation metric, a new application-oriented scenario), new
genres or domains (e.g. social media, domain-specific corpora), or a
significant expansion in scale. We further discourage carrying over a
previous task and just adding new subtasks, as this can lead to the
accumulation of too many subtasks. Evaluating on a different dataset with
the same task formulation, or evaluating on the same dataset with a
different evaluation metric, typically should not be considered a separate
subtask.
=== Task Organization ===
We welcome people who have never organized a SemEval task before, as well
as those who have. Apart from providing a dataset, task organizers are
expected to:
- Verify the data annotations have sufficient inter-annotator agreement
- Verify licenses for the data to allow its use in the competition and
afterwards. In particular, text that is publicly available online is not
necessarily in the public domain; unless a license has been provided, the
author retains all rights associated with their work, including copying,
sharing and publishing. For more information, see:
https://creativecommons.org/faq/#what-is-copyright-and-why-does-it-matter
- Resolve any potential security, privacy, or ethical concerns about the
data
- Make the data available in a long-term repository under an appropriate
license, preferably using Zenodo: https://zenodo.org/communities/semeval/
- Provide task participants with format checkers and standard scorers.
- Provide task participants with baseline systems to use as a starting
point (in order to lower the obstacles to participation). A baseline system
typically contains code that reads the data, creates a baseline response
(e.g. random guessing, majority class prediction), and outputs the
evaluation results. Whenever possible, baseline systems should be written
in widely used programming languages and/or should be implemented as a
component for standard NLP pipelines.
- Create a mailing list and website for the task and post all relevant
information there.
- Create a CodaLab or other similar competition for the task and upload the
evaluation script.
- Manage submissions on CodaLab or a similar competition site.
- Write a task description paper to be included in SemEval proceedings, and
present it at the workshop.
- Manage participants' submissions of system description papers, manage
participants' peer review of each other's papers, and possibly shepherd
papers that need additional help in improving the writing.
- Review other task description papers.
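As an example of the format checkers mentioned above, here is a minimal sketch that validates a tab-separated submission file; the two-column `id<TAB>label` layout and the label set are illustrative assumptions, not a SemEval-mandated format.

```python
import csv
import io

def check_submission(text,
                     expected_labels=frozenset({"positive", "neutral", "negative"})):
    """Return a list of error messages for lines that are not
    `id<TAB>label` with a label from the expected set (empty list = OK)."""
    errors = []
    for lineno, row in enumerate(csv.reader(io.StringIO(text), delimiter="\t"), 1):
        if len(row) != 2:
            errors.append(f"line {lineno}: expected 2 columns, got {len(row)}")
        elif row[1] not in expected_labels:
            errors.append(f"line {lineno}: unknown label {row[1]!r}")
    return errors
```

Distributing a checker like this alongside the scorer lets participants catch malformed runs before the submission deadline rather than after.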
=== Important dates ===
- Task proposals due April 17, 2023 (Anywhere on Earth)
- Task selection notification May 22, 2023
=== Preliminary timetable ===
- Sample data ready July 15, 2023
- Training data ready September 1, 2023
- Evaluation data ready December 1, 2023 (internal deadline; not for public
release)
- Evaluation starts January 10, 2024
- Evaluation end by January 31, 2024 (latest date; task organizers may
choose an earlier date)
- Paper submission due February 2024
- Notification to authors in March 2024
- Camera-ready due April 2024
- SemEval workshop Summer 2024 (co-located with a major NLP conference)
Tasks that fail to keep up with crucial deadlines (such as the dates for
having the task and CodaLab website up and dates for uploading samples,
training, and evaluation data) may be cancelled at the discretion of
SemEval organizers. While consideration will be given to extenuating
circumstances, our goal is to provide sufficient time for the participants
to develop strong and well-thought-out systems. Cancelled tasks will be
encouraged to submit proposals for the subsequent year’s SemEval. To reduce
the risk of tasks failing to meet the deadlines, we are unlikely to accept
multiple tasks with overlap in the task organizers.
=== Submission Details ===
The task proposal should be a self-contained document of no longer than 3
pages (plus additional pages for references). All submissions must be in
PDF format, following the ACL template
<https://github.com/acl-org/acl-style-files>.
Each proposal should contain the following:
- Overview
- Summary of the task
- Why this task is needed and which communities would be interested in
participating
- Expected impact of the task
- Data & Resources
- How the training/testing data will be produced. Please discuss whether
existing corpora will be re-used.
- Details of copyright, so that the data can be used by the research
community both during the SemEval evaluation and afterwards
- How much data will be produced
- How data quality will be ensured and evaluated
- An example of what the data would look like
- Resources required to produce the data and prepare the task for
participants (annotation cost, annotation time, computation time, etc.)
- Assessment of any concerns with respect to ethics, privacy, or security
(e.g. personally identifiable information of private individuals; potential
for systems to cause harm)
- Pilot Task (strongly recommended)
- Details of the pilot task
- What lessons were learned and how these will impact the task design
- Evaluation
- The evaluation methodology to be used, including clear evaluation
criteria
- For Task Reruns
- Justification for why a new iteration of the task is needed (see
criteria above)
- What will differ from the previous iteration
- Expected impact of the rerun compared with the previous iteration
- Task organizers
- Names, affiliations, email addresses
- (optional) brief description of relevant experience or expertise
- (if applicable) years and task numbers, of any SemEval tasks you have
run in the past
Proposals will be reviewed by an independent group of area experts who may
not have familiarity with recent SemEval tasks, and therefore all proposals
should be written in a self-explanatory manner and contain sufficient
examples.
The submission webpage is:
https://openreview.net/group?id=aclweb.org/ACL/2023/Workshop/SemEval
=== Chairs ===
Atul Kr. Ojha, SFI Insight Centre for Data Analytics, DSI, University of
Galway
A. Seza Doğruöz, Ghent University
Giovanni Da San Martino, University of Padua
Harish Tayyar Madabushi, The University of Bath
Ritesh Kumar, Dr. Bhimrao Ambedkar University
Contact: semevalorganizers(a)gmail.com
*** LREC-COLING 2024 Announcement ***
LREC-COLING 2024 - The 2024 Joint International Conference on
Computational Linguistics, Language Resources and Evaluation
Lingotto Conference Centre - Turin (Italy)
20-25 May, 2024
Conference website: https://lrec-coling-2024.lrec-conf.org/
Twitter: @LrecColing2024
Two key international players in the area of computational linguistics,
the ELRA Language Resources Association (ELRA) and the International
Committee on Computational Linguistics (ICCL), are joining forces to
organize the 2024 Joint International Conference on Computational
Linguistics, Language Resources and Evaluation (LREC-COLING 2024), to
be held in Turin (Italy) on 20-25 May, 2024.
The hybrid conference will bring together researchers and practitioners
in computational linguistics, speech, multimodality, and natural
language processing, with special attention to evaluation and the
development of resources that support work in these areas. Following in
the tradition of the well-established parent conferences COLING and
LREC, the joint conference will feature grand challenges and provide
ample opportunity for attendees to exchange information and ideas
through both oral presentations and extensive poster sessions,
complemented by a friendly social program.
The three-day main conference will be accompanied by a total of three
days of workshops and tutorials held in the days immediately before and
after.
*General Chairs*
Nicoletta Calzolari, CNR-ILC, Pisa
Min-Yen Kan, National University of Singapore
*Advisors to General Chairs*
Chu-Ren Huang, The Hong Kong Polytechnic University
Joseph Mariani, LISN-CNRS, Paris-Saclay University
*Programme Chairs*
Veronique Hoste, Ghent University
Alessandro Lenci, University of Pisa
Sakriani Sakti, Japan Advanced Institute of Science and Technology
Nianwen Xue, Brandeis University
*Management Chair*
Khalid Choukri, ELDA/ELRA, Paris
*Local Chairs*
Valerio Basile, University of Turin
Cristina Bosco, University of Turin
Viviana Patti, University of Turin
Job advertisement!
TurkuNLP (Natural Language Processing) is a multidisciplinary research group combining NLP and digital linguistics. We develop machine learning methods and tools to automatically process and understand text data, and we apply these to explore human interaction, communication and language use in very large digital text datasets, such as those automatically crawled from the internet and historical text collections.
We invite applications for postdoctoral researcher positions. The postdocs recruited will work within our research projects on web-as-corpus research, corpus linguistics and NLP, on topics such as human diversity, multilingual modeling of web genres (registers), and semantic search.
For more details and to leave an application, please see job ID 14647 at https://www.utu.fi/en/university/come-work-with-us/open-vacancies and visit our websites at turkunlp.org and https://sites.utu.fi/humandiversity/. I am also happy to answer any questions you might have, please don't hesitate to contact me!
The postdocs are expected to begin their employment on 1 May 2023, or as soon as possible thereafter by agreement.
Best regards,
Veronika Laippala
Dear list members,
I am delighted to announce the latest publication in the Elements in Corpus Linguistics series, published by Cambridge University Press. The title is "Corpus-Assisted Discourse Studies", and the authors are Mathew Gillings, Gerlinde Mautner and Paul Baker. This Element is now available FREE until 4 April 2023 at the following URL:
https://www.cambridge.org/core/search?q=9781009168151
Here is a summary of the Element:
"The breadth and spread of corpus-assisted discourse studies (CADS) indicate its usefulness for exploring language use within a social context. However, its theoretical foundations, limitations, and epistemological implications must be considered so that we can adjust our research designs accordingly. This Element offers a compact guide to which corpus linguistic tools are available and how they can contribute to finding out more about discourse. It will appeal to researchers both new and experienced, within the CADS community and beyond."
Best wishes
Susan Hunston (Series Editor)
Professor Susan Hunston (she/her)
Department of English Language and Linguistics
University of Birmingham
Birmingham B15 2TT
UK
(+44) 0121 414 5675
s.e.hunston(a)bham.ac.uk