Apologies for cross-posting.
----------------------------------------
*The International Conference on Spoken Language Translation*
*21st IWSLT 2024 – **Second** Call for Participation*
*August 15-16, 2024 – Bangkok, Thailand*
*http://iwslt.org/*
The International Conference on Spoken Language Translation (IWSLT) is the
premier annual conference for all aspects of Spoken Language Translation.
Every year, the conference organizes and sponsors open evaluation campaigns
around key challenges in simultaneous and consecutive translation, under
real-time/low latency or offline conditions and under low-resource or
multilingual constraints. System descriptions and results from
participants’ systems and scientific papers related to key algorithmic
advances and best practices are presented.
IWSLT is the venue of SIGSLT, the Special Interest Group on Spoken
Language Translation of ACL, ISCA and ELRA. With a track record of 20
years, IWSLT benchmarks and proceedings serve as reference for all
researchers and practitioners working on speech translation and related
fields.
The 21st edition of IWSLT <https://iwslt.org/2024/> will be run as an
*ELRA/ACL* event and co-located with ACL 2024 <https://2024.aclweb.org/> on
August 15-16, 2024. It will be run as a hybrid event.
Important Dates
January 15, 2024: Release of shared task training and dev data
April 01-15, 2024: Evaluation period
April 29, 2024: Paper submission due (all papers)
June 4, 2024: Notification of acceptance
June 24, 2024: Camera-ready paper due
July 22, 2024: Pre-recorded video due
August 15-16, 2024: Conference
Evaluation
The IWSLT 2024 features shared tasks <https://iwslt.org/2024/#shared-tasks>
that address the following focus areas:
- Speech-to-speech track
- Simultaneous track
- Subtitling track
- Offline track
- Dubbing track
- Low-resource track
- Indic track
Training, development and test data for each shared task will be prepared
and released by the respective organizers (for further information on this
initiative, please refer to the website <https://iwslt.org/2024/>).
Participants will receive instructions about how to submit their runs. In
addition, participants have the opportunity to present their work through a
system paper that will be published in the ACL Proceedings.
Conference
IWSLT also invites submissions of scientific papers to be published in the
ACL Proceedings and presented either in oral or poster format. The
conference selects high-quality, original contributions on theoretical and
practical issues of spoken language translation research, technologies and
applications. For further information on this initiative, please refer to
the website <https://iwslt.org/2024/#paper-submission>.
Contact
Please send an email to iwslt-evaluation-campaign(a)googlegroups.com if you
have any questions related to the shared tasks.
Thanks,
Marine, Marcello, Alex, Jan, Sebastian, Elizabeth, Atul
(IWSLT organisers)
The International Congress of Linguists (ICL) is organized once every five years as the meeting place for international linguistics, where all areas and sub-disciplines of linguistics as well as interdisciplinary topics can be discussed. Its 21st edition (https://icl2024poznan.pl/) will be held from 8 to 14 September 2024 in Poznań and now invites abstracts for Sections, Focus streams, and Workshops.
Call for Abstracts: Corpus Linguistics
Focus stream 8 invites abstracts of papers that examine the methods and applications of corpus linguistics. Topics may include the design and construction of corpora, the analysis and interpretation of corpus data, the use of corpus tools and software, and the implications of corpus findings for various linguistic domains and disciplines. The focus stream also explores the challenges and opportunities of corpus linguistics in the era of big data, artificial intelligence, and natural language processing.
Abstracts should clearly state the research question(s), approach, method, data, and (expected) results. They should not display the names of the presenters, nor their affiliations or addresses, or any other information that could reveal their authorship. They should contain the title, five keywords, and a text between 300 and 400 words (including examples, excluding references).
Each abstract will be reviewed anonymously by two reviewers (section/focus stream/workshop convenor + external reviewer).
Important dates
Feb 1, 2024: (Extended) submission deadline (12.00 PM CET). Submission link: https://easychair.org/conferences/?conf=icl2024poznan
Apr 15, 2024: Notification of acceptance.
Sep 11, 2024: Focus stream date
Presentations and posters
Authors may apply, upon abstract submission, for a presentation or a poster. Presentations will be organized in 30-minute slots (20 min. presentation, 7 min. discussion, 3 min. room change). Posters are always displayed for one full day. Separate time slots will be included in the program in which participants can discuss the posters with their presenters.
Best regards –
Maciej Ogrodniczuk
Convenor of FS8: Corpus Linguistics at ICL 2024
The fifth workshop on Resources for African Indigenous Languages (RAIL)
Co-located with LREC-COLING 2024
https://bit.ly/rail2024
Conference dates: 20-25 May 2024
Workshop date: 25 May 2024
Venue: Lingotto Conference Centre, Torino (Italy)
The fifth RAIL workshop website: https://bit.ly/rail2024
LREC-COLING 2024 website: https://lrec-coling-2024.org/
Submission website: https://softconf.com/lrec-coling2024/rail2024/
The fifth Resources for African Indigenous Languages (RAIL) workshop
will be co-located with LREC-COLING 2024 in Lingotto Conference Centre,
Torino, Italy on 25 May 2024. The RAIL workshop is an interdisciplinary
platform for researchers working on resources (data collections, tools,
etc.) specifically targeted towards African indigenous languages. In
particular, it aims to create the conditions for the emergence of a
scientific community of practice that focuses on data, as well as
computational linguistic tools specifically designed for or applied to
indigenous languages found in Africa.
Many African languages are under-resourced while only a few of them are
somewhat better resourced. These languages often share interesting
properties such as writing systems, or tone, making them different from
most high-resourced languages. From a computational perspective, these
languages lack enough corpora to undertake high level development of
Human Language Technologies (HLT) and Natural Language Processing (NLP)
tools, which in turn impedes the development of African languages in
these areas. During previous workshops, it has become clear that the
problems and solutions presented are not only applicable to African
languages but are also relevant to many other low-resource languages.
Because these languages share similar challenges, this workshop
provides researchers with opportunities to work collaboratively on
issues of language resource development and learn from each other.
The RAIL workshop has several aims. First, the workshop brings together
researchers who work on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Creating resources for less-resourced languages” as
its theme, but submissions on any topic related to properties of
African indigenous languages (including non-African languages) may be
accepted. Suggested topics include (but are not limited to) the
following:
* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous
languages
* Building resources for (under resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African
indigenous languages
* Revealing unknown or unpublished existing resources for African
indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African
indigenous language resources
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content plus additional pages of references. The
final camera-ready versions of accepted long papers are allowed one
additional page of content (up to 9 pages) so that reviewers’ feedback
can be incorporated. Papers should be formatted according to the
LREC-COLING style sheet (https://lrec-coling-2024.org/authors-kit/), which
is provided on the LREC-COLING 2024 website
(https://lrec-coling-2024.org/). Reviewing is double-blind, so make
sure to anonymise your submission (e.g., do not provide author names,
affiliations, project names, etc.). Limit the number of self-citations
(anonymised citations should not be used). The RAIL workshop follows
the LREC-COLING submission requirements.
Please submit papers in PDF format to the START account
(https://softconf.com/lrec-coling2024/rail2024/). Accepted papers will
be published in proceedings linked to the LREC-COLING conference.
Important dates:
Submission deadline: 16 February 2024
Date of notification: 15 March 2024
Camera ready deadline: 29 March 2024
RAIL workshop: 25 May 2024
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
[Apologies for cross-postings]
CALL FOR PAPERS FOR THE
SECOND INTERNATIONAL WORKSHOP TOWARDS DIGITAL LANGUAGE EQUALITY (TDLE):
FOCUSING ON SUSTAINABILITY
co-located with LREC-COLING 2024, Saturday 25th May 2024, Turin (Italy)
https://european-language-equality.eu/tdle-2024/
1 DESCRIPTION AND AIMS OF THE WORKSHOP
The key aim of this half-day workshop co-located with LREC-COLING 2024
(https://lrec-coling-2024.org/), to be held in Turin (Italy) on Saturday
25th May 2024, is to discuss and promote the importance of
sustainability in the design, development, creation, use, distribution
and sharing of language data, resources, platforms, infrastructures,
tools and technologies, with the intention of achieving Digital Language
Equality (DLE). While some important work has recently addressed these
crucial areas (e.g. Fort and Couillault, 2016; Hessenthaler et al.,
2022; Ramesh et al., 2023; Castilho et al., forthcoming), the relevant
contributions seem to be as yet unsystematic and relatively isolated.
The workshop intends to provide an inclusive forum to encourage in-depth
debate and facilitate collaborations to promote the sustainability of
resources and technologies in any (combination of) languages, in support
of multilingualism and of the overarching goal of DLE.
_The sustainability of language resources and technologies is key to
enabling multilingualism and digital language equality in the age of
Artificial Intelligence._
2 TOPICS OF INTEREST
The _Second International Workshop Towards Digital Language Equality
(TDLE)_ focuses on sustainability in relation to the design,
development, creation, use, distribution and sharing of language data,
resources, platforms, infrastructures, tools and technologies, with a
view to promoting the broader goal of Digital Language Equality (DLE).
The concept of DLE has been firmly established in relation to all
languages of Europe (Rehm and Way, 2023), and has the potential to also
benefit other languages throughout the world, to support the prosperity
of the respective communities at a time of impressive - but as yet very
unevenly distributed and severely imbalanced - progress in
language-centric Artificial Intelligence (AI), e.g. through large
language models (LLMs). The workshop places particular emphasis on
multilingualism and on leveling up digital support for languages,
domains and applications that have so far been underserved, and wishes
to explore ways to develop policies and funding streams to work towards
sustainability in connection with DLE, especially in support of
regional, minority and territorial languages.
To this end, recognizing that the sustainability of Language Resources
and Technologies (LRTs) is key to enabling multilingualism and DLE in
the age of AI, topics of particular interest for the workshop on which
we invite original contributions covering any (combination of) languages
include, but are not limited to, the following:
* research on the factors affecting DLE and the sustainability of
LRTs;
* best practices, case studies and validated guidelines related to the
design, implementation and improvement of sustainability of written,
oral/spoken, signed and/or multimodal LRTs (including LLMs),
particularly in support of DLE;
* how multilingual LLM technology can support DLE;
* retrospectively assessing the sustainability of legacy LRTs, and
future-proofing new LRTs in the interest of DLE;
* analyzing the costs and benefits of foregrounding sustainability for
LRTs;
* the role of metadata, accompanying documentation and licenses in
showing and improving the sustainability of LRTs;
* sustainability, fairness and accessibility (e.g. for users with
physical or cognitive disabilities, limited computing resources and
connectivity) of platforms and infrastructures hosting, distributing and
sharing LRTs in the interest of DLE;
* how current data and computing access inequality is affecting DLE
(in particular regarding LLMs);
* ecological sustainability and environmental fairness of developing
and deploying state-of-the-art LRTs, e.g. LLMs with regard to energy
consumption, global warming and climate change;
* developing data and parameter efficient methods to train or adapt
language models to new languages;
* how to evaluate, measure, compare and improve the sustainability of
LRTs;
* establishing benchmarks and protocols to ensure the sustainability
of LRTs;
* how to avoid the potential dangers of developing and using _un_fair
and _un_sustainable LRTs, e.g. for malicious, ill-intentioned or harmful
purposes;
* ethical, legal, cultural and/or socio-economic implications of
(ignoring) fairness and sustainability of LRTs;
* developing and implementing forward-looking policies to promote
fairness and long-term sustainability of LRTs to achieve DLE;
* education and training needs and experiences in relation to
promoting fairness and sustainability of LRTs and ways to raise broad
awareness of DLE and related topics, e.g. among the general public,
policy- and decision-makers.
Given this wide-ranging and inclusive remit, the workshop intends to
bring together developers, creators, vendors, distributors, brokers,
users, evaluators and researchers of written, oral/spoken, signed and/or
multimodal LRTs in any (combination of) languages.
3 BACKGROUND AND FIRST TDLE WORKSHOP HELD IN 2022
The second 2024 edition of the workshop builds on the success of the
first _Towards Digital Language Equality (TDLE) workshop_,[1] that was
held at LREC 2022 in Marseille (France) on 20 June 2022, and whose
accepted papers were published in a dedicated volume of proceedings,
Aldabe et al. (2022).[2]
Following this well-received inaugural workshop held in June 2022, the
second event in the series will be co-located with LREC-COLING 2024 in
Turin (Italy) on Saturday 25th May 2024, and will focus specifically on
the highly relevant topic of the sustainability of LRTs in connection
with multilingualism and DLE.
4 SUBMISSIONS
Up-to-date information on the workshop, including materials for authors,
guidelines, templates, stylesheet and key dates can be found at the
dedicated website https://european-language-equality.eu/tdle-2024/. To
contact the organizing committee of the workshop directly, you can email
tdle2024.hitz(a)ehu.eus.
Papers submitted to the workshop should be completely anonymous for
double-blind peer review, written in English, and prepared using the
official LREC-COLING 2024 author's kit and submission
stylesheet/template available at
https://lrec-coling-2024.org/authors-kit/. The submissions to the
workshop should not exceed 8 pages, excluding references, and be saved
in unprotected PDF format. Papers should be submitted no later than 23
February 2024 through the START submission management system available
at https://softconf.com/lrec-coling2024/tdle2024/.
The workshop seeks original papers, i.e. it does not accept submissions
that have been, or will be, published elsewhere. The workshop allows
simultaneous submissions, and in these cases the authors should clearly
indicate in the manuscript to which other conference, workshop or venue
they have submitted the paper for review. Each paper submitted to the
workshop will receive three double-blind peer reviews. Papers accepted
for presentation will be included in the proceedings of the workshop.
In light of the LREC-COLING 2024 Map and the "Share your LRs!"
initiative, when submitting their papers through the START system
authors will be asked to provide essential information about resources
(in a broad sense, i.e. also technologies, standards, evaluation kits,
etc.) that have been used for the work described in the paper or are a
new result of their research. Moreover, ELRA encourages all LREC-COLING
authors to share the described LRs (data, tools, services, etc.) to
enable their reuse and replicability of experiments (including
evaluation ones).
5 KEY DATES
Paper submission deadline: 23 February 2024
Notification of acceptance: 19 March 2024
Camera-ready papers due: 8 April 2024
Half-day workshop date: Saturday, 25th May 2024
6 WORKSHOP ORGANIZERS
* Itziar Aldabe (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
* Begoña Altuna (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
* Aritz Farwell (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
* Federico Gaspari (University of Naples "Federico II", Italy & ADAPT
Centre, Dublin City University, Ireland - co-chair)
* Joss Moorkens (School of Applied Language & Intercultural
Studies/ADAPT Centre, Dublin City University, Ireland - co-chair)
* Stelios Piperidis (Institute of Language and Speech Processing,
Athena Research and Innovation Center in Information, Communication and
Knowledge Technologies, Greece)
* Georg Rehm (Speech and Language Technology Lab, Deutsches
Forschungszentrum für Künstliche Intelligenz, Germany)
* German Rigau (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
7 PROGRAM COMMITTEE
* Antonios Anastasopoulos (GMU, USA)
* Anya Belz (ADAPT, DCU, Ireland)
* Steven Bird (CDU, Australia)
* Fred Blain (Uni. Tilburg, Netherlands)
* Franco Cutugno (Uni. Naples "Federico II", Italy)
* Bessie Dendrinos (NKUA, Greece & ECSPM, Denmark)
* Félix do Carmo (Uni. Surrey, UK)
* Annika Grützner-Zahn (DFKI, Germany)
* Ana Guerberof-Arenas (Uni. Groningen, Netherlands)
* Davyth Hicks (ELEN, Belgium)
* Monja Jannet (ADAPT, DCU, Ireland)
* John Judge (ADAPT, DCU, Ireland)
* Dorothy Kenny (SALIS/CTTS/ADAPT, DCU, Ireland)
* Sabine Kirchmeier (EFNIL, Luxembourg)
* Teresa Lynn (MBZUAI, United Arab Emirates)
* Maite Melero (BSC, Spain)
* Helena Moniz (Uni. Lisbon, Portugal & EAMT)
* Johanna Monti (UniOR, Italy)
* Rachele Raus (UniBO, Italy)
* Wessel Reijers (Uni. Paderborn, Germany)
* Celia Rico Pérez (Universidad Complutense de Madrid, Spain)
* Dimitar Shterionov (TU, Netherlands)
* Carlos S. C. Teixeira (IOTA Localisation Services & Uni. Rovira i
Virgili, Spain)
* Antonio Toral (Uni. Groningen, Netherlands)
* Vincent Vandeghinste (Instituut voor de Nederlandse Taal,
Netherlands & KU Leuven, Belgium)
REFERENCES
Itziar Aldabe, Begoña Altuna, Aritz Farwell and German Rigau, editors.
2022. _Proceedings of the Workshop Towards Digital Language Equality
(TDLE)_ [1]. European Language Resources Association, Marseille, France.
Sheila Castilho, Federico Gaspari, Joss Moorkens, Maja Popović and
Antonio Toral, editors. Forthcoming. _Journal of Specialised
Translation_ [2]. Special Issue n. 41 on "Translation Automation and
Sustainability".
Karën Fort and Alain Couillault, 2016. "Yes, We Care! Results of the
Ethics and Natural Language Processing Surveys [3]". _Proceedings of the
Tenth International Conference on Language Resources and Evaluation
(LREC'16)_ [4]. European Language Resources Association, Portorož,
Slovenia. 1593-1600.
Marius Hessenthaler, Emma Strubell, Dirk Hovy and Anne Lauscher, 2022.
"Bridging Fairness and Environmental Sustainability in Natural Language
Processing [5]". _Proceedings of the 2022 Conference on Empirical
Methods in Natural Language Processing_ [6], Abu Dhabi, United Arab
Emirates. 7817-7836.
András Kornai, 2013. "Digital Language Death [7]". _PLoS ONE_,
8(10):e77056.
Krithika Ramesh, Sunayana Sitaram and Monojit Choudhury, 2023. "Fairness
in Language Models Beyond English: Gaps and Challenges [8]". _Findings
of the Association for Computational Linguistics: EACL 2023_ [9].
Association for Computational Linguistics, Dubrovnik, Croatia.
2106-2119.
Georg Rehm and Andy Way, editors. 2023. _European Language Equality: A
Strategic Agenda for Digital Language Equality_ [10]. Berlin: Springer.
[1] https://european-language-equality.eu/tdle-2022/
[2] www.lrec-conf.org/proceedings/lrec2022/workshops/TDLE/2022.tdle-1.0.pdf [11]
Links:
------
[1] https://aclanthology.org/2022.tdle-1.pdf
[2] https://www.jostrans.org/
[3] https://aclanthology.org/L16-1252.pdf
[4] https://aclanthology.org/volumes/L16-1/
[5] https://aclanthology.org/2022.emnlp-main.533.pdf
[6] https://aclanthology.org/volumes/2022.emnlp-main/
[7] https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0077056
[8] https://aclanthology.org/2023.findings-eacl.157.pdf
[9] https://aclanthology.org/2023.findings-eacl.pdf
[10] https://link.springer.com/book/10.1007/978-3-031-28819-7
[11] http://www.lrec-conf.org/proceedings/lrec2022/workshops/TDLE/2022.tdle-1.0.…
Job offer: Researcher for Multimodal Fake-News and Disinformation Detection at DFKI Berlin
The German Research Center for Artificial Intelligence (DFKI) has operated as a non-profit Public-Private Partnership (PPP) since 1988. DFKI combines scientific excellence and commercially oriented value creation with social awareness and is recognized as a major "Center of Excellence" by the international scientific community. As Germany’s biggest public and independent organisation dedicated to AI research and development, DFKI has focused on the goal of human-centric AI for more than 30 years. Its research is committed to essential, future-oriented areas of application and socially relevant topics.
We are looking for a highly motivated research assistant to join our existing team and work on a project focused on fake-news and disinformation detection from speech and multimedia data. Content authenticity verification of speech combined with other modalities such as text, visuals or metadata will be a central part of the work. Explainable AI (XAI) and bias analysis are also highly relevant to the position.
The successful candidate will work closely with high-impact partners in this field, e.g. Technical University of Berlin, RBB (Berlin TV and news broadcaster), Deutsche Welle (Germany's broadcaster abroad), and 5 other partners.
Responsibilities will include developing and testing different AI/NLP models and techniques, analyzing the performance of machine learning models in the context of applied fake-news and disinformation countermeasures for journalists, and communicating project progress and results to relevant stakeholders. The position offers opportunities for pursuing a doctorate and publishing research results in scientific journals and conferences.
Qualified candidates will have a completed university degree in (technical) computer science or computational linguistics, excellent programming skills in Python, and a strong background in machine learning/AI and signal processing or NLP. Previous experience in the field of fake-news or spoofing / authenticity detection of multimedia data is an advantage.
DFKI offers an agile and lively international and interdisciplinary environment for working in a self-determined manner. If you are interested in contributing to cutting-edge research and working with a dynamic team, please apply!
More details and link: https://jobs.dfki.de/en/vacancy/researcher-m-f-d-547585.html
Application deadline: Jan 23, 2024.
If you have any questions, please don’t hesitate to contact tim.polzehl(a)dfki.de
--
Dr.-Ing. Tim Polzehl
Senior Researcher
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI)
German Research Center for Artificial Intelligence
Speech & Language Technology
Associate Senior Researcher
Technische Universität Berlin
Quality and Usability Lab
DFKI Labor Berlin
Alt-Moabit 91c, D-10559 Berlin, Germany
Tel.: +49.30.238951863
Fax: +49 30 23895 1810
E-Mail: tim.polzehl(a)dfki.de
Apologies for cross-posting
------------------------------------------------------
Dear colleagues,
We invite you to submit to the special session on “Emergent Phenomena in
Deep Representations and Large Language Models” as a part of IJCNN 2024
and IEEE WCCI 2024, which will be located in Yokohama, Japan.
We are looking forward to your contributions.
Please find the CfP below.
Best wishes,
On behalf of the Organising Committee
Özge Alacam
------------------------------------------------------
First Call for Papers: Special Session on Emergent Phenomena in Deep
Representations and Large Language Models @IJCNN 2024 & IEEE WCCI 2024:
Deep learning models trained on large datasets have shown spectacular
performance in a wide range of tasks, as demonstrated by current
applications of Large Language Models. However, recent works have shown
that the abilities large machine learning models acquire often emerge
unpredictably with increasing model complexity or training dataset size.
These emergent phenomena include the unexpected appearance of abilities
for which the model was not explicitly trained, but they might also be
related to unexpected performance boosts due to the increased model
complexity. Emergent phenomena are not always beneficial: larger models
may pick up new biases from the training data or start hallucinating.
To move towards increasingly sustainable, reliable, and explainable
applications of AI systems, it is necessary to increase the
understanding of the mechanisms surrounding emergent phenomena.
Moreover, this effort provides increased insight into the learning
process behind the acquisition of abilities of large models to perform
specific tasks. Important research questions relate to the definition of
emergent phenomena, their causes (what controls which abilities are
acquired and when?), training efficiency, and training data quality
(e.g., acquiring desired abilities with less computational effort),
prompting strategies to get or test for desired model behaviour (e.g.,
chain of thought), and further verification methods of model abilities
and properties.
The primary goal of this special session is (i) to discuss the emergent
abilities and risks in deep neural networks and representations from
very different angles and (ii) to facilitate networking and encourage
collaboration between various research fields that approach this issue
from different perspectives, like computational linguistics, ethics in
AI, computer science, physics, etc.
Topics of interest include, but are not limited to:
• The definition of emergence in the context of NLP and ML
• Prompting strategies
• Physics-based/inspired analyses (e.g. phase transitions in ML
models)
• Explainability and interpretability (XAI)
• Evaluation measures for model ability, monitoring strategies,
assessment of model abilities (e.g. technical or psychology-based)
• Knowledge distillation, model pruning, energy-efficient models.
• Mitigation strategies for emergent risks and model deterioration.
• Fine-tuning and Retrieval-augmented generation (RAG)
• Papers focusing on specific emergent phenomena (reasoning,
creativity, double descent phenomena etc.)
The website for the call for papers is accessible at
https://sites.google.com/view/emergenn/call-for-papers
Organising Committee:
------------------------------
• Dr. Özge Alacam (Ludwig-Maximilian University & Uni Bielefeld,
Germany)
• Dr. Michiel Straat (Uni Bielefeld, Germany)
• Prof. Dr. Hinrich Schütze (Ludwig-Maximilian University, Germany)
• Prof. Dr. Alessandro Sperduti (University of Padova, Italy)
Important Dates:
------------------------------
• January 15, 2024 - Paper Submission Deadline
• March 15, 2024 - Notification of Acceptance
• May 1, 2024 - Camera-ready Deadline & Early Registration Deadline
• June 30 - July 5, 2024 - Main Conference (IEEE WCCI 2024, Yokohama, Japan)
* All deadlines are 11:59 PM UTC-12:00 ("anywhere on Earth")
Submission Format and Platform:
------------------------------
• Submissions will be through the IEEE WCCI 2024 Submission page
<https://edas.info/login.php?rurl=aHR0cHM6Ly9lZGFzLmluZm8vTjMxNjE0P2M9MzE2MT…>.
• Each paper is limited to 8 pages, including figures, tables,
and references. Please refer to the author guidelines provided by IEEE
WCCI 2024.
• Please specify during the submission that your paper is
intended for the Special Session: Emergent Phenomena in Deep
Representations and Large Language Models.
• Special session webpage:
https://sites.google.com/view/emergenn/call-for-papers
• IEEE WCCI 2024 webpage: https://2024.ieeewcci.org/
Contact information:
------------------------------
• Özge Alacam : oezge.alacam(a)uni-bielefeld.de
• Michiel Straat : mstraat(a)techfak.uni-bielefeld.de
CODI, 5th Workshop on Computational Approaches to Discourse
2024-03-21 or 22 - EACL 2024 - Malta
** Direct Submission deadline: January 17th, 2024 **
Direct submission: We now open submissions for papers rejected at another main conference.
Website link: https://sites.google.com/view/codi2024
CODI considers for publication papers rejected at one of the main conferences. Authors will have to submit both the paper and the reviews, the latter as a supplementary PDF file. If modifications have been made since the original submission, please submit an additional file briefly describing them. The organizers will decide on the acceptance of the papers based on the quality of the paper and its fit with the workshop.
As a reminder, CODI also invites presentations of papers accepted at another main conference. They will be included in the workshop program and handbook, but will not appear in the workshop proceedings.
Please submit your workshop papers (category: "direct submission") at https://softconf.com/eacl2024/CODI-2024/
DSTL (Defence Science and Technology Laboratory, part of the UK Civil Service) is advertising for a computational/corpus linguist to work in their 'Behavioural and Social Science Group'. They are looking for a linguist who understands the potential for (and limitations of) computational approaches to discourse and who would be comfortable interacting with computer / data scientists.
Details can be found at: https://www.civilservicejobs.service.gov.uk/csr/index.cgi?SID=b3duZXJ0eXBlP…
Regards
Paul Thompson
= = = = = = = = = = = = =
Dr Paul Thompson
Reader in Applied Corpus Linguistics
Co-Director, Centre for Corpus Research
Head of Department, English Language and Linguistics
University of Birmingham
Birmingham B15 2TT, UK
Editor-in-Chief, Applied Corpus Linguistics journal
= = = = = = = = = = = = =
Apologies for cross-posting!
GeoLD2024: 6th International Workshop on Geospatial Linked Data
Hersonissos, Greece, May 26-27, 2024
Conference website https://i3mainz.github.io/GeoLD2024/
Submission link https://easychair.org/conferences/?conf=geold2024
Submission deadline March 10, 2024
GeoLD2024
*6th International Workshop on Geospatial Linked Data* at ESWC 2024
<https://2024.eswc-conferences.org/>
Geospatial data is vital for both traditional applications like navigation,
logistics, and tourism and emerging areas like autonomous vehicles, smart
buildings and GIS on demand. Spatial linked data has recently transitioned
from experimental prototypes to national infrastructure. However, the next
generation of spatial knowledge graphs will integrate multiple spatial
datasets with the large number of general datasets that contain some
geospatial references (e.g., DBpedia, Wikidata). This integration, either
on the public Web or within organizations, has immense socio-economic as
well as academic benefits. The upsurge in Linked Data-related presentations
in the recent Eurogeographics data quality workshop shows the deep interest
in Geospatial Linked Data (GLD) in national mapping agencies. GLD enables a
web-based, interoperable geospatial infrastructure. This is especially
relevant for delivering the INSPIRE directive in Europe. Moreover,
geospatial information systems benefit from Linked Data principles in
building the next generation of spatial data applications, e.g., federated
smart buildings, self-piloted vehicles, delivery drones or automated local
authority services.
This workshop invites papers covering the challenges and solutions for
handling GLD, especially for building high-quality, adaptable,
geospatial infrastructures and next-generation spatial applications. We aim
to demonstrate the latest approaches and implementations and to discuss the
solutions to challenges and issues arising from research and industrial
organizations.
The following topics of interest are covered by GeoLD2024.
*Interoperability and Integration*
- Geospatial Linked Data vocabularies and standards (GeoSPARQL, INSPIRE,
W3C, OGC)
- Extraction/transformation of Geospatial Linked Data from native
geospatial data sources
- Integration (schema mapping, interlinking, fusion) techniques for
Geospatial RDF Data
- Enrichment, quality and evolution of Linked Data with Geospatial
information
- Machine Learning improving Geospatial Linked Data processing
- Natural Language Processing, especially Large Language Models for
improving GLD processing
*Big Geospatial Data Management*
- Distributed solutions for Geospatial Linked Data management (storing,
querying, mapping)
- Algorithms and tools for large-scale, scalable Geospatial Linked Data
management
- Efficient Indexing and Querying of Geospatial Linked Data
- Geospatial-specific Reasoning on RDF Data
- Ranking techniques on querying Geospatial RDF Data
- Advanced querying capabilities on Geospatial RDF Data
*Utilization of Geospatial Linked Data*
- Benchmarking of Geospatial Linked Data applications
- Geospatial Linked Data in social web platforms and applications
- Geospatial linked data applications for indoor navigation
- Visualization models/interfaces for browsing/authoring/querying
Geospatial Linked Data
- Real-world applications/use cases/paradigms using Geospatial Linked
Data
- Evaluation/comparison of tools/libraries/frameworks for Geospatial
Linked Data
- Data governance models for Geospatial Linked Data
Submission Guidelines
All papers must be original and not simultaneously submitted to another
journal or conference. The following paper categories are welcome:
- *Long papers (up to 12 pages)*: Presenting novel scientific research
pertaining to geospatial Linked Data.
- *Short papers (up to 6 pages)*: Position papers, System, Library, API
and Dataset descriptions, relevant to the topics of interest.
- *Demo/Tutorial papers (up to 4 pages)*: Describe a demo or hands-on
tutorial of a tool related to the workshop topics.
Organizing committee
- Timo Homburg (i3mainz -- Institute for Spatial Information Surveying
Technology, Mainz University of Applied Sciences, Germany)
- Dr. Beyza Yaman (ADAPT Centre, Trinity College Dublin, Ireland)
- Dr. Mohamed Ahmed Sherif (University of Paderborn, Germany)
- Prof. Dr. Axel-Cyrille Ngonga Ngomo (University of Paderborn, Germany)
Contact
All questions about submissions should be emailed to
Timo.Homburg(a)hs-mainz.de
STAND Workshop on Standardizing Tasks, meAsures and NLP Datasets
https://stand4nlp.github.io/
Full-day workshop in Paris, France, January 29th 2024 (+ partial hybrid)
Abstract submission deadline: January 24th 2024, but earlier submissions
are welcome
Scientific context:
The current lack of standardized practices and definitions in NLP systems
hinders the progress of the field. Indeed, there is not always consensus on
which evaluation methods are meaningful and fruitful, or which of their
implementations are to be used with which parameters (e.g., SacreBLEU, Post
2018).
In some cases, there is no general agreement on the very definition of a
task.
This situation calls for work on *standardizing* NLP practices.
The International Organization for Standardization (ISO) has just created *a
dedicated working group on NLP* (as a joint effort of the AI and Language
committees), and *2 standards* are already under way. Topics under
consideration by the ISO standardization committees include NLP
terminology, evaluation metrics, interoperability, annotation guidelines,
good practices in NLP development/evaluation/corpora, documentation.
These topics are already heavily discussed in academia, and a number of
informal guidelines have already been proposed. We believe that the
creation of NLP standards can significantly benefit from the input of both NLP
academics and industry NLP practitioners.
Reciprocally, NLP researchers would benefit from getting involved in the
standardization effort, thus ensuring that academia's views are listened
to, in particular in the context of the *AI Act* (the European regulation
on AI that was finalized in December), whose enforcement will strongly
rely on those standards.
The STAND workshop is a research initiative whose goals are:
- to foster discussion on existing standards, their creation and use
- to assess the current needs of the community for standardization
- to share experience of how a lack of good practices affects research
activities
- to collect existing good practices (and propose new ones)
We invite contributions from NLP practitioners from both industry and
academia, as well as standardization experts.
We invite two types of submission:
* short abstract: 1 page
* long abstract: 3 pages
Accepted submissions will be presented as posters. Authors accepted in the
long-abstract track will be invited to submit a full paper (5-10 pages)
after the workshop.
Topics for submissions include, but are not limited to:
- Comparability and reproducibility of evaluation setup
- Annotation guidelines
- Evaluation metrics
- Good practices for building, annotating and maintaining corpora
- Good practices for system evaluation
- Interoperability
- Ethical guidelines
- Guidelines for documenting corpora and models
Submission instructions:
- Submissions are expected in PDF form by email to stand4nlp(a)inria.fr
- All submissions should be formatted using the ACL 2023 style files
https://2023.aclweb.org/calls/style_and_formatting/.
============
PROGRAM AT A GLANCE:
[09:00-10:00] Welcome, introduction to standardization, ongoing activities
in NLP standardization, and the AI Act context
[10:15-11:50] Academic keynote (*Joakim Nivre*) and invited talks (*Matt
Post*, other speaker TBC)
[11:50-13:30] Poster session (with boosters) & lunch
[13:30-14:40] Industry keynote (speaker TBC) and invited talk (*Dirk Hovy*)
[15:00-16:30] Moderator-led breakout discussions. Potential topics that
will be discussed include:
- [sharing / drafting] Standardizing good practices for evaluation
- [sharing / drafting] Standardizing good practices for corpus
management (collection, annotation, versioning)
- [sharing / drafting] Standardizing evaluation metrics (definitions,
implementation, sharing scripts)
- [sharing / drafting] Standardizing annotation schemes (formats and
guidelines)
- [debate] Explainability and ethics in NLP: what standards are needed?
- [debate] Comparing standardization needs with limitations of the
state-of-the-art: how to bridge the gap?
- [debate] Towards standardizing translations of technical terminology
in NLP: how to organize i18n?
[16:30-17:30] Reports from breakouts, definition of community-level actions
& wrap-up. Example outcomes that are envisioned include:
- Collection and drafting of existing good practices
- Preparation of a joint submission for a position paper
- Creation of common repositories for evaluation scripts, corpus
documentation
Participants in the workshop will be offered the opportunity to attend a
standardization committee's meeting, which has been scheduled for the day
after the workshop (January 30th). The outputs of that meeting will be used
in direct support of the AI Act.
Remote access will be offered for part of the workshop only. In-person
participation is recommended if possible.
Posters will be in-person only.
IMPORTANT DATES:
Abstract submission: Anytime by January 24
Notification of acceptance: Within a few days of submission
Workshop: January 29
Standardization committee meeting: January 30
ORGANISING COMMITTEE:
Lauriane Aufrant, Timothée Bernard, Maximin Coavoux, Yoann Dupont, Arnaud
Ferré, Taras Holoyad, Rania Wazir
MORE INFORMATION
For the latest information see the workshop page at
https://stand4nlp.github.io/; for any questions contact stand4nlp(a)inria.fr.