Call for Papers
[https://lrec-coling-2024.org/2nd-call-for-papers/]
Two key international players in the area of computational linguistics, the
ELRA Language Resources Association (ELRA) and the International Committee
on Computational Linguistics (ICCL), are joining forces to organize the
2024 Joint International Conference on Computational Linguistics, Language
Resources and Evaluation (LREC-COLING 2024) to be held in Torino, Italy on
20-25 May, 2024.
IMPORTANT DATES
(All deadlines are 11:59 PM UTC-12:00, “anywhere on Earth”)
- 22 September 2023: Paper anonymity period starts
- 13 October 2023: Final submissions due (long, short and position
papers)
- 13 October 2023: Workshop/Tutorial proposal submissions due
- 22–29 January 2024: Author rebuttal period
- 5 February 2024: Final reviewing
- 19 February 2024: Notification of acceptance
- 25 March 2024: Camera-ready due
- 20-25 May 2024: LREC-COLING 2024 conference
SUBMISSION TOPICS
LREC-COLING 2024 invites the submission of long and short papers featuring
substantial, original, and unpublished research in all aspects of natural
language and computation, language resources (LRs) and evaluation,
including spoken and sign language and multimodal interaction. Submissions
are invited in five broad categories: (i) theories, algorithms, and models,
(ii) NLP applications, (iii) language resources, (iv) NLP evaluation and
(v) topics of general interest. Submissions that span multiple categories
are particularly welcome.
(I) Theories, algorithms, and models
- Discourse and Pragmatics
- Explainability and Interpretability of Large Language Models
- Language Modeling
- CL/NLP and Linguistic Theories
- CL/NLP for Cognitive Modeling and Psycholinguistics
- Machine Learning for CL/NLP
- Morphology and Word Segmentation
- Semantics
- Tagging, Chunking, Syntax and Parsing
- Textual Inference
(II) NLP applications
- Applications (including BioNLP and eHealth, NLP for legal purposes,
NLP for Social Media and Journalism, etc.)
- Dialogue and Interactive Systems
- Document Classification, Topic Modeling, Information Retrieval and
Cross-Lingual Retrieval
- Information Extraction, Text Mining, and Knowledge Graph Derivation
from Texts
- Machine Translation for Spoken/Written/Sign Languages, and Translation
Aids
- Sentiment Analysis, Opinion and Argument Mining
- Speech Recognition/Synthesis and Spoken Language Understanding
- Natural Language Generation, Summarization and Simplification
- Question Answering
- Offensive Speech Detection and Analysis
- Vision, Robotics, Multimodal and Grounded Language Acquisition
(III) Language resource design, creation, and use: text, speech, sign,
gesture, image, in single or multimodal/multimedia data
- Guidelines, standards, best practices and models for LRs,
interoperability
- Methodologies and tools for LRs construction, annotation, and
acquisition
- Ontologies, terminology and knowledge representation
- LRs and Semantic Web (including Linked Data, Knowledge Graphs, etc.)
- LRs and Crowdsourcing
- Metadata for LRs and semantic/content mark-up
- LRs in systems and applications such as information extraction,
information retrieval, audio-visual and multimedia search, speech
dictation, meeting transcription, Computer-Aided Language Learning,
training and education, mobile communication, machine translation, speech
translation, summarisation, semantic search, text mining, inferencing,
reasoning, sentiment analysis/opinion mining, (speech-based) dialogue
systems, natural language and multimodal/multisensory interactions,
chatbots, voice-activated services, etc.
- Use of (multilingual) LRs in various fields of application like
e-government, e-participation, e-culture, e-health, mobile applications,
digital humanities, social sciences, etc.
- LRs in the age of deep neural networks
- Open, linked and shared data and tools, open and collaborative
architectures
- Bias in language resources
- User needs, LT for accessibility
(IV) NLP evaluation methodologies
- NLP evaluation methodologies, protocols and measures
- Benchmarking of systems and products
- Evaluation metrics in Machine Learning
- Usability evaluation of HLT-based user interfaces and dialogue systems
- User satisfaction evaluation
(V) Topics of general interest
- Multilingual issues, language coverage and diversity, less-resourced
languages
- Replicability and reproducibility issues
- Organisational, economic, ethical and legal issues
- Priorities, perspectives, strategies in national and international
policies
- International and national activities, projects and initiatives
PAPER THEME TRACKS
These topics are organized into 26 main tracks:
- LC01 Applications Involving LRs and Evaluation (including
Applications in Specific Domains)
- LC02 CL and Linguistic Theories, Cognitive Modeling and
Psycholinguistics
- LC03 Corpora and Annotation (including Tools, Systems, Treebanks)
- LC04 Dialogue, Conversational Systems, Chatbots, Human-Robot
Interaction
- LC05 Digital Humanities and Cultural Heritage
- LC06 Discourse and Pragmatics
- LC07 Document Classification, Information Retrieval and
Cross-lingual Retrieval
- LC08 Evaluation and Validation Methodologies
- LC09 Inference, Reasoning, Question Answering
- LC10 Information Extraction, Knowledge Extraction, and Text Mining
- LC11 Integrated Systems and Applications
- LC12 Knowledge Discovery/Representation (including Knowledge Graphs,
Linked Data, Terminology, Ontologies)
- LC13 Language Modeling
- LC14 Less-Resourced/Endangered/Less-studied Languages
- LC15 Lexicon and Semantics
- LC16 Machine Learning Models and Techniques for CL/NLP
- LC17 Multilinguality, Machine Translation, and Translation Aids
(including Speech-to-Speech Translation)
- LC18 Multimodality, Cross-modality (including Sign Languages, Vision
and Other Modalities), Multimodal Applications, Grounded Language
Acquisition, and HRI
- LC19 Natural Language Generation, Summarization and Simplification
- LC20 Offensive and Non-inclusive Language Detection and Analysis
- LC21 Opinion & Argument Mining, Sentiment Analysis, Emotion
Recognition/Generation
- LC22 Parsing, Tagging, Chunking, Grammar, Syntax, Morphosyntax,
Morphology
- LC23 Policy issues, Ethics, Legal Issues, Bias Analysis (including
Language Resource Infrastructures, Standards for LRs, Metadata)
- LC24 Social Media Processing
- LC25 Speech Resources and Processing (including Phonetic Databases,
Phonology, Prosody, Speech Recognition, Synthesis and Spoken Language
Understanding)
- LC26 Trustworthiness, Interpretability, and Explainability of Neural
Models
PAPER TYPES AND FORMATS
LREC-COLING 2024 invites high-quality submissions written in English.
Submissions of three forms of papers will be considered:
1. Regular long papers – up to eight (8) pages maximum*, presenting
substantial, original, completed, and unpublished work.
2. Short papers – up to four (4) pages*, describing a small focused
contribution, negative results, system demonstrations, etc.
3. Position papers – up to eight (8) pages*, discussing key hot topics,
challenges and open issues, as well as cross-fertilization between
computational linguistics and other disciplines.
* Excluding any number of additional pages for references, ethics
considerations, conflict-of-interest statements, and data and code
availability statements.
Upon acceptance, final versions of long papers will be given one additional
page – up to nine (9) pages of content plus unlimited pages for
acknowledgments and references – so that reviewers’ comments can be taken
into account. Final versions of short papers may have up to five (5) pages,
plus unlimited pages for acknowledgments and references. For both long and
short papers, all figures and tables that are part of the main text must
fit within these page limits.
Furthermore, appendices or supplementary material will be allowed ONLY in
the final, camera-ready version, not during submission, as papers should be
reviewed without the need to refer to any supplementary material.
Linguistic examples, if any, should be presented in the original language
but also glossed into English to allow accessibility for a broader
audience.
Note that the paper type is decided independently of the eventual, final
form of presentation (i.e., oral versus poster).
PAPER SUBMISSIONS AND TEMPLATES
Submission is electronic, using the Softconf START conference management
system via the link:
https://softconf.com/lrec-coling2024/papers/
Both long and short papers must follow the LREC-COLING 2024 two-column
format, using the supplied official style files. The templates can be
downloaded from the Style Files and Formatting page provided on the
website. Please do not modify these style files, nor should you use
templates designed for other conferences. Submissions that do not conform
to the required styles, including paper size, margin width, and font size
restrictions, will be rejected without review.
AUTHOR RESPONSIBILITIES
Papers must describe original, previously unpublished work. Papers must be
anonymized to support double-blind reviewing. Submissions thus must not
include authors’ names and affiliations. The submissions should also avoid
links to non-anonymized repositories: the code should be either submitted
as supplementary material in the final version of the paper, or as a link
to an anonymized repository (e.g., Anonymous GitHub
<https://anonymous.4open.science/> or Anonym Share <https://anonymfile.com/>).
Papers that do not conform to these requirements will be rejected without
review.
If the paper is available as a preprint, this must be indicated on the
submission form but not in the paper itself. In addition, LREC-COLING 2024
will follow the same policy as ACL conferences establishing an anonymity
period during which non-anonymous posting of preprints is not allowed.
More specifically, direct submissions to LREC-COLING 2024 may not be made
available online (e.g. via a preprint server) in a non-anonymized form
after September 22, 11:59PM UTC-12:00 (for arXiv, note that this refers to
submission time).
Also included in that policy are instructions to reviewers to not rate
papers down for not citing recent preprints. Authors are asked to cite
published versions of papers instead of preprint versions when possible.
Papers that have been or will be under consideration for other venues at
the same time must be declared at submission time. If a paper is accepted
for publication at LREC-COLING 2024, it must be immediately withdrawn from
other venues. If a paper under review at LREC-COLING 2024 is accepted
elsewhere and authors intend to proceed there, the LREC-COLING 2024
committee must be notified immediately.
ETHICS STATEMENT
We encourage all authors submitting to LREC-COLING 2024 to include an
explicit ethics statement on the broader impact of their work, or other
ethical considerations after the conclusion but before the references. The
ethics statement will not count toward the page limit (8 pages for long, 4
pages for short papers).
PRESENTATION REQUIREMENT
All papers accepted to the main conference track must be presented at the
conference to appear in the proceedings, and at least one author must
register for LREC-COLING 2024. Papers will be presented either orally or as
posters. The specific presentation modality of a paper will be decided
based on its content, with no difference in quality implied. Papers that
include a demonstration component will be presented as posters.
Authors of all papers accepted to the main conference will be required to submit a
presentation video. The conference will be hybrid, with an emphasis on
encouraging interaction between the online and in-person modalities, and
thus presentations can be either on-site or virtual.
--
Enrico Santus, PhD
*Head of Human Computation*
*CTO Office at Bloomberg LP*
*Website*: www.esantus.com
*E-mail*: esantus(a)gmail.com
*All opinions expressed in my private e-mails are my own, and they do not
represent any group, institution or company to which I am associated.*
---
Apologies for multiple posting
---
The Speech Technology group (SpeechTek) at Fondazione Bruno Kessler
<https://www.fbk.eu/en/> (Trento, Italy) in conjunction with the ICT
International Doctorate School of the University of Trento
<https://iecs.unitn.it/> is pleased to announce the availability of the
following fully-funded PhD position:
*TITLE*: Efficient E2E models for automatic speech recognition in
multi-speaker scenarios
*DESCRIPTION*: In spite of recent progress in speech technologies, processing
and understanding conversational spontaneous speech is still an open issue,
in particular in the presence of challenging acoustic conditions such as
those posed by dinner-party scenarios. Although enormous progress has been
made recently in a variety of speech processing tasks (such as speech
enhancement, speech separation, speech recognition and spoken language
understanding), including multi-speaker speech recognition, a unified,
established solution is still far from available. Moreover, the computational
complexity of current approaches is extremely high, making actual deployment
on low-end or IoT devices infeasible in practice.
The candidate will advance the current state-of-the-art in speech
processing (in particular for separation, enhancement and recognition)
towards developing a unified solution, possibly based on self-supervised or
unsupervised approaches, for automatic speech recognition in dinner party
scenarios, such as those considered in the CHiME challenges (
https://arxiv.org/abs/2306.13734).
*CONTACTS*: brutti(a)fbk.eu <guerini(a)fbk.eu>
*COMPLETE DETAILS AVAILABLE AT*:
https://iecs.unitn.it/education/admission/reserved-topic-scholarships#A9
------------------------------------------------------------------------------------------------
Giuseppe Daniele Falavigna
Fondazione Bruno Kessler
Via Sommarive 18 - 38123 Povo - Trento, Italy
mail:falavi@fbk.eu - tel:+39(0)461314562 - fax:+39(0)461314591
HomePage: https://speechtek.fbk.eu/people/profile/falavi
-------------------------------------------------------------------------------------------------
Dear colleagues,
you are invited to participate in the Eval4NLP 2023 shared task on **Prompting Large Language Models as Explainable Metrics**.
Please find more information below and on the shared task webpage: https://eval4nlp.github.io/2023/shared-task.html
Important Dates
- Shared task announcement: August 02, 2023
- Dev phase: August 07, 2023
- Test phase: September 18, 2023
- System Submission Deadline: September 23, 2023
- System paper submission deadline: October 5, 2023
- System paper camera ready submission deadline: October 12, 2023
All deadlines are 11.59 pm UTC -12h (“Anywhere on Earth”). The timeframe of the test phase may change. Please regularly check the shared task webpage: https://eval4nlp.github.io/2023/shared-task.html.
** Overview **
With groundbreaking innovations in unsupervised learning and scalable architectures, the opportunities (but also risks) of automatically generating audio, images, video and text seem overwhelming. Human evaluations of this content are costly and often infeasible to collect. Thus, the need for automatic metrics that reliably judge the quality of generation systems and their outputs is stronger than ever. Current state-of-the-art metrics for natural language generation (NLG) still do not match the performance of human experts. They are mostly based on black-box language models and usually return a single (sentence-level) quality score, making it difficult to explain their internal decision process and their outputs.
The release of APIs to large language models (LLMs) like ChatGPT, and the recent open-source availability of LLMs like LLaMA, has led to a surge of research in NLP, including LLM-based metrics. Metrics like GEMBA [*] explore prompting ChatGPT and GPT-4 to leverage them directly as metrics. InstructScore [*] goes in a different direction and fine-tunes a LLaMA model to predict a fine-grained error diagnosis of machine-translated content. We notice that current work (1) does not systematically evaluate the vast space of possible prompts and prompting techniques for metric usage, including, for example, approaches that explain a task to a model or let the model explain a task itself, and (2) rarely evaluates the performance of recent open-source LLMs, even though their usage is incredibly important for improving the reproducibility of metric research compared to closed-source metrics.
This year’s Eval4NLP shared task combines these two aspects. We provide a selection of open-source, pre-trained LLMs. The task is to develop strategies to extract scores from these LLMs that grade machine translations and summaries. We will specifically focus on prompting techniques; therefore, fine-tuning of the LLMs is not allowed.
Based on the submissions, we hope to explore and formalize prompting approaches for open-source LLM-based metrics and, with that, help to improve their correlation to human judgements. As many prompting techniques produce explanations as a side product, we hope that this task will also lead to more explainable metrics. Also, we want to evaluate which of the selected open-source models provide the best capabilities as metrics, and thus the best base for fine-tuning.
** Goals **
The shared task has the following goals:
Prompting strategies for LLM-based metrics: We want to explore which prompting strategies perform best for LLM-based metrics. E.g., few-shot prompting [*], where examples of other solutions are given in a prompt, chain-of-thought reasoning (CoT) [*], where the model is prompted to provide a multi-step explanation itself, or tree-of-thought prompting [*], where different explanation paths are considered, and the best is chosen. Also, automatic prompt generation might be considered [*]. Numerous other recent works explore further prompting strategies, some of which use multiple evaluation passes.
Score aggregation for LLM-based metrics: We also want to explore which strategies best aggregate the model scores from LLM-based metrics. E.g., scores might be extracted as the probability of a paraphrase being created [*], or they could be extracted from LLM output directly [*].
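To make the second aggregation route concrete, here is a minimal, hedged sketch (not an official baseline) of probability-based score extraction with a Hugging Face causal LM: the score is read off as the probability the model assigns to answering "Yes" to a quality question. The model name below is only a placeholder; participants must use one of the LLMs allowed by the task.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model for illustration; substitute one of the allowed task models.
MODEL_NAME = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def yes_probability(source: str, hypothesis: str) -> float:
    """Return P('Yes') / (P('Yes') + P('No')) for a quality question about the hypothesis."""
    prompt = (
        f"Source: {source}\n"
        f"Translation: {hypothesis}\n"
        "Is this translation adequate and fluent? Answer Yes or No:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]  # distribution over the next token
    probs = torch.softmax(next_token_logits, dim=-1)
    yes_id = tokenizer.encode(" Yes")[0]  # id of the first sub-token of " Yes"
    no_id = tokenizer.encode(" No")[0]
    return float(probs[yes_id] / (probs[yes_id] + probs[no_id]))

print(yes_probability("Der Hund schläft.", "The dog is sleeping."))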
Explainability for LLM-based metrics: We want to analyze whether the metrics that provide the best explanations (for example, with CoT) also achieve the highest correlation to human judgements. We assume that this is the case, since the human judgements are themselves based on fine-grained evaluations (e.g., MQM for machine translation).
** Task Description **
The task will consist of building a reference-free metric for machine translation and/or summarization that predicts sentence-level quality scores constructed from fine-grained scores or error labels. Reference-free means that the metric rates the provided machine translation solely based on the provided source sentence/paragraph, without any additional, human written references. Further, we note that many open-source LLMs have mostly been trained on English data, adding further challenges to the reference-free setup.
To summarize, the task will be structured as follows:
- We provide a list of allowed LLMs from Huggingface
- Participants should use prompting to use these LLMs as metrics for MT and summarization
- Fine-tuning of the selected model(s) is not allowed
- We will release baselines, which participants might build upon
- We will provide a CodaLab dashboard to compare participants' solutions to others
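For illustration only, the following minimal sketch shows the prompt-and-parse route under this reference-free setup: an open-source instruction-tuned LLM from Hugging Face is prompted for a 0-100 quality score, which is then parsed from the generated text. The model name and prompt wording are placeholders rather than part of the task specification; any submission would need to use one of the allowed models and a more carefully designed prompting strategy.

import re
from transformers import pipeline

# Placeholder model for illustration; pick one of the LLMs allowed by the shared task.
generator = pipeline("text-generation", model="tiiuae/falcon-7b-instruct")

def score_translation(source: str, hypothesis: str) -> float:
    """Prompt the LLM for a 0-100 quality score and parse it from the generated text."""
    prompt = (
        "Rate the quality of the following translation on a scale from 0 "
        "(useless) to 100 (perfect). Reply with a single number.\n"
        f"Source: {source}\n"
        f"Translation: {hypothesis}\n"
        "Score:"
    )
    output = generator(prompt, max_new_tokens=8, do_sample=False)[0]["generated_text"]
    match = re.search(r"\d+(\.\d+)?", output[len(prompt):])  # first number after the prompt
    return float(match.group()) if match else 0.0

print(score_translation("Der Hund schläft.", "The dog is sleeping."))

An analogous prompt can be used for summarization by replacing the source/translation pair with a document/summary pair.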
We plan to release a CodaLab submission environment, together with baselines and dev-set evaluation code, incrementally by August 7.
We will allow specific models from Huggingface, please refer to the webpage for more details: https://eval4nlp.github.io/2023/shared-task.html
Best wishes,
The Eval4NLP organizers
[*] References are listed on the shared task webpage: https://eval4nlp.github.io/2023/shared-task.html
Third Call for Papers: DHASA Conference 2023 (extended deadlines)
https://dh2023.digitalhumanities.org.za/
Please note the extended deadlines below.
Theme: "Digital Humanities for Inclusion"
The Digital Humanities Association of Southern Africa (DHASA) is
pleased to announce its fourth conference, focusing on the theme
"Digital Humanities for Inclusion." In a region where the field of
Digital Humanities is still relatively underdeveloped, this conference
aims to address this gap and foster growth and collaboration in the
field. The conference offers an opportunity for researchers interested
in showcasing their work in the broad field of Digital Humanities to
come together. By doing so, the conference provides a comprehensive
overview of the current state-of-the-art in Digital Humanities,
particularly within the Southern Africa region. As such, we welcome
submissions related to Digital Humanities research conducted by
individuals from Southern Africa or research focused on the
geographical area of Southern Africa.
Furthermore, the conference serves as a platform for information
sharing and networking among researchers passionate about Digital
Humanities. By bringing together experts working on Digital Humanities
in Southern Africa or with a focus on Southern Africa, we aim to
promote collaboration and facilitate further research in this dynamic
field. In addition to the main conference, affiliated workshops and
tutorials will be organized, providing researchers with valuable
insights into novel technologies and tools. These supplementary events
are designed for researchers interested in specific aspects of Digital
Humanities or seeking practical information to enter or advance their
knowledge in the field.
The DHASA conference welcomes interdisciplinary contributions from
researchers in various domains of Digital Humanities, including, but
not limited to, language, literature, visual art, performance and
theatre studies, media studies, music, history, sociology, psychology,
language technologies, library studies, philosophy, methodologies,
software and computation, and more. Our goal is to cultivate an
inclusive scientific community of practice within Digital Humanities.
Suggested topics include the following:
* Digital archives and the preservation of marginalized voices;
* Intersectionality and the digital humanities: exploring the
intersections of race, gender, sexuality, and class in digital research
and activism;
* Activism and social change through digital media: how digital
humanities tools and methodologies can be used to promote inclusion;
* Engaging marginalized communities in the creation and use of digital
tools and resources;
* Exploring the role of digital humanities in decolonizing knowledge
and promoting indigenous perspectives;
* The ethics of data collection and analysis in digital humanities
research;
* The role of digital humanities in promoting inclusive and equitable
pedagogy;
* Digital humanities and inclusion in the context of global
perspectives and international collaborations;
* Critical approaches to digital humanities and inclusion: examining
the limitations and possibilities of digital tools and methodologies in
promoting inclusion;
* Collaborative digital humanities projects with non-profit
organizations, community groups, and cultural institutions; and
* Any other digital humanities-related topic that serves the Southern
African community.
Submission Guidelines
The DHASA conference 2023 asks for three types of submissions:
* Long papers: Authors may submit long papers consisting of a maximum
of 8 content pages and unlimited pages for references and appendix. The
final versions of accepted long papers will be granted an additional
page (up to 9 pages) to incorporate reviewers' comments.
* Short papers: Authors may submit short papers with a maximum of 5
content pages and unlimited pages for references and appendix. The
final versions of accepted short papers will be allowed an extra page
(up to 6 pages) to accommodate reviewers' comments. Short papers
accepted for the conference will be presented as posters.
* Abstracts: Authors can submit abstracts of 250-300 words.
Note that for *all* submission types, you are required to submit an
abstract before the abstract submission deadline. The actual submission
must then be made before the submission deadline.
More information on the submission process can be found on the
submission page: https://dh2023.digitalhumanities.org.za/submission/
We particularly encourage student submissions where the first author is
a student.
All accepted long and short paper submissions that are presented at the
conference will be published in the Journal of Digital Humanities
Association of Southern Africa, see
https://upjournals.up.ac.za/index.php/dhasa. In addition, the abstracts
of the full papers and the lightning talks will be published in a book
of abstracts before the conference.
Important dates
Abstract submission deadline: *22 August 2023*
Submission deadline: *29 August 2023*
Date of notification: 30 September 2023
Camera-ready copy deadline: 6 November 2023
Conference: 27 November 2023 - 1 December 2023
Conference format: Face-to-face
Conference venue: Nelson Mandela University, Eastern Cape, South Africa
NOTE: Non-presenting delegates have the option to attend online.
Co-located events
Several co-located events are currently being prepared, including
workshops and tutorials. These will be updated on the conference
website.
Organizing Committee
* Johannes Sibeko, Nelson Mandela University
* Aby Louw, Council for Scientific and Industrial Research
* Alan Murdoch, Nelson Mandela University
* Amanda du Preez, University of Pretoria
* Andiswa Bukula, South African Centre for Digital Language Resources
* Andiswa Mvanyashe, Nelson Mandela University
* Avashna Govender, Council for Scientific and Industrial Research
* Gabby Dlamini, Nelson Mandela University
* Ilana Wilken, Council for Scientific and Industrial Research
* Jonathan van der Walt, Nelson Mandela University
* Laurette Marais, Council for Scientific and Industrial Research
* Mukhtar Raban, Nelson Mandela University
* Nomfundo Khumalo, Nelson Mandela University
* Menno Van Zaanen, South African Centre for Digital Language Resources
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
*** Call for Participation ***
SIGDIAL & INLG 2023 Conferences
September 11-15, 2023
Prague, Czechia & online
https://sigdialinlg2023.github.io/
Early Registration Deadline: **August 10**
Late Registration Deadline: September 15
Non-presenters Free Registration: August 12 - September 15
Workshops: September 11-12
Main Conferences: September 13-15
The 24th Annual Meeting of the Special Interest Group on Discourse and
Dialogue (SIGdial 2023) and the 16th International Natural Language
Generation Conference (INLG 2023) will be held jointly this year in
Prague, Czechia. The event will be hybrid, but in-person participation
is strongly encouraged! Virtual attendance will be free for
non-presenters.
The organizers of SIGDIAL & INLG 2023 invite all researchers and
practitioners, SIGDial & SIGGEN members, and SIGDIAL & INLG 2023
industry partners and sponsors to join the conference.
The registration is now open, with early rates available until August
10 (see https://sigdialinlg2023.github.io/registration.html). A
limited number of hotel rooms at the conference venue are available for
booking at special rates until August 10.
**SIGDIAL** provides a regular forum for the presentation of
cutting-edge research in discourse and dialogue to both academic and
industry researchers. Continuing a series of 23 successful previous
meetings, this conference spans the research interest areas of
discourse and dialogue. The conference is sponsored by the SIGdial
organization, which serves as the Special Interest Group on discourse
and dialogue for both ACL and ISCA.
**INLG** is a yearly venue for presentations related to all aspects of
Natural Language Generation (NLG), including data-to-text,
concept-to-text, text-to-text and vision-to-text approaches. The event
is organized under the auspices of SIGGEN, the Special Interest Group
on Natural Language Generation of ACL.
The joint conference on Sep 13-15 will feature 4 keynote speeches by
Barbara Di Eugenio, Emmanuel Dupoux, Elena Simperl and Ryan Lowe, as
well as a number of regular paper presentations and system
demonstrations.
- Keynotes: https://sigdialinlg2023.github.io/speakers.html
- SIGDIAL accepted papers: https://2023.sigdial.org/accepted-papers/
- INLG accepted papers: https://inlg2023.github.io/accepted_papers.html
The event includes several workshops on Sep 11-12 (see
https://sigdialinlg2023.github.io/workshops.html):
- YRRSDS: 19th Young Researchers' Roundtable on Spoken Dialogue Systems
- The 1st Workshop on Counter Speech for Online Abuse
- DSTC11: The 11th Dialog System Technology Challenge
- PracticalD2T: 1st Workshop on Practical LLM-assisted Data-to-Text Generation
- Taming Large Language Models: Controllability in the era of
Interactive Assistants
- Workshop on Multimodal, Multilingual Natural Language Generation and
Multilingual WebNLG Challenge
- Connecting multiple disciplines to AI techniques in
interaction-centric autism research and diagnosis
- Designing divergent agent tasks for SDS data collection
We thank you for your support and look forward to welcoming you at the
conference!
Best regards,
SIGDIAL & INLG 2023 Organizers
(Apologies for cross-posting)
Dear Corpora members,
this is to announce that the LREC COLING 2024 website is now available
at: https://lrec-coling-2024.org/
On the website you will find the 2nd Call for Papers for the main
conference, the Workshops CfP and the Tutorials CfP, and the Author's kit,
plus other information about Torino.
All the best,
LREC-COLING 2024 Organizers
Hi guys,
I am going to implement a summarization system for the medical domain in
Italian and Spanish, so I am looking for free summarization datasets in
both the public and medical domains for both languages.
Any help would be appreciated.
Sincerely,
Ciao
--
*Dr. Saeed Farzi,*
Faculty of Computer Engineering,
K. N. Toosi University of Technology, Tehran, Iran.
Phone: +98-21-8462450-401
Fax: +98-21-88462066
P.O. Box: 16315-1355,
Web: http://wp.kntu.ac.ir/saeedfarzi/
Lab: https://www.trlab.ir/
*** Apologies for Cross-Posting ***
The First Arabic Natural Language Processing Conference (ArabicNLP 2023)
co-located with EMNLP 2023 in Singapore.
What's in a name? To mark our move from a workshop to a conference, we
changed our acronym from WANLP to ArabicNLP.
Conference URL: https://arabicnlp2023.sigarab.org/
Submission URL:
https://openreview.net/group?id=SIGARAB.org/ArabicNLP/2023/Conference
ArabicNLP 2023 invites the submission of original long, short, or demo
papers in the area of Arabic Natural Language Processing. ArabicNLP 2023
builds on seven previous workshop editions, which have been extremely
successful, drawing large and active participation in various capacities.
This conference is timely given the continued rise in research projects
focusing on Arabic NLP. ArabicNLP 2023 will also feature shared tasks,
allowing participants to work on specific NLP challenges related to Arabic
language processing. The conference is organized by the Special Interest
Group on Arabic NLP (SIGARAB), an Association for Computational Linguistics
Special Interest Group on Arabic Natural Language Processing.
Important Dates
- May 7, 2023: submission of shared task proposals
- May 14, 2023: notification of acceptance of shared tasks
- September 5, 2023: conference papers due
- October 12, 2023: notification of acceptance
- October 20, 2023: camera-ready papers due
- December 7, 2023: conference day
All deadlines are 11:59 pm UTC -12h
<https://www.timeanddate.com/time/zone/timezone/utc-12> (“Anywhere on
Earth”).
We accept long (up to 8 pages), short (up to 4 pages), and demo paper (up
to 4 pages) submissions. Long and short papers will be presented orally or
as posters as determined by the program committee.
Submissions are invited on topics that include, but are not limited to, the
following:
- Enabling core technologies: language models and large language models,
morphological analysis, disambiguation, tokenization, POS tagging, named
entity detection, chunking, parsing, semantic role labeling, sentiment
analysis, Arabic dialect modeling, etc.
- Applications: dialog modeling, machine translation, speech recognition,
speech synthesis, optical character recognition, pedagogy, assistive
technologies, social media, etc.
- Resources: dictionaries, annotated data, corpora, etc.
Submissions may include work in progress as well as finished work.
Submissions must have a clear focus on specific issues pertaining to the
Arabic language, whether it is Standard Arabic, dialectal, classical, or
mixed. Papers on other languages sharing problems faced by Arabic NLP
researchers, such as Semitic languages or languages using Arabic script,
are welcome provided that they propose techniques or approaches that would
be of interest to Arabic NLP, and they explain why this is the case.
Additionally, papers on efforts using Arabic resources but targeting other
languages are also welcome. Descriptions of commercial systems are welcome,
but authors should be willing to discuss the details of their work.
If you have any questions, please contact us at:
arabicnlp-pc-chairs(a)sigarab.org
The ArabicNLP 2023 Publicity Chairs,
Amr Keleg and Salam Khalifa
On 8/3/23, Toms Bergmanis <toms.bergmanis(a)tilde.lv> wrote:
...
I, for one, have benefited from Ada's, as well as other members',
suggestions and comments, as I hope they have somehow benefited from
mine.
lbrtchx
1st Call for Papers: Special Issue of the Computational Linguistics journal
on Language Learning, Representation, and Processing in Humans and Machines
Guest Editors
Marianna Apidianaki (University of Pennsylvania)
Abdellah Fourtassi (Aix Marseille University)
Sebastian Padó (University of Stuttgart)
*Submission deadline: December 10, 2023*
Large language models (LLMs) acquire rich world knowledge from the data
they are exposed to during training, in a way that appears to parallel how
children learn from the language they hear around them. Indeed, since the
introduction of these powerful models, there has been a general feeling
among researchers in both NLP and cognitive science that a systematic
understanding of how these models work and how they use the knowledge they
encode would shed light on the way humans acquire, represent, and process
this same knowledge (and vice versa).
Yet, despite the similarities, there are important differences between
machines and humans that have prevented a direct translation of insights
from the analysis of LLMs to a deeper understanding of human learning.
Chief among these differences is that the size of data required to train
LLMs far exceeds -- by several orders of magnitude -- the data children
need to acquire sophisticated conceptual structures and meanings. Besides,
the engineering-driven architectures of LLMs do not appear to have obvious
equivalents in children's cognitive apparatus, at least as studied by
standard methods in experimental psychology. Finally, children acquire
world knowledge not only via exposure to language but also via sensory
experience and social interaction.
This edited volume aims to create a forum of exchange and debate between
linguists, cognitive scientists and experts in deep learning, NLP and
computational linguistics, on the broad topic of learning in humans and
machines. Experts from these communities can contribute with empirical and
theoretical papers that advance our understanding of this question.
Submissions might address the acquisition of different types of linguistic
and world knowledge. Additionally, we invite contributions that
characterize and address challenges related to the mismatch between humans
and LLMs in terms of the size and nature of input data, and the involved
learning and processing mechanisms.
Topics include, but are not limited to:
- Grounded learning: comparison of unimodal (e.g., text) vs multimodal
(e.g., images and video) learning.
- Social learning: comparison of input-driven mechanisms vs.
interaction-based learning.
- Exploration of different knowledge types (e.g., procedural /
declarative); knowledge integration and inference in LLMs.
- Methods to characterize and quantify human-like language learning or
processing in LLMs.
- Interpretability/probing methods addressing the linguistic and world
knowledge encoded in LLM representations.
- Knowledge enrichment methods aimed at improving the quality and
quantity of the knowledge encoded in LLMs.
- Semantic representation and processing in humans and machines in terms
of, e.g., abstractions made, structure of the lexicon, property inheritance
and generalization, geometrical approaches to meaning representation,
mental associations, and meaning retrieval.
- Bilingualism in humans and machines; second language acquisition in
children and adults; construction of multi-lingual spaces and cross-lingual
correspondences.
- Exploration of language models that incorporate cognitively plausible
mechanisms and reasonably-sized training data.
- Use of techniques from other disciplines (e.g., neuroscience or
computer vision) for analyzing and evaluating LLMs.
- Open-source tools for analysis, visualization, or explanation.
Submission Instructions
Papers should be formatted according to the Computational Linguistics style
guidelines: https://cljournal.org/
We accept both long and short papers. Long papers are between 25 and 40
journal pages in length; short papers are between 15 and 25 pages in length.
Papers for this special issue will be submitted through the CL electronic
submission system, just like regular papers:
https://cljournal.org/submissions.html
Authors of special issue papers will need to select “Special Issue on LLRP”
under the Journal Section heading in the CL submission system. Please note
that papers submitted to a special issue undergo the same reviewing process
as regular papers.
Timeline
Deadline for submissions: December 10, 2023
Notification after 1st round of reviewing: February 10, 2024
Revised versions of the papers: April 30, 2024
Final decisions: June 10, 2024
Final version of the papers: July 1, 2024
Guest Editors
Marianna Apidianaki
marapi(a)seas.upenn.edu
Abdellah Fourtassi
abdellah.fourtassi(a)gmail.com
Sebastian Padó
pado(a)ims.uni-stuttgart.de
*Computational Linguistics* is the longest-running flagship journal of the
Association for Computational Linguistics. The journal has a high impact
factor: 9.3 in 2022 and 7.778 in 2021. Average time to first decision of
regular papers and full survey papers (excluding desk rejects) is 34 days
for the period January to May 2023, and 47 days for the period January to
December 2022.