(Please find Turkish and Russian Translations below.)
We are excited to invite you to join SIGTURK, the new ACL special interest
group working on improving and promoting computational linguistics in
Turkic languages. SIGTURK grew out of our recent initiatives to prepare
public resources for language studies in Turkic languages, including
pre-trained language models, evaluation benchmarks and multilingual
corpora. These resources are intended for researchers who work on natural
language processing or linguistics and are interested in Turkic languages
or in related problems around low-resource languages and few-shot
learning. We are happy to now have the official opportunity to foster a new
interdisciplinary scientific community that aims to promote an inclusive
and diverse environment for exchanging ideas, promoting collaborations and
making language research and technologies applicable to more languages and
domains. As part of our ongoing projects, we have recently started working
on a new multilingual natural language processing benchmark for Turkic
languages, and one of our future plans is to hold an annual workshop to
allow productive knowledge exchange and collaboration.
We are already an international community of scientists and professionals
from many different backgrounds and from all over the world. We would be
glad to see anyone interested join our group, and we are happy to welcome
new ideas and promote innovative research in different directions.
If you are interested in becoming a member of SIGTURK, please fill out the
form below: https://forms.gle/NwWJ8B4LSRfEi2Wx5
And for more information visit our website: https://sigturk.github.io
Sincerely,
Duygu Ataman
SIGTURK President
Assistant Professor/Faculty Fellow
New York University
—
(Translated from Turkish)
Hello,
I am pleased to invite you to ACL SIGTURK, the new international working
group we have founded to develop and promote natural language processing
for Turkic languages. Over the past year, we started an international
working group to organize activities such as preparing pre-trained language
models that can be used to build systems for tasks like machine translation
and language modeling in Turkish and the Turkic languages, as well as open
resources and corpora that linguistics and natural language processing
groups can use in other applications for Turkic languages. With great
dedication, we have now made this group official. We are currently working
on a multilingual natural language processing corpus for the Turkic
language family. We coordinate our work through our mailing list, and next
year we hope to organize a workshop where we can share our projects and
offer everyone working in our field internationally an opportunity to grow
their professional networks.
We would be very glad to welcome anyone who wishes to be part of the group
and to contribute to our work as we bring together colleagues working on
linguistics and language processing in Turkic languages under one official
roof. Our community already has many members from many countries who speak
Turkic languages, work in different fields including linguistics and
computer science, and are eager to collaborate on a voluntary basis.
You can join the group by adding your contact details to the form below.
https://forms.gle/NwWJ8B4LSRfEi2Wx5
We would be delighted if you also invited colleagues who you think might
want to take part in our work in this area.
For more information, you can also visit our website.
https://sigturk.github.io
Best regards,
Duygu Ataman
—
(Translated from Russian)
We are pleased to invite you to join SIGTURK, a new special interest group
of the Association for Computational Linguistics (ACL SIG) working to
improve and promote computational linguistics for Turkic languages. Our
recent initiatives to prepare publicly available resources for language
research on Turkic languages include pre-trained language models,
evaluation benchmarks and multilingual corpora for researchers who work on
topics related to natural language processing and linguistics and are
interested in Turkic languages, as well as for those interested in machine
learning and natural language processing for low-resource languages. We are
glad to create a new interdisciplinary scientific community whose goal is
to build an inclusive and diverse environment for exchanging ideas,
developing collaborations and enabling language research and technologies
for a larger number of languages and domains. As part of our ongoing
projects, we have recently started working on a new multilingual natural
language processing benchmark for Turkic languages, and one of our future
plans is also to hold an annual workshop to enable productive knowledge
exchange and collaboration.
As an international community of researchers from all over the world,
already made up of scientists and professionals from a wide range of
fields, we will be glad to see anyone interested in joining our group, and
we are happy to welcome new ideas and promote innovative research in
various directions.
If you are interested in becoming a member of the community, please fill
out the following form: https://forms.gle/NwWJ8B4LSRfEi2Wx5
For more information, please visit our website:
https://sigturk.github.io
Sincerely,
Duygu Ataman
-
Duygu Ataman, Ph.D.
Assistant Professor/Faculty Fellow
Courant Institute of Mathematical Sciences
New York University
Please share this invitation with your network.
Thanks,
Eric Atwell, School of Computing, University of Leeds
________________________________
School of Computing Colloquium, University of Leeds, Friday 24.2.23 12:00 (UK)
Guest speaker: Dr Rana Malhas, Qatar University
Title: Arabic Reading Comprehension on the Holy Qur’an using CL-AraBERT
Venue: Teams. Click here to join the meeting <https://emea01.safelinks.protection.outlook.com/ap/t-59584e83/?url=https%3A…>
Abstract: In this talk, we tackle the problem of machine reading comprehension (MRC) on the Holy Qur’an to address the lack of Arabic datasets and systems for this important task. We construct QRCD as the first Qur’anic Reading Comprehension Dataset, composed of 1,337 question-passage-answer triplets for 1,093 question-passage pairs, of which 14% are multi-answer questions. We then introduce CLassical-AraBERT (CL-AraBERT for short), a new AraBERT-based pre-trained model, which is further pre-trained on a Classical Arabic (CA) dataset of about 1.0B words, to complement the Modern Standard Arabic (MSA) resources used in pre-training the initial model, and make it a better fit for the task. Finally, we leverage cross-lingual transfer learning from MSA to CA, and fine-tune CL-AraBERT as a reader using two MSA-based MRC datasets followed by our QRCD dataset to constitute the first (to the best of our knowledge) MRC system on the Holy Qur’an. To evaluate our system, we introduce Partial Average Precision (pAP) as an adapted version of the traditional rank-based Average Precision measure, which integrates partial matching in the evaluation over multi-answer and single-answer MSA questions. Adopting two experimental evaluation setups (hold-out and cross validation (CV)), we empirically show that the fine-tuned CL-AraBERT reader model significantly outperforms the baseline fine-tuned AraBERT reader model by 6.12 and 3.75 points in pAP scores, in the hold-out and CV setups, respectively. To promote further research on this task and other related tasks on Qur’an and Classical Arabic text, we make both the QRCD dataset and the pre-trained CL-AraBERT model publicly available.
________________________________________________________________________________
Microsoft Teams meeting
Join on your computer, mobile app or room device
Click here to join the meeting<https://emea01.safelinks.protection.outlook.com/ap/t-59584e83/?url=https%3A…>
Meeting ID: 355 926 814 82
Passcode: 4dJRfD
________________________________________________________________________________
*** Call for Participation for GUA-SPA at IberLEF 2023 ***
GUA-SPA - Guarani-Spanish Code Switching Analysis at IberLEF 2023
https://codalab.lisn.upsaclay.fr/competitions/11030
Guarani is a South American indigenous language that has been in contact with Spanish and other Indo-European languages for about 500 years, which has resulted in many interesting varieties with different levels of mixture. In Paraguay, according to the most recent census, most of the population of the country speak at least some Guarani, and there is a high prevalence of Guarani-Spanish bilingualism. Bilingual speakers often make use of the two languages at the same time, mixing them in different ways, in a phenomenon called code-switching. We propose a challenge for analyzing code-switched texts in Guarani and Spanish, trying to identify the language used in each span of text, the named entities mentioned in the text, and the way Spanish is used. For this, we will provide a corpus of news and tweets where each token is labeled with an appropriate language or category identifier.
Three tasks are presented:
* Language identification in code-switched data: identify each token in the text as Guarani, Spanish, mix, named entity, or other categories.
* Named entity classification: classify the named entities found in the text as locations, organizations or people.
* Spanish code classification: classify the spans in Spanish as code changes or unadapted loans.
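As a rough illustration of the token-level labeling described above, the following Python sketch groups consecutive tokens that share a label into spans. The label names (gn, es, other) and the example sentence are invented for illustration only; the actual tag set and data format are defined by the task organizers.

```python
from itertools import groupby


def label_spans(tokens, labels):
    """Group consecutive tokens that carry the same label into (label, text) spans."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    spans = []
    # groupby only merges *adjacent* items with the same key, which is what we want here.
    for label, group in groupby(zip(tokens, labels), key=lambda pair: pair[1]):
        spans.append((label, " ".join(token for token, _ in group)))
    return spans


# Hypothetical code-switched sentence with hypothetical labels.
tokens = ["Che", "róga", "está", "cerca", "."]
labels = ["gn", "gn", "es", "es", "other"]
print(label_spans(tokens, labels))
```

A span-level view like this can make it easier to inspect where a text switches between languages.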
How to participate:
If you want to participate in this task, please join our Codalab competition: https://codalab.lisn.upsaclay.fr/competitions/11030
Important Dates:
* March 22nd, 2023: training set.
* May 24th, 2023: test set and open for submissions.
* June 7th, 2023: publication of results.
* June 14th, 2023: paper submission.
* June 28th, 2023: notification of acceptance.
* July 3rd, 2023: camera-ready paper submission.
* September, 2023: IberLEF 2023 Workshop.
Binary Gender Prediction From Long Texts
Competition on classification of authors' binary gender from their writing style
Competition site:
https://www.kaggle.com/competitions/binary-gender-prediction-from-long-text…
Timeline
- February 20, 2023: Competitions are published on the website and
opened to participants.
- February 20, 2023: Registration for the competition. In addition to
the registration on Kaggle, also fill out the registration form
<https://forms.gle/LhYuGcjnYSBj5YgY6>.
- February 20, 2023: Training and validation datasets are available for
registered participants.
- April 13, 2023: Test dataset available.
- April 20, 2023, 23:59 UTC: Delivery of results with a short description
of the method.
This competition is organized in conjunction with the 1st International
Workshop on Implicit Author Characterization from Texts for Search and
Retrieval (IACT’23) <https://en.sce.ac.il/news/iact23>, which will be held
in Taipei, Taiwan on July 27.
The ability to accurately determine the binary gender of an author based on
their writing style opens up a wealth of possibilities in a variety of
fields. In historical document analysis, for example, researchers can gain
deeper insights into the role and representation of women in different
societies. In forensic investigations, this information can be used to
focus the investigation on a smaller and more targeted group, thereby
increasing the efficiency and effectiveness of the investigation.
By participating in this competition, researchers will have the chance to
contribute to this growing field and make a meaningful impact on society.
Whether you're a seasoned NLP professional or just starting out, this
competition presents a unique opportunity to challenge yourself and take
your skills to the next level. So don't miss out on this exciting
opportunity to make your mark in the world of IR.
Organizers:
- Marina Litvak - marinal(a)ac.sce.ac.il; Shamoon College of Engineering
Beer Sheva; Israel
- Irina Rabaev - irinar(a)ac.sce.ac.il; Shamoon College of Engineering
Beer Sheva; Israel
- Alípio Mário Jorge - amjorge(a)fc.up.pt; University of Porto; Porto,
Portugal
- Ricardo Campos - ricardo.campos(a)ipt.pt; Polytechnic Institute of Tomar
INESC TEC, Portugal; Porto, Portugal
- Adam Jatowt - adam.jatowt(a)uibk.ac.at; University of Innsbruck;
Innsbruck, Austria
- Vladimir Yonkin - vladiyo(a)ac.sce.ac.il; Shamoon College of
Engineering, Beer Sheva, Israel
--
Best regards,
Marina Litvak
Call for Papers: The 1st International Workshop on Implicit Author
Characterization from Texts for Search and Retrieval (IACT’23)
The workshop will be held in conjunction with the 46th International ACM
SIGIR Conference on Research and Development in Information Retrieval
Workshop website: https://en.sce.ac.il/news/iact23
July 27, 2023. Taipei, Taiwan.
Paper submission deadline: April 25, 2023, AoE
Submission link: https://easychair.org/conferences/?conf=iact23
To bring the research community's attention to the limitations of current
models at capturing the hidden author characteristics from texts, we
organize the first edition of IACT workshops under the umbrella of the
SIGIR conference. Research works submitted to the workshop should foster
the scientific advance on all aspects of implicit author information
extraction from text.
All papers must be original and not simultaneously submitted to another
journal or conference. The following paper categories are welcome:
Research papers: up to 8 pages. Original and high-quality unpublished
contributions to the theory and practical aspects of the implicit author
characterization task. Full papers should introduce existing approaches and
describe the methodology and the experiments conducted in detail. Negative
result papers highlighting tested hypotheses that did not get the expected
outcome are also welcomed.
Shared task papers: up to 5 pages.
The submissions must be anonymous and will be peer-reviewed by at least two
program committee members.
Papers must be submitted electronically in PDF format through EasyChair
<https://easychair.org/conferences/?conf=iact23>. All submissions must be
in English and formatted according to the one-column CEUR-ART style with no
page numbers. Templates in Word or LaTeX can be found in the following zip
folder at http://ceur-ws.org/Vol-XXX/CEURART.zip. There is also an Overleaf
page
<https://www.overleaf.com/latex/templates/template-for-submissions-to-ceur-w…>
for LaTeX users.
Topics of interest include, but are not limited to, the following:
- Query understanding
- Personalized information search and retrieval
- Demographics (gender, age, location, etc.) prediction and analysis
- Profiling protagonists
- Multi-modal, multi-genre, and multilingual author analysis
- Detecting implicit expressions of sentiment, emotion, opinion, and bias
- Transfer learning for implicit author characterization
- Implicit author profiling annotation schema
- Evaluation of implicit author profiling
- Differentiation between AI-generated content and human-generated
content and bot profiling
- Author profiling in low-resource languages and under-studied domains
- Exploration of the relations between self-reported or binary gender
identity and implicit personality traits
- Mental disorders and risk prediction
- Author identification and verification
Organizing Committee:
- Marina Litvak - marinal(a)ac.sce.ac.il; Shamoon College of Engineering
Beer Sheva; Israel
- Irina Rabaev - irinar(a)ac.sce.ac.il; Shamoon College of Engineering
Beer Sheva; Israel
- Alípio Mário Jorge - amjorge(a)fc.up.pt; University of Porto; Porto,
Portugal
- Ricardo Campos - ricardo.campos(a)ipt.pt; Polytechnic Institute of Tomar
INESC TEC, Portugal; Porto, Portugal
- Adam Jatowt - adam.jatowt(a)uibk.ac.at; University of Innsbruck;
Innsbruck, Austria
Invited Speakers:
- Prof. Mark Last - Ben-Gurion University of the Negev, Israel
- Prof. Dr. Valia Kordoni - Humboldt-Universität Berlin, Germany
IACT’23 proceedings will be published in the CEUR Workshop Proceedings
(indexed in Scopus and DBLP) as long as they do not conflict with previous
publication rights.
Contact:
- Dr. Marina Litvak: litvak.marina(a)gmail.com
- Dr. Irina Rabaev: irinar(a)ac.sce.ac.il
--
Best regards,
Marina Litvak
The 24th Annual Meeting of the Special Interest Group on Discourse and
Dialogue (SIGDIAL) and the 16th International Natural Language Generation
Conference (INLG) will be held jointly in Prague on September 11-15, 2023.
The SIGDIAL venue provides a regular forum for the presentation of cutting
edge research in dialogue and discourse to both academic and industry
researchers, continuing a series of 23 successful previous meetings. The
conference is sponsored by the SIGDIAL organization - the Special Interest
Group in discourse and dialogue for ACL and ISCA.
Topics of Interest
We welcome formal, corpus-based, implementation, experimental, or
analytical work on discourse and dialogue including, but not restricted to,
the following themes:
* Discourse Processing: Rhetorical and coherence relations, discourse
parsing and discourse connectives. Reference resolution. Event
representation and causality in narrative. Argument mining. Quality and
style in text. Cross-lingual discourse analysis. Discourse issues in
applications such as machine translation, text summarization, essay
grading, question answering and information retrieval. Discourse issues in
text generated by large language models.
* Dialogue Systems: Task oriented and open domain spoken, multi-modal,
embedded, situated, and text-based dialogue systems, their components,
evaluation and applications. Knowledge representation and extraction for
dialogue. State representation, tracking and policy learning. Social and
emotional intelligence. Dialogue issues in virtual reality and human-robot
interaction. Entrainment, alignment and priming. Generation for dialogue.
Style, voice, and personality. Safety and ethics issues in Dialogue.
* Corpora, Tools and Methodology: Corpus-based and experimental work on
discourse and dialogue, including supporting topics such as annotation
tools and schemes, crowdsourcing, evaluation methodology and corpora.
* Pragmatic and Semantic Modeling: Pragmatics and semantics of
conversations (i.e., beyond a single sentence), e.g., rational speech act,
conversation acts, intentions, conversational implicature, presuppositions.
* Applications of Dialogue and Discourse Processing Technology.
Submissions
The program committee welcomes the submission of long papers, short papers,
and demo descriptions. Submitted long papers may be accepted for oral or
for poster presentation. Accepted short papers will be presented as posters.
* Long paper submissions must describe substantial, original, completed
and unpublished work. Wherever appropriate, concrete evaluation and
analysis should be included. Long papers must be no longer than 8 pages,
including title, text, figures and tables. An unlimited number of pages is
allowed for references. Two additional pages are allowed for appendices
containing sample discourses/dialogues and algorithms, and an extra page is
allowed in the final version to address reviewers’ comments.
* Short paper submissions must describe original and unpublished work.
Please note that a short paper is not a shortened long paper. Instead short
papers should have a point that can be made in a few pages, such as a
small, focused contribution; a negative result; or an interesting
application nugget. Short papers should be no longer than 4 pages including
title, text, figures and tables. An unlimited number of pages is allowed
for references. One additional page is allowed for sample
discourses/dialogues and algorithms, and an extra page is allowed in the
final version to address reviewers’ comments.
* Demo descriptions should be no longer than 4 pages including title,
text, examples, figures, tables and references. A separate one-page
document should be provided to the program co-chairs for demo descriptions,
specifying furniture and equipment needed for the demo.
Authors are encouraged to also submit additional accompanying materials,
such as corpora (or corpus examples), demo code, videos and sound files.
Multiple Submissions
SIGDIAL 2023 cannot accept work for publication or presentation that will
be (or has been) published elsewhere, or that has been or will be
submitted to other meetings or publications whose review periods overlap
with that of SIGDIAL. Overlap with the SIGDIAL workshop submissions is
permitted for non-archived workshop proceedings. Any questions regarding
submissions can be sent to program-chairs [at] sigdial.org.
Blind Review
Building on previous years’ move to anonymous long and short paper
submissions, SIGDIAL 2023 will follow the ACL policies for preserving the
integrity of double blind review (see author guidelines
<https://www.aclweb.org/adminwiki/index.php?title=ACL_Author_Guidelines>).
Unlike long and short papers, demo descriptions will not be anonymous. Demo
descriptions should include the authors’ names and affiliations, and
self-references are allowed.
Submission Format
All long, short, and demonstration submissions must follow the two-column
ACL format, which is available as an Overleaf template
<https://www.overleaf.com/read/crtcwgxzjskr> and is also downloadable directly
<https://github.com/acl-org/acl-style-files> (LaTeX and Word).
Submissions must conform to the official ACL style guidelines, which are
contained in these templates. Submissions must be electronic, in PDF format.
Submission Deadline
SIGDIAL will accept regular submissions through the Softconf/START system,
as well as commitment of already reviewed papers through the ACL Rolling
Review (ARR) system.
Regular submission
Authors have to fill in the submission form in the Softconf/START system
and upload an initial PDF of their papers before May 15, 2023 (23:59
GMT-11). Details and the submission link will be posted on the conference
website <https://2023.sigdial.org/>.
Submission via ACL Rolling Review (ARR) <https://aclrollingreview.org/>
Please refer to the ARR Call for Papers <https://aclrollingreview.org/cfp>
for detailed information about submission guidelines to ARR. The commitment
deadline for authors to submit their reviewed papers, reviews, and
meta-review to SIGDIAL 2023 is June 19, 2023. Note that the paper needs to
be fully reviewed by ARR in order to make a commitment, thus the latest
date for ARR submission will be April 15, 2023.
Mentoring
Acceptable submissions that require language (English) or organizational
assistance will be flagged for mentoring, and accepted with a
recommendation to revise with the help of a mentor. An experienced mentor
who has previously published in the SIGDIAL venue will then help the
authors of these flagged papers prepare their submissions for publication.
Best Paper Awards
In order to recognize significant advancements in dialogue/discourse
science and technology, SIGDIAL 2023 will include best paper awards. All
papers at the conference are eligible for the best paper awards. A
selection committee consisting of prominent researchers in the fields of
interest will select the recipients of the awards.
SIGDIAL 2023 Program Committee
Svetlana Stoyanchev and Shafiq Rayhan Joty
Conference Website: https://2023.sigdial.org/
Dear colleagues,
A while ago we sent out a survey asking about your experiences with and opinions about error analysis. We are happy to announce that our paper based on this survey has been published in NEJLT: https://nejlt.ep.liu.se/article/view/4529
We hope that our paper will help foster a discussion about the publication culture in NLG, and in NLP more generally. We would also like to thank you again for filling in our survey. I consider myself very lucky to be part of this community.
Best wishes, also on behalf of my co-authors,
Emiel van Miltenburg
==============================================================
WebNLG Challenge 2023: Call for Participation
Special focus on multilingual NLG for under-resourced languages
https://webnlg-challenge/challenge_2023/
===============================================================
We are delighted to announce a new edition of the WebNLG challenge, which will take place in 2023. WebNLG 2023 will focus on multilingual generation for under-resourced languages: Maltese, Irish, Breton and Welsh. In addition, WebNLG 2023 will once again include Russian, which was first featured in WebNLG 2020.
* MOTIVATION
With the development of large-scale pretrained models, research in automatic text generation has acquired new impetus. Yet, the current state-of-the-art is dominated by a handful of languages, for which training data is relatively easy to acquire. At the same time, the field has recently witnessed some encouraging developments which focus on generation for under-resourced and under-represented languages. This trend is paralleled by a growing interest in multilingual models and applications in NLP more broadly.
The WebNLG 2023 Challenge is being organised in response to these trends and specifically addresses generation in few-shot and/or zero-shot settings for four under-resourced languages.
* About WebNLG
The WebNLG Challenge consists in mapping data, in the form of RDF triples, to natural language text. The input is a set of RDF triples sourced from DBpedia, for example:
(John_E_Blaha birthDate 1942_08_26)
(John_E_Blaha birthPlace San_Antonio)
(John_E_Blaha occupation Fighter_pilot)
where the corresponding output text might be:
John E Blaha, born in San Antonio on 1942-08-26, worked as a fighter pilot
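To make the input/output mapping above concrete, here is a minimal Python sketch of a template-based baseline. It is only an illustration under stated assumptions: the parenthesised one-line triple format is taken from the example above, while the Triple type, the TEMPLATES table and the helper names are hypothetical and are not part of the official WebNLG data format or tooling.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def parse_triple(line: str) -> Triple:
    """Parse a '(subject predicate object)' line as written in the example above."""
    subject, predicate, obj = line.strip().strip("()").split()
    return Triple(subject, predicate, obj)


def readable(token: str) -> str:
    """Turn an underscore-joined identifier into plain words."""
    return token.replace("_", " ")


# Hypothetical hand-written templates: predicate -> (sentence pattern, object formatter).
TEMPLATES = {
    "birthDate": ("{s} was born on {o}.", lambda o: o.replace("_", "-")),
    "birthPlace": ("{s} was born in {o}.", readable),
    "occupation": ("{s} worked as a {o}.", lambda o: readable(o).lower()),
}


def verbalize(triples):
    """Produce one simple sentence per triple; unknown predicates fall back to a generic pattern."""
    sentences = []
    for t in triples:
        template, format_obj = TEMPLATES.get(t.predicate, ("{s} {p} {o}.", readable))
        sentences.append(
            template.format(s=readable(t.subject), p=t.predicate, o=format_obj(t.obj))
        )
    return " ".join(sentences)


lines = [
    "(John_E_Blaha birthDate 1942_08_26)",
    "(John_E_Blaha birthPlace San_Antonio)",
    "(John_E_Blaha occupation Fighter_pilot)",
]
print(verbalize([parse_triple(line) for line in lines]))
```

Unlike the fluent single-sentence output shown above, this sketch emits one sentence per triple; aggregating facts into a natural text is part of what makes the task challenging.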
The WebNLG challenge was launched in 2017. A second edition, in 2020, extended the task to Russian, in addition to English.
* WebNLG 2023
The new edition of WebNLG focuses on four under-resourced languages which are severely under-represented in research on text generation, namely Maltese, Irish, Breton and Welsh. In addition, WebNLG 2023 will again include Russian, a language featured in WebNLG 2020.
For WebNLG 2023, we are soliciting submissions encompassing a variety of approaches to automatic text generation, from neural architectures to rule-based systems. We especially encourage submissions addressing generation in few-shot or zero-shot settings.
* DATA
Development and test data is now available for all 5 languages, namely Breton, Maltese, Irish and Welsh (the target languages for WebNLG 2023), as well as Russian. Participants can download the development data; the test data will be reserved for the final evaluation.
Data for each language was obtained by sourcing high-quality, professional translations of the original English texts in the WebNLG 2020 dev and test sets.
Training data is also available for the original WebNLG English data and, as per WebNLG 2020, for Russian. In addition, we provide ‘noisy’ training data for the target languages (Maltese, Breton, Welsh and Irish), obtained via machine translation of the texts in the English WebNLG 2020 train split.
* EVALUATION
As in previous editions of WebNLG, submitted results will be evaluated using both automatic and human evaluation methods.
* INSTRUCTIONS FOR PARTICIPANTS
Data and instructions for the task are available from the WebNLG repo:
https://github.com/WebNLG/2023-Challenge
Teams who submit systems for evaluation at WebNLG 2023 will subsequently be invited to contribute a short paper describing their approach and results. The task as a whole, as well as individual submissions, will be presented at a special session in an event to be announced later.
General information about the WebNLG challenges can be found on the following URL: https://webnlg-challenge/challenge_2023/
* TIMELINE
February 2023: First call for participation.
Development data and noisy training data available.
8 June 2023: Release of test data
15 June 2023: Deadline for submission of system outputs.
15 August 2023: Deadline for submission of short papers describing systems.
The final presentation of results will be held during a workshop. Current plans are to hold this in September 2023.
* ORGANISATION:
WebNLG 2023 is being organised under the auspices of LT-Bridge, supported by the Horizon 2020 Work Programme Spreading Excellence and Widening Participation (WIDESPREAD) 2018-2020 and the ANR funded xNLG Chair on multi-lingual, multi-source NLG.
Claire Gardent, CNRS/LORIA, Nancy, France
Albert Gatt, Utrecht University, The Netherlands and University of Malta
Claudia Borg, University of Malta
Enrico Aquilina, University of Malta
Anya Belz, Dublin City University, Ireland
John Judge, Dublin City University, Ireland
Liam Cripwell, CNRS/LORIA and Université de Lorraine, Nancy, France
William Soto-Martinez, CNRS/LORIA and Université de Lorraine, Nancy, France
Contact: webnlg-challenge(a)inria.fr
UPDATE: Following requests, we are extending the deadline for system submissions for the MultiGED-2023 shared task by one week. The revised dates are:
06 March, 2023 (previously 27 February, 2023) - test data released
10 March, 2023 (previously 03 March, 2023) - system submission deadline (system output)
14 March, 2023 (previously 10 March, 2023) - results announced
03 April, 2023 - paper submission deadline with system descriptions.
Call for participation -- Shared task on Multilingual Grammatical Error Detection (MultiGED-2023) on Czech, English, German, Italian and Swedish
Official website for the shared task: https://github.com/spraakbanken/multiged-2023
The Computational SLA<https://spraakbanken.gu.se/en/compsla> working group invites you to participate in the first shared task on Multilingual Grammatical Error Detection, MultiGED-2023, which includes five languages: Czech, English, German, Italian and Swedish.
The aim of this shared task is to detect tokens in need of correction across five different languages, labeling them as either correct ("c") or incorrect ("i"), i.e. performing binary classification at the token level. You can work on one of the provided languages or any combination of languages.
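The token-level binary classification described above can be scored along the following lines. This Python sketch is an assumption-laden illustration: the announcement does not state the evaluation metric, so precision, recall and an F-beta score on the incorrect ("i") class, with beta = 0.5 as a default, are chosen here only as an example, and the function name score_tokens is hypothetical. The official scoring is the one implemented on CodaLab.

```python
def score_tokens(gold, pred, beta=0.5):
    """Return (precision, recall, F-beta) for the 'i' (incorrect) label.

    gold and pred are equal-length sequences of "c"/"i" labels, one per token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(g == "i" and p == "i" for g, p in zip(gold, pred))
    fp = sum(g == "c" and p == "i" for g, p in zip(gold, pred))
    fn = sum(g == "i" and p == "c" for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


# Hypothetical example: one sentence of four tokens.
gold = ["c", "i", "c", "i"]
pred = ["c", "i", "i", "c"]
print(score_tokens(gold, pred))
```

With beta below 1, precision is weighted more heavily than recall, which penalizes flagging correct tokens as errors.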
More details about the task: https://github.com/spraakbanken/multiged-2023
The shared task is part of the NLP4CALL workshop<https://spraakbanken.gu.se/en/research/themes/icall/nlp4call-workshop-serie…>, which will take place on 22 May 2023, co-located with the NoDaLiDa conference <https://www.nodalida2023.fo/> to be held in the Faroe Islands. Accepted papers with system descriptions will be published in the workshop proceedings and double-published through the ACL Anthology.
Timeline:
* 23 January, 2023 - first call for participation. Training and validation data released, CodaLab opens for team registrations.
* 14 February, 2023 - second call/reminder
* 10 March, 2023 (previously 03 March, 2023) - system submission deadline (system output)
* 14 March, 2023 (previously 10 March, 2023) - results announced
* 03 April, 2023 - paper submission deadline with system descriptions. We encourage you to share models, code, fact sheets, extra data, etc. with the community through GitHub or other repositories on paper publication.
* 21 April, 2023 - paper reviews sent to the authors
* 01 May, 2023 - camera-ready deadline
* 22 May, 2023 - presentations of the systems at NLP4CALL workshop
To register for/express interest in the shared task, please fill in this form <https://forms.gle/DgwTNmTCQhsmrbxq6>.
To ask questions and to get important information and updates about the shared task, please join the MultiGED-2023 Google Group <https://groups.google.com/g/multiged-2023>.
Official system evaluation will be carried out on CodaLab <https://codalab.lisn.upsaclay.fr/competitions/9784>.
Organizers:
* Elena Volodina<https://spraakbanken.gu.se/en/about/staff/elena>, University of Gothenburg, Sweden
* Chris Bryant<https://www.cst.cam.ac.uk/people/cjb255>, University of Cambridge, UK
* Andrew Caines<https://www.cl.cam.ac.uk/~apc38/>, University of Cambridge, UK
* Orphee De Clercq<https://research.flw.ugent.be/nl/orphee.declercq>, Ghent University, Belgium
* Jennifer-Carmen Frey<https://www.eurac.edu/en/people/jennifer-carmen-frey>, EURAC Research, Italy
* Elizaveta Ershova, JetBrains, Cyprus
* Alexandr Rosen<http://utkl.ff.cuni.cz/~rosen/>, Charles University, Czech Republic
* Olga Vinogradova, Independent researcher, Israel
Please feel free to forward this call to those who might be interested.
___________________
Elena Volodina, PhD, Docent
https://spraakbanken.gu.se/en/about/staff/elena
Life is like a mirror. Smile at it and it smiles back at you.
Peace Pilgrim