We have a vacancy for a professor of Natural Language Processing (from September 15, 2023).
The successful candidate will join the CLiPS Research Centre with research on NLP and Machine Learning, and teach in the MA on Digital Text Analysis.
The deadline for applications is 28 March 2023.
Vacancy description and link to application site:
https://www.uantwerpen.be/en/jobs/vacancies/academic-staff/?q=2634&descr=Ac…
For more information, candidates can contact me by email.
Best wishes,
Walter Daelemans
Call for Papers
*International Conference on CMC and Social Media Corpora for the Humanities*
14–15th September 2023, University of Mannheim, Germany
The 10th International Conference on CMC and Social Media Corpora for
the Humanities (CMC-Corpora) will be held at the University of
Mannheim, Germany in collaboration with the Leibniz Institute for the
German Language (IDS). Specialized corpora of the language of CMC and
social media are increasingly vital for the analysis of the
“unparalleled and rapidly evolving diversity in terms of speakers and
settings” in digital contexts, as well as of “language evolution seen
through the lens of user-generated content, which gives access to a
number of variants, socio- and idiolects” (Barbaresi 2019: 29-30).
The conference brings together language-centered research on CMC and
social media in linguistics, philologies, communication sciences,
media, and social sciences with research questions from the fields of
corpus and computational linguistics, language technology, text
technology, and machine learning. It features research in which
computational methods and tools are used for language-centered
empirical analysis of CMC and social media phenomena as well as
research on building, processing, annotating, representing, and
exploiting CMC and social media corpora, including their integration
in digital research infrastructures. We adhere to a wide definition of
CMC and Social Media, covering various media of digital communication,
including email, newsgroups, forums, chat and messenger applications
(e.g. WhatsApp), social networks (Facebook, Instagram), gaming
platforms, as well as interactions in the communication areas of video
portals (YouTube), learning platforms, gaming apps, online games and
virtual worlds.
We invite submissions on CMC-related topics, including but not limited to:
* Development of CMC corpora / social media corpora
* Building CMC corpora: from data collection to publication
* Open access data for CMC research: ethical and GDPR issues
* Annotating CMC data: genres, linguistic aspects, metadata
* Multimodal corpora
* Big data corpora
* Legal issues concerning the sampling, distribution and (long-term) archiving of social media data
* Analysis of CMC corpora / social media corpora
* Sociolinguistic studies of CMC
* Discourse analysis of CMC
* Linguistic characteristics of CMC
* Multimodal (incl. visual) aspects of CMC
* Multilingualism and code-switching in CMC
* CMC in language education
* Natural language processing (NLP) of CMC data / social media data
* Normalization
* PoS tagging
* Anonymisation and Pseudonymisation
* Lemmatization
* Syntactic parsing
* Semantic Annotation
=================
*Important Dates*
=================
* Abstract submission: Sunday, 30 April 2023, 23:59 CEST
* Notification of acceptance: Friday, 30 June 2023, 23:59 CEST
* Deadline revised abstract submission: Sunday, 6 August 2023, 23:59 CEST
* Deadline registration for participation: Sunday, 20 August 2023, 23:59 CEST
* Arrival, Get-together: Wednesday, 13 September 2023
* Conference: Thursday 14 - Friday 15 September 2023
============
*Submission*
============
We invite submissions for talks and for posters or software/corpus demonstrations on any topic relevant to the list of themes mentioned above. We invite two types of submissions:
* short papers (2-4 pages, following the existing template, i.e. between 800 and 1600 words) for oral presentations
* abstracts (max. 300 words) for poster presentations
Each paper and abstract will be double-blind peer reviewed by two or
three members of the scientific committee. Authors of accepted papers
can present their work at the conference (30 minute time slots: 20
minute talks, followed by 10 minutes of discussion). Authors of
accepted abstracts can present their work in progress, early-stage
research, software/corpus demonstrations during the poster session. At
the start of the conference, all accepted papers will be made
available in online proceedings. After the conference, speakers with
the best short papers will be invited to submit extended papers for a
special issue journal or a volume publication.
*Instructions for authors*
All submissions have to be written in English and have to be
anonymised. The short papers for oral presentations should not exceed
4 pages and the paper format should adhere to the template which you
can download from the links below. The abstracts for poster
presentations should not exceed 300 words, bibliographical references
not included. All contributions will be collected through the online
platform EasyChair at
https://easychair.org/conferences/?conf=cmc2023 (if you do not have
an EasyChair account, you need to create one first).
Template for MSWord (40 kB): https://www.uni-mannheim.de/media/Lehrstuehle/phil/deutsche_philologie/LS_G…
Template for LaTeX (260 kB): https://www.uni-mannheim.de/media/Lehrstuehle/phil/deutsche_philologie/LS_G…
For all enquiries, please contact the organizers at cmc-corpora2023(a)uni-mannheim.de
We look forward to seeing you there!
The organizing committee:
Jutta Bopp, Louis Cotgrove, Laura Herzberg, Harald Lüngen, Andreas Witt
Conference website: https://www.uni-mannheim.de/cmc-corpora2023/
======================
*Scientific Committee*
======================
(confirmed so far):
* Paul Baker (Lancaster University)
* Adrien Barbaresi (Berlin-Brandenburgische Akademie der Wissenschaften)
* Michael Beißwenger (University of Duisburg-Essen)
* Mario Cal-Varela (Universidade de Santiago de Compostela)
* Steven Coats (University of Oulu)
* Luna DeBruyne (Ghent University)
* Orphée DeClercq (Ghent University)
* Francisco-Javier Fernández-Polo (University of Santiago de Compostela)
* Jenny Frey (European Academy of Bozen)
* Alexandra Georgakopoulou-Nunes (King's College London)
* Klaus Geyer (University of Southern Denmark)
* Aivars Glaznieks (Eurac Research Bolzano)
* Claire Hardaker (Lancaster University)
* Iris Hendrickx (Radboud University Nijmegen)
* Axel Herold (Berlin-Brandenburgische Akademie der Wissenschaften)
* Lisa Hilte (University of Antwerp)
* Mai Hodac (Université Toulouse)
* Wolfgang Imo (University of Hamburg)
* Pawel Kamocki (IDS Mannheim)
* Erik-Tjong Kim-Sang (Netherlands eScience Center)
* Alexander Koenig (CLARIN ERIC)
* Florian Kunneman (Vrije Universiteit Amsterdam)
* Marc Kupietz (IDS Mannheim)
* Els Lefever (Ghent University)
* Julien Longhi (Cergy Paris Université)
* Maja Miličević-Petrović (University of Bologna)
* Nelleke Oostdijk (Radboud University)
* Celine Poudat (Université Côte d'Azur)
* Thomas Proisl (Friedrich-Alexander-Universität Erlangen-Nürnberg)
* Sebastian Reimann (Ruhr-Universität Bochum)
* Unn Røyneland (University of Oslo)
* Müge Satar (Newcastle University)
* Tatjana Scheffler (Ruhr-Universität Bochum)
* Stefania Spina (Università per Stranieri di Perugia)
* Egon Stemle (Eurac Research)
* Caroline Tagg (The Open University)
* Simone Ueberwasser (University of Zurich)
* Lieke Verheijen (Radboud University)
Dear all,
I would like to point you to two open positions (deadline for
application is February 28) at our newly founded Chair of Multilingual
Computational Linguistics at the University of Passau (Germany).
The first position is for an "assistant professor" (Akademischer Rat auf
Zeit, m/w/d) with broad interest in linguistic typology, comparative
linguistics, and computational linguistics. The position is for three
years, can be prolonged by three more years, and can be used to write a
habilitation thesis with me. A PhD (preferably in comparative linguistics
or computational linguistics) obtained before starting the position is
required.
https://www.uni-passau.de/fileadmin/dokumente/beschaeftigte/Stellenangebote…
The second position is for either an "assistant professor" (Akademischer
Rat auf Zeit, m/w/d) or a "research and teaching assistant" (Wiss.
Mitarbeiter, m/w/d), again for three years with possible extension by
three more years, devoted to the enhancement and extension of our work
on the standardization of cross-linguistic data. A PhD is required for the
"assistant professor" position, and a master's degree in computer science or
computational linguistics for the "research and teaching assistant" position.
For detailed requirements (web administration and Python), please see the detailed call.
https://www.uni-passau.de/fileadmin/dokumente/beschaeftigte/Stellenangebote…
Note that only the German versions of these calls are legally binding.
Please circulate these across all channels; we hope to receive strong
applications.
All the best,
Mattis
--
Prof. Dr. Johann-Mattis List
Chair of Multilingual Computational Linguistics
University of Passau
Dr.-Hans-Kapfinger-Str. 16
94032 Passau
Germany
Chair Website: https://phil.uni-passau.de/multilinguale-computerlinguistik/
Personal Website: https://lingulist.de
Telephone: +49(0)851/509-3480
*1 PhD-Position in Neural Language Generation*
We invite applications for one PhD student position in data-to-text
natural language generation in a low resource context. The goal of the
project is to develop methods that generalize well to settings where
little training data for the domain of interest is available.
The position, to be established in the group "Computer Science and
Computational Linguistics" (Prof. Vera Demberg)
<https://www.uni-saarland.de/lehrstuhl/demberg.html>, is part of the E2
project of the DFG-funded transregional collaborative research center on
perspicuous computing, CPEC
<https://www.perspicuous-computing.science/>. There will also be the
opportunity to closely collaborate with researchers working on the
DFG-funded Collaborative Research Center on Information Density and
Linguistic Encoding (SFB 1102) <http://www.sfb1102.uni-saarland.de/>.
Candidates for this position should have a master's degree in
computational linguistics, computer science or a related discipline.
Experience with machine learning including deep learning is expected;
background and previous experience in natural language processing is
also expected. The research will be conducted in English.
Dates: *application deadline: Feb 15, 2023*
start date: spring or summer 2023
The expected duration of the PhD is 3.5 years. The position is paid
according to 75% TV-L E13; see also
https://oeffentlicher-dienst.info/c/t/rechner/tv-l/west?id=tv-l-2020&matrix=12
The job does not come with any teaching obligation. You can however
choose to participate in teaching activities (tutoring or co-teaching).
Applicants are requested to submit their application, including a cover
letter that specifies why you would like to work on this topic and what
qualifies you for it, an academic CV, a list of academic publications,
your MSc thesis (or a current draft), copies of academic degree
certificates and names of two potential references. Please also include
a 2-page research proposal in your application which outlines how you
would approach the topic (choose one topic among multitask learning,
domain adaptation or connective generation for discourse coherence).
Saarland University <https://www.uni-saarland.de/en/home.html> is one of
the leading centres for computational linguistics and computer science
in Europe, and offers a dynamic and stimulating research environment. It
is famous for its interdisciplinary research in language, translation,
computation and cognition. The group is affiliated with both the
Department of Computer Science
<https://www.uni-saarland.de/fachrichtung/informatik.html> and the
Department of Language Science and Technology
<https://www.lst.uni-saarland.de/>.
The Department of Language Science and Technology organizes about 100
research staff in ten research groups in the fields of computational
linguistics, psycholinguistics, speech processing, and corpus linguistics.
Both departments are part of the Saarland Informatics Campus
<https://saarland-informatics-campus.de/en>, which brings together 800
researchers and 2000 students from 81 countries. We collaborate closely
with the university's Department of Computer Science, the Max Planck
Institute for Informatics <https://www.mpi-inf.mpg.de/home/>, the Max
Planck Institute for Software Systems <https://www.mpi-sws.org/>, and
the German Research Center for Artificial Intelligence
<https://www.dfki.de/en/web/> (DFKI).
Our researchers and students come from all over the world, and our
primary working language is English.
Saarland University is an equal opportunity employer. Applications of
women are strongly encouraged; applications of disabled persons will be
given preference over those of other candidates with equal
qualifications.
Applications should be sent via email directly to Prof. Vera Demberg
(vera(at)coli.uni-saarland.de), quoting opening number W2229.
Call for Participation - VarDial Evaluation Campaign 2023
Within the scope of the tenth VarDial workshop, co-located with EACL 2023, we are organizing an evaluation campaign on similar languages, varieties and dialects with three shared tasks. To participate and to receive the training data, please fill in the registration form on the workshop website:
https://sites.google.com/view/vardial-2023/shared-tasks
We are organizing the following tasks this year (please check the website for more information):
1. SID for low-resource language varieties (SID4LR)
This task is Slot and Intent Detection (SID) for low-resource language varieties. Slot detection is a span labeling task, intent detection a classification task. The test set will contain Swiss German (GSW), South Tyrolean (DE-ST), and Neapolitan (NAP). This shared task seeks to answer the following question: How can we best do zero-shot transfer to low-resource language varieties without standard orthography?
The training data consists of the xSID-0.4 corpus, containing data from Snips and Facebook. The original training data is in English, but we also provide automatic translations of the training data into German, Italian and other languages (the projected nmt-transfer data from van der Goot et al., 2021). Participants are allowed to use other data to train on, as long as it is not annotated for SID in the target languages.
Participants are not required to submit systems for both subtasks; it is also possible to participate in only one of the two, intent detection (classification) or slot detection (span labeling). The systems will be evaluated with span F1 for slots and accuracy for intents as the main evaluation metrics, as is standard for these tasks. Participants may also submit systems for a subset of the three target languages.
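For readers unfamiliar with the two metrics, the following is a minimal Python sketch of span F1 and intent accuracy. It assumes slots are encoded as BIO tags and counts a predicted span as correct only if its boundaries and label both match the gold span exactly. The function names and the example labels are hypothetical, and the official shared task scorer may differ in details.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect labelled spans from one sentence's BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        # The open span continues only on an I- tag with the same label.
        continues = tag.startswith("I-") and label is not None and tag[2:] == label
        if label is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        # Assumption: an I- tag with no open span is leniently treated as a start.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            start, label = i, tag[2:]
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exactly matching (boundaries + label) spans."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


def intent_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of utterances whose predicted intent equals the gold intent."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Hypothetical two-utterance example (labels are illustrative only).
gold_slots = [["O", "B-datetime", "I-datetime", "O"], ["B-artist", "O"]]
pred_slots = [["O", "B-datetime", "O", "O"], ["B-artist", "O"]]
print(span_f1(gold_slots, pred_slots))
print(intent_accuracy(["set_alarm", "play_music"], ["set_alarm", "add_to_playlist"]))
```

In this example the first predicted span is too short, so it counts as both a false positive and a false negative; partial overlap earns no credit under exact matching.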
2. Discriminating Between Similar Languages - True Labels (DSL-TL)
Discriminating between similar languages (e.g., Croatian and Serbian) and language varieties (e.g., Brazilian and European Portuguese) has been a popular topic at VarDial since its first edition. The DSL shared tasks organized in 2014, 2015, 2016, and 2017 have addressed this issue by providing participants with the DSL Corpus Collection (DSLCC), a collection of journalistic texts containing texts written in multiple similar languages and language varieties. The DSLCC was compiled under the assumption that each instance's gold label is determined by where the text is retrieved from. While this is a straightforward (and mostly accurate) practical assumption, previous research has shown the limitations of this problem formulation as some texts may present no linguistic marker that allows systems or native speakers to discriminate between two very similar languages or language varieties.
We tackle this important limitation by introducing the DSL True Labels (DSL-TL) task. DSL-TL will provide participants with a human-annotated DSL dataset. A subset of nearly 13,000 sentences was retrieved from the DSLCC and annotated by multiple native speakers of the included languages and varieties, namely English (American and British), Portuguese (Brazilian and European), and Spanish (Argentinian and Peninsular). To the best of our knowledge, this is the first dataset of its kind, opening exciting new avenues for language identification research.
3. Discriminating Between Similar Languages - Speech (DSL-S)
In the DSL-S 2023 shared task, participants use training and development sets from Mozilla Common Voice (CV) to develop a language identifier for speech. The nine languages selected for the task come from four different subgroups of the Indo-European or Uralic language families. The test data used in this task is the Common Voice test data for the nine languages. The participants are asked not to evaluate their systems themselves nor in any other way investigate the test data before the shared task results have been published. The total amount of unpacked speech data is around 15 gigabytes. Only the .mp3 files from the test set may be used when generating the results. The metadata concerning the test audio files, including their transcriptions, must not be used. This task is audio only.
The 9-way classification task is divided into two separate tracks. In the closed track, only the training and development data in the Common Voice dataset are allowed, and no other data may be used. This prohibition includes systems and models trained (unsupervised or supervised) on any other data. In the open track, the use of any openly available (available to any possible shared task participant) datasets and models not including or trained on the Mozilla Common Voice test set is allowed.
Dates
Training set release: January 23, 2023
Test set release: February 6, 2023
Submissions due: February 17, 2023
Paper submission deadline: February 27, 2023
Notification of acceptance: March 13, 2023
Camera-ready papers due: March 27, 2023
Of course, VarDial also accepts research papers focusing on computational methods and language resources for closely related languages, language varieties, and dialects. The full call for papers can be found here:
https://sites.google.com/view/vardial-2023/call-for-papers
Contact: yves.scherrer(a)helsinki.fi or tommi.jauhiainen(a)helsinki.fi
*The Second Ukrainian Natural Language Processing Workshop (UNLP 2023)*
<https://unlp.org.ua/>
*Update:* the submission link can be found at
https://softconf.com/eacl2023/UNLP/.
*Call For Papers*
UNLP 2023 <https://unlp.org.ua/call-for-papers/> will be held online in
conjunction with the EACL 2023 conference in May 2023.
The workshop will bring together academics, researchers, and practitioners
in the fields of Natural Language Processing and Computational Linguistics
who work with the Ukrainian language or do cross-Slavic research that can
be applied to the Ukrainian language.
The workshop will facilitate developments in the processing of the Ukrainian
language, as well as provide a platform for discussion and sharing of
ideas, encourage collaboration between different research groups, and
improve the visibility of the Ukrainian research community.
Topics of interest lie in the area of Ukrainian NLP and Computational
Linguistics and include, but are not limited to, the following tasks:
- morphosyntactic tagging,
- named-entity recognition,
- syntactic and semantic parsing,
- coreference resolution,
- information extraction and text mining,
- automated question answering and information retrieval,
- language modelling and natural language generation,
- grammatical error correction,
- text summarization,
- machine translation,
- sentiment analysis,
- argument mining,
- disinformation detection and fact verification,
- development of language resources and evaluation methods,
- speech recognition and generation,
- knowledge representation and computational pragmatics,
- computational semantics,
- computational methods for phonology,
- cross-Slavic models,
- Ukrainian NLP in interaction with other artificial intelligence
technologies.
*Shared Task*
The Second UNLP features the first *Shared Task in Grammatical Error
Correction for Ukrainian*. The Shared Task focuses on correction of
grammatical errors and disfluencies, and we see this shared task as an
opportunity to facilitate research of GEC for Slavic languages.
You can find more details on the web page of the Shared Task
<https://unlp.org.ua/shared-task/>.
*Important dates*
December 22, 2022 — First call for workshop papers
January 9, 2023 — Second call for workshop papers
February 13, 2023 — Workshop paper due
March 13, 2023 — Notification of acceptance
March 27, 2023 — Camera-ready papers due
May 5 or 6, 2023 — Workshop dates
*Keynote speakers*
Mona Diab <https://www.linkedin.com/in/mona-diab-55946614/>, The George
Washington University, US
Gulnara Muratova <https://www.linkedin.com/in/gulnara-muratova-0206/>,
QIRI`M YOUNG, Ukraine
*Submissions*
The workshop will provide Grammarly Premium to all authors. To request
Grammarly Premium, please submit the form on the website
<https://unlp.org.ua/>.
UNLP invites submissions of completed and ongoing projects. Submissions
describing resources or solutions that have been made available to the
wider public are strongly encouraged. The workshop will also accept papers
with negative results.
We invite two types of submissions: long and short papers. Long papers
should describe original, unpublished and completed work. The short papers
may describe work in progress, small focused contributions, system
demonstrations, new linguistic resources, or experiments based on existing
software and resources.
All submissions will be judged on correctness, novelty, technical strength,
clarity of presentation, usability, and significance/relevance to the
Workshop. Every submission will be reviewed by at least three members of
the Program Committee.
Paper review will be blind. The papers must not include the authors’ names
and affiliations. Self-citations and other references that reveal the
authors’ identity must be avoided.
Long papers should follow the two-column format of EACL 2023 proceedings
not exceeding eight (8) pages of content plus two (2) pages for references.
Short paper submissions should follow the same format, and should not
exceed five (5) pages for content plus two (2) pages for references.
All submissions must conform to the official style guidelines of EACL 2023
<https://unlp.org.ua/call-for-papers/#:~:text=style%20guidelines%20of%20EACL…>
contained in the style files and must be in PDF. Camera-ready versions of
accepted papers must be provided both in LaTeX and PDF format.
*Workshop Organizers*
Andrii Hlybovets, National University of Kyiv-Mohyla Academy, Ukraine
Oleksii Ignatenko, Ukrainian Catholic University, Ukraine
Oleksii Molchanovskii, Ukrainian Catholic University, Ukraine
Mariana Romanyshyn, Grammarly, Ukraine
Oleksii Syvokon, Microsoft, Ukraine
*Program Committee*
Andrii Babii, Kharkiv National University of Radio Electronics, Ukraine
Andrii Liubonko, Grammarly, Ukraine
Anna Rogers, University of Copenhagen, Denmark
Artem Chernodub, Grammarly, Ukraine
Bogdan Babych, Heidelberg University, Germany
Bogdana Oliynyk, National University of Kyiv-Mohyla Academy, Ukraine
Bohdan Kolchygin, Shelf, Ukraine
Dmytro Karamshuk, Meta, UK
Dmytro Sytnyk, Institute of Mathematics NAS, Ukraine
Galyna Kriukova, National University of Kyiv-Mohyla Academy, Ukraine
Igor Samokhin, Grammarly, Ukraine
Iuliia Makogon, Semantrum, Ukraine
Julia Rogushina, Institute of Software Systems NAS, Ukraine
Kostiantyn Omelianchuk, Grammarly, Ukraine
Maksym Tarnavskyi, Shelf, Poland
Mariana Romanyshyn, Grammarly, Ukraine
Natalia Grabar, CNRS, Université de Lille, France
Natalia Kocyba, Samsung Research Poland, Poland
Nataliia Cheilytko, Friedrich Schiller University Jena, Germany
Oleksandr Marchenko, Taras Shevchenko National University of Kyiv, Ukraine
Oleksandr Skurzhanskyi, Grammarly, Ukraine
Oleksii Turuta, Kharkiv National University of Radio Electronics, Ukraine
Olena Siruk, Bulgarian Academy of Sciences, Bulgaria
Olga Kanishcheva, Friedrich Schiller University Jena, Germany
Ruslan Chorney, National University of Kyiv-Mohyla Academy, Ukraine
Serhii Havrylov, University of Edinburgh, UK
Svitlana Galeshchuk, Université Paris Dauphine, BNP Paribas, France
Taras Lehinevych, Amazon, Ireland
Taras Shevchenko, Proxet (Giphy project), Ukraine
Tatjana Scheffler, Ruhr-University Bochum, Germany
Thierry Hamon, Université Paris-Saclay, CNRS, LIMSI & Université Sorbonne,
France
Veronika Solopova, FU Berlin, Germany
Volodymyr Taranukha, Taras Shevchenko National University of Kyiv, Ukraine
Vsevolod Dyomkin, Projector, Ukraine
Yevhen Kupriianov, National Technical University “Kharkiv Polytechnic
Institute”, Ukraine
*Contact*
Email: info(a)unlp.org.ua.
Website: https://unlp.org.ua/.
Twitter: https://twitter.com/UNLP_workshop.
Telegram: https://t.me/UNLP_workshop.
Second call for papers
Fourth workshop on Resources for African Indigenous Languages (RAIL)
https://bit.ly/rail2023
The 4th RAIL (Resources for African Indigenous* Languages) workshop
will be co-located with EACL 2023 in Dubrovnik, Croatia. The Resources
for African Indigenous Languages (RAIL) workshop is an
interdisciplinary platform for researchers working on resources (data
collections, tools, etc.) specifically targeted towards African
indigenous languages. In particular, it aims to create the conditions
for the emergence of a scientific community of practice that focuses on
data, as well as computational linguistic tools specifically designed
for or applied to indigenous languages found in Africa.
Previous workshops showed that the presented problems (and solutions)
are not only applicable to African languages. Many issues are also
relevant to other low-resource languages, such as different scripts and
properties like tone. As such, these languages share similar
challenges. This allows for researchers working on these languages with
such properties (including non-African languages) to learn from each
other, especially on issues pertaining to language resource
development.
The RAIL workshop has several aims. First, it brings together
researchers working on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Impact of impairments on language resources” as its
theme, but submissions on any topic related to properties of African
indigenous languages (including non-African languages) may be accepted.
Suggested topics include (but are not limited to) the following:
- Digital representations of linguistic structures
- Descriptions of corpora or other data sets of African indigenous languages
- Building resources for (under-resourced) African indigenous languages
- Developing and using African indigenous languages in the digital age
- Effectiveness of digital technologies for the development of African indigenous languages
- Revealing unknown or unpublished existing resources for African indigenous languages
- Developing desired resources for African indigenous languages
- Improving quality, availability and accessibility of African indigenous language resources
*: The term indigenous languages used in the RAIL workshop is intended
to refer to non-colonial languages (in this case those used in Africa).
In no way is this term used to cause any harm or discomfort to anyone.
Many of these languages were or are still marginalised, and the aim of
the workshop is to bring attention to the creation, curation, and
development of resources for these languages in Africa.
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content plus additional pages of references. The
final camera-ready versions of accepted long papers are allowed one
additional page of content (so up to 9 pages) so that reviewers’
feedback can be incorporated.
Submissions need to use the EACL stylesheets. These can be found at
https://2023.eacl.org/calls/styles. Submission is electronic in PDF
through the START system (link will be provided once available).
Reviewing is double-blind, so make sure to anonymize your submission
(e.g., do not provide author names, affiliations, project names, etc.).
Limit the number of self-citations (anonymized citations should not be
used). Accepted papers will be published in the ACL workshop
proceedings.
Please make sure you also go through the responsible NLP checklist
(https://aclrollingreview.org/responsibleNLPresearch/). Also,
submissions should have a section titled “Limitations” (as described in
the stylesheets). Authors are also encouraged to include an explicit
ethics statement.
Important dates:
Submission deadline: 13 February 2023
Date of notification: 13 March 2023
Camera-ready deadline: 27 March 2023
RAIL workshop: 5 or 6 May 2023
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Don Mthobela, Cam Foundation
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
*apologies for cross-postings*
CODI, 4th Workshop on Computational Approaches to Discourse
2023-07-13–14 - ACL 2023 - Toronto, Canada
** Submission deadline: April 24th, 2023 **
Aims and scope
The last ten years have seen a dramatic improvement in the ability of NLP systems to understand and produce words and sentences. This development has created a renewed interest in discourse phenomena as researchers move towards the processing of long-form text and conversations. There is a surge of activity in discourse parsing, coherence models, text summarization, corpora for discourse level reading comprehension, and discourse related/aided representation learning, to name a few, but the problems in computational approaches to discourse are still substantial. At this juncture, we have organized three Workshops on Computational Approaches to Discourse (CODI) at EMNLP 2020, EMNLP 2021 and COLING 2022 to bring together discourse experts and upcoming researchers. These workshops have catalyzed work to improve the speed and knowledge needed to solve such problems and have served as a forum for the discussion of suitable datasets and reliable evaluation methods.
The previous workshops on discourse in machine translation (DiscoMT), linking lexical, sentential and discourse semantics (LSDSem), discourse structure in natural language generation (DSNNLG), discourse relation parsing and treebanking (DISRPT) and coreference (CORBON/CRAC), have shown that there is considerable interest and success in bringing together the community working on specific problems in discourse. We believe that the discourse community will also benefit from a general forum where work ranging from corpus development/analysis to computational models, and evaluation is discussed, and desiderata can be drawn for future progress.
The 4th CODI workshop is planned as a two-day event that brings together different subcommunities. It will feature invited talks and regular papers on the first day. The second day will be dedicated to shared tasks and special sessions which focus on the issues mentioned above. After successful iterations in 2019 and 2021, the shared task on Discourse Relation Parsing and Treebanking (DISRPT) will be held again in 2023, with three tasks: discourse segmentation, discourse connective identification and discourse relation classification, including new datasets and languages. For more information on the shared task see:
https://sites.google.com/view/disrpt2023/
Topics of interest
We welcome symbolic and probabilistic approaches, corpus development and analysis, as well as machine and deep learning approaches to discourse. We appreciate theoretical contributions as well as practical applications, including demos of systems and tools. The goal of the workshop is to provide a forum for the community of NLP researchers working on all aspects of discourse.
Topics of interest include, but are not limited to:
* discourse structure
* discourse connectives
* discourse relations
* annotation tools and schemes for discourse phenomena
* corpora annotated with discourse phenomena
* discourse parsing
* cross-lingual discourse processing
* cross-domain discourse processing
* anaphora and coreference resolution
* event coreference
* argument mining
* coherence modeling
* discourse and semantics
* discourse in applications such as machine translation, summarization, etc.
* evaluation methodology for discourse processing
Submissions
We solicit four categories of papers: regular workshop papers, demos, shared task papers and extended abstracts. Only regular workshop papers, shared task papers and demos will be included in the proceedings as archival publications.
Regular papers must describe original unpublished research. Long papers may consist of up to 8 pages of content, plus unlimited pages for references.
Short papers can be up to 4 pages, plus unlimited pages for references.
Demo submissions may describe systems, tools, visualizations, etc., and may consist of up to 4 pages, plus unlimited pages for references.
Each submission can contain unlimited pages of appendices, but the paper itself must remain fully self-contained: these supplementary materials are optional, and reviewers are not asked to review them.
Accepted long, short, and demo papers will be presented orally.
Extended abstracts can describe work in progress or work already published elsewhere. These may be two pages long (without references). Extended abstracts are non-archival. They will be presented orally, and included in the workshop program and handbook, but will not appear in the workshop proceedings.
Double submission of papers is allowed but will need to be indicated at submission.
Submission website
All submissions must be anonymous and follow the ACL 2023 formatting instructions described here: https://2023.aclweb.org/calls/style_and_formatting/
Please submit your workshop papers at https://www.softconf.com/acl2023/CODI2023
Shared task papers should be submitted to the links specified on the shared task pages.
Important dates
* 2023-04-24: CODI papers due
* 2023-05-22: Notification of acceptance
* 2023-06-06: Camera ready deadline for main conference and CODI
* 2023-07-13 – 2023-07-14: CODI workshop
All deadlines are 11:59 pm UTC-12 ("anywhere on Earth").
Invited Speakers
* Yufang Hou, IBM Research
* Giuseppe Carenini, University of British Columbia
Organizers
* Chloé Braud, CNRS-IRIT
* Christian Hardmeier, IT University of Copenhagen and Uppsala University
* Jessy Li, University of Texas, Austin
* Sharid Loáiciga, University of Gothenburg
* Michael Strube, Heidelberg Institute for Theoretical Studies
* Amir Zeldes, Georgetown University
To contact the organizers, please send an email to: codi-workshop@googlegroups.com
The Department of Computer Science (DCC) at the University of Chile
<https://www.dcc.uchile.cl> is offering 2 full-time permanent positions to
carry out research and teaching at both undergraduate and graduate level.
As assistant professors in tenure-track positions, candidates are expected
to develop a strong and competitive research program. *Details and
application link (in Spanish) can be found at:
https://concurso-academico.uchile.cl/ (look for code FM2202). The
application deadline is March 1st, 2023.*
*Candidates*
Candidates should be pursuing internationally recognized research in the
following areas (one position per area is offered):
- Theoretical Computer Science
- Artificial Intelligence
- Software Engineering
- Computer Security and Privacy
- Networking
Successful candidates are expected to complement the existing strengths of
the Department and its current activities. Candidates must hold a Ph.D.
degree and must have demonstrated excellence in research and scholarship
within the field of interest. Successful candidates will be expected to:
• move to Chile (if currently living elsewhere),
• establish and lead their own research team,
• develop innovative research at the highest international standards,
• deliver high quality teaching,
• supervise undergraduate, master and PhD students,
• obtain competitive research funding from external funding bodies,
• write and speak English fluently, and
• develop a good command of Spanish within 1 year of appointment.
*The Department of Computer Science (DCC)*
The DCC is the leading Computer Science Department in Chile, and one of the
most prominent in Latin America. It pioneered the use of Unix and the
adoption of the Internet in Chile. Today it has 21 full-time professors,
of whom 8 come from Argentina, Uruguay, France, Ireland, and Peru. They carry
out research at the highest level, publishing around 100 articles
per year in international conferences and journals. The department is
responsible for teaching Computer Science, with about 600 undergraduate, 80
MSc and 40 PhD students majoring in Computer Science.
For further information, please contact Prof. Gonzalo Navarro
(gnavarro(a)dcc.uchile.cl).
*CENIA and IMFD*
Many academics at DCC form part of two well-funded research centers in Data
Science and Artificial Intelligence: the National Center for Artificial
Intelligence Research (CENIA) (Funded by Programa Centros Basales, ANID,
Chile) and the Millennium Institute for Foundational Research on Data
(IMFD) (Funded by Programa Milenio, ANID, Chile). Joining DCC will provide
the possibility to cooperate with these centers.
https://comunicaciones.dcc.uchile.cl/news/650-llamado-a-concurso-academico-…