*The Second Ukrainian Natural Language Processing Workshop (UNLP 2023)*
<https://unlp.org.ua/>
*Update:* Workshop papers are now due February 20, 2023
*Call For Papers*
UNLP 2023 <https://unlp.org.ua/call-for-papers/> will be held online in
conjunction with the EACL 2023 conference on May 5 or 6, 2023.
The workshop will bring together academics, researchers, and practitioners
in the fields of Natural Language Processing and Computational Linguistics
who work with the Ukrainian language or do cross-Slavic research that can
be applied to the Ukrainian language.
The workshop will facilitate developments in the processing of
the Ukrainian language, as well as provide a platform for discussion and
sharing of ideas, encourage collaboration between different research
groups, and improve the visibility of the Ukrainian research community.
Topics of interest lie in the area of Ukrainian NLP and Computational
Linguistics and include, but are not limited to, the following tasks:
- morphosyntactic tagging,
- named-entity recognition,
- syntactic and semantic parsing,
- coreference resolution,
- information extraction and text mining,
- automated question answering and information retrieval,
- language modelling and natural language generation,
- grammatical error correction,
- text summarization,
- machine translation,
- sentiment analysis,
- argument mining,
- disinformation detection and fact verification,
- development of language resources and evaluation methods,
- speech recognition and generation,
- knowledge representation and computational pragmatics,
- computational semantics,
- computational methods for phonology,
- cross-Slavic models,
- Ukrainian NLP in interaction with other artificial intelligence
technologies.
*Shared Task*
The Second UNLP features the first *Shared Task in Grammatical Error
Correction for Ukrainian*. The Shared Task focuses on the correction of
grammatical errors and disfluencies, and we see this shared task as an
opportunity to facilitate research on GEC for Slavic languages.
February 12, 2023 — Deadline for registration for the shared task
You can find more details on the web page of the Shared Task
<https://unlp.org.ua/shared-task/>.
*Important dates*
December 22, 2022: First call for workshop papers
January 9, 2023: Second call for workshop papers
February 20, 2023: Workshop paper due (extended)
March 13, 2023: Notification of acceptance
March 27, 2023: Camera-ready papers due
May 5 or 6, 2023: Workshop dates
*Keynote speakers*
Mona Diab <https://www.linkedin.com/in/mona-diab-55946614/>, The George
Washington University, US
Gulnara Muratova <https://www.linkedin.com/in/gulnara-muratova-0206/>,
QIRI'M YOUNG, Ukraine
*Submissions*
The workshop will provide Grammarly Premium to all authors. To request
Grammarly Premium, please submit the form on the website
<https://unlp.org.ua/>.
UNLP invites submissions of completed and ongoing projects. Submissions
describing resources or solutions that have been made available to the
wider public are strongly encouraged. The workshop will also accept papers
with negative results.
We invite two types of submissions: long and short papers. Long papers
should describe original, unpublished and completed work. The short papers
may describe work in progress, small focused contributions, system
demonstrations, new linguistic resources, or experiments based on existing
software and resources.
All submissions will be judged on correctness, novelty, technical strength,
clarity of presentation, usability, and significance/relevance to the
Workshop. Every submission will be reviewed by at least three members of
the Program Committee.
Paper review will be blind. The papers must not include the authors’ names
and affiliations. Self-citations and other references that reveal the
authors’ identity must be avoided.
Long papers should follow the two-column format of EACL 2023 proceedings
not exceeding eight (8) pages of content plus two (2) pages for references.
Short paper submissions should follow the same format, and should not
exceed five (5) pages for content plus two (2) pages for references.
All submissions must conform to the official style guidelines of EACL 2023
<https://unlp.org.ua/call-for-papers/#:~:text=style%20guidelines%20of%20EACL…>
contained in the style files and must be in PDF. Camera-ready versions of
accepted papers must be provided both in LaTeX and PDF format.
*Workshop Organizers*
Andrii Hlybovets, National University of Kyiv-Mohyla Academy, Ukraine
Oleksii Ignatenko, Ukrainian Catholic University, Ukraine
Oleksii Molchanovskii, Ukrainian Catholic University, Ukraine
Mariana Romanyshyn, Grammarly, Ukraine
Oleksii Syvokon, Microsoft, Ukraine
*Program Committee*
Andrii Babii, Kharkiv National University of Radio Electronics, Ukraine
Andrii Liubonko, Grammarly, Ukraine
Anna Rogers, University of Copenhagen, Denmark
Artem Chernodub, Grammarly, Ukraine
Bogdan Babych, Heidelberg University, Germany
Bogdana Oliynyk, National University of Kyiv-Mohyla Academy, Ukraine
Bohdan Kolchygin, Shelf, Ukraine
Dmytro Karamshuk, Meta, UK
Dmytro Sytnyk, Institute of Mathematics NAS, Ukraine
Galyna Kriukova, National University of Kyiv-Mohyla Academy, Ukraine
Igor Samokhin, Grammarly, Ukraine
Iuliia Makogon, Semantrum, Ukraine
Julia Rogushina, Institute of Software Systems NAS, Ukraine
Kostiantyn Omelianchuk, Grammarly, Ukraine
Maksym Tarnavskyi, Shelf, Poland
Mariana Romanyshyn, Grammarly, Ukraine
Natalia Grabar, CNRS, Université de Lille, France
Natalia Kocyba, Samsung Research Poland, Poland
Nataliia Cheilytko, Friedrich Schiller University Jena, Germany
Oleksandr Marchenko, Taras Shevchenko National University of Kyiv, Ukraine
Oleksandr Skurzhanskyi, Grammarly, Ukraine
Oleksii Turuta, Kharkiv National University of Radio Electronics, Ukraine
Olena Siruk, Bulgarian Academy of Sciences, Bulgaria
Olga Kanishcheva, Friedrich Schiller University Jena, Germany
Ruslan Chorney, National University of Kyiv-Mohyla Academy, Ukraine
Serhii Havrylov, University of Edinburgh, UK
Svitlana Galeshchuk, Université Paris Dauphine, BNP Paribas, France
Taras Lehinevych, Amazon, Ireland
Taras Shevchenko, Proxet (Giphy project), Ukraine
Tatjana Scheffler, Ruhr-University Bochum, Germany
Thierry Hamon, Université Paris-Saclay, CNRS, LIMSI & Université Sorbonne,
France
Veronika Solopova, FU Berlin, Germany
Volodymyr Taranukha, Taras Shevchenko National University of Kyiv, Ukraine
Vsevolod Dyomkin, Projector, Ukraine
Yevhen Kupriianov, National Technical University “Kharkiv Polytechnic
Institute”, Ukraine
*Contact*
Email: info(a)unlp.org.ua.
Website: https://unlp.org.ua/.
Twitter: https://twitter.com/UNLP_workshop.
Telegram: https://t.me/UNLP_workshop.
We invite you to participate in the 2023 edition of the CheckThat! Lab at CLEF 2023. This year, we feature five tasks (one follow-up and four new) that correspond to important components within and around the full fact-checking pipeline in multiple languages:
Task 1 Check-worthiness in tweets: This is the sixth round of the check-worthiness task. It allows us to reduce the workload of monitoring social media for tweets and claims that would require the attention of a journalist. We offer two task modalities (multimodal and text-only) across three subtasks:
Subtask 1A: Multimodal tweets including text and picture (for the first time!). Available in Arabic and English.
Subtask 1B: Unimodal tweets and claims. Available in Arabic, English and Spanish.
Subtask 1C: US political debates, text only. Available in English.
Task 2 Subjectivity in news articles: Distinguish whether a sentence from a news article expresses the subjective view of the author behind it or presents an objective view on the covered topic instead. Available in Arabic, Dutch, English, Italian, German, and Turkish.
Task 3 Political bias of news articles and news media. Detect political bias of news reporting at the article and at the media level. It includes two subtasks:
Subtask 3A: Given an article, classify its political leaning as left, center or right.
Subtask 3B: Given the URL to a news outlet (e.g., www.cnn.com), predict the overall political bias of that news outlet as left, center or right leaning.
Available in English.
Task 4 Factuality of reporting of news media. Identify the factuality of reporting at the media level. Given the URL of a news outlet, the task is to predict the factuality of that outlet's reporting: low, mixed, or high. Available in English.
Task 5 Authority finding on Twitter. Given a tweet stating a rumor, a model has to retrieve a ranked list of authority Twitter accounts that can help verify the rumor, i.e., accounts that may tweet evidence that supports or denies the rumor. Available in Arabic.
Further information: https://checkthat.gitlab.io/
Datasets: https://gitlab.com/checkthat_lab/clef2023-checkthat-lab
Register and participate: https://clef2023-labs-registration.dei.unipd.it/registrationForm.php
Important Dates
---------------------
November 2022: Lab registration opens
December 2022: Release of the training materials
April 2023: Lab registration closes
May 2023: Beginning of the evaluation cycle
May 2023: End of the evaluation cycle (run submission)
May 2023: Deadline for the submission of working notes
June 2023: Notification of acceptance of working notes
July 2023: Deadline for submission of camera-ready working notes
July 2023: Preview of working notes
18-21 September: CLEF 2023 Conference in Thessaloniki, Greece
Best
The CLEF-2023 CheckThat! Lab Shared Task Organizers
[apologies for cross-posting]
SEPLN 2023: 39th INTERNATIONAL CONFERENCE OF THE SPANISH SOCIETY FOR NATURAL LANGUAGE PROCESSING
Jaén, Spain
September 27-29, 2023
http://sepln2023.sepln.org/en/home/
The Spanish Society for Natural Language Processing (SEPLN <http://www.sepln.org/>) is pleased to invite you to participate in the 39th edition of the SEPLN Conference. The SEPLN Conference will take place on 27-29 September 2023 in Jaén (Spain), at the Museo Íbero of Jaén, where the participants will discover the history of the Iberians.
The main aim of the SEPLN 2023 Conference is to provide both the scientific community and industry with a forum where the latest research and developments in the field of NLP can be presented and shared. The SEPLN 2023 Conference also offers the opportunity to present real NLP applications and R&D projects. Finally, the conference is intended as a forum for helping new professionals become active members of this field.
Topics of interest
Topics related to NLP, including but not limited to:
Linguistic, mathematical and psycholinguistic models of language.
Machine learning in NLP.
Computational lexicography and terminology.
Corpus linguistics.
Development of linguistic resources and tools.
Morphological and syntactic analysis.
Semantics, pragmatics and discourse.
Word sense disambiguation.
Monolingual and multilingual text generation.
Machine translation.
Knowledge and common sense.
Multimodality.
Spoken language processing.
Dialogue systems and interactive systems / Conversational assistants.
Multimedia indexing and retrieval.
Monolingual and multilingual information extraction and retrieval.
Question answering systems.
Evaluation of NLP systems.
Automatic textual content analysis.
Sentiment analysis and argument mining.
Plagiarism detection.
Negation and speculation processing.
Text mining in social media.
Text summarization.
Text simplification.
NLP in the biomedical domain.
NLP-based generation of teaching resources.
NLP for languages with limited resources.
NLP industrial applications.
Low-resource NLP tasks, data augmentation.
Ethics and NLP.
Interpretability and Analysis of Models for NLP.
Structure of the Conference
The SEPLN 2023 Conference will be a three-day event and will include sessions to present papers, ongoing research projects and prototype or product demos related to the topics of the conference. In addition, the Workshop Day will take place on September 26th, with IberLEF 2023 as the main workshop.
Paper types and author guidelines
The SEPLN 2023 Conference will accept three kinds of papers: (1) scientific contributions, (2) research project summaries and (3) system demonstration papers.
Scientific contributions. The accepted scientific contributions will be published in the Journal Procesamiento del Lenguaje Natural, whose aim is to promote the development of areas related to NLP, disseminate research carried out, identify future guidelines for basic research, and present software applications in this field. The scientific quality of the Journal is supported by the 2021 JCR index (JCI: 0.21, Q4-Linguistics - ESCI), the SCImago Journal Ranking (SJR: 0.217, Q4-Computer Science Applications, Q2-Linguistics and Language), the Scopus Index (CiteScore: 1.5, Q4-Computer Science Applications, Q2-Linguistics and Language) among others. More information at http://www.sepln.org/en/journal/quality.
The papers can be written in Spanish or English and must be at most 10 A4-size pages of content, plus unlimited pages for references. The papers must meet the following requirements:
The title of the communication (in English and Spanish).
The paper must be anonymized, since the Journal follows a double-blind review process.
An abstract with a maximum of 150 words (in English and Spanish).
A list of keywords or related topics (in English and Spanish).
The documents must not include headers or footers.
Information about the paper format and the LaTeX and Microsoft Word templates is available at: http://www.sepln.org/en/journal/author-guidelines.
Camera ready - the final version of the paper should be submitted together with a cover letter explaining how the suggestions of the reviewers were implemented in the final version. This cover letter will be considered in order to accept or finally reject the selected paper.
Preprint policy - The Journal allows the publication of preprints (non-refereed paper posted online, such as ArXiv) anytime, but during the review period the preprint must indicate that the paper is “under review” in the Journal Procesamiento del Lenguaje Natural. Likewise, if the paper is accepted, the preprint must be updated with the DOI, name of the Journal and the bibliographic information of the paper.
Research project summaries. These are summaries of ongoing research projects. Papers of this kind must include the following information:
Project title.
Author name, affiliation and contact information. The review of this kind of paper is not blind.
Funding institutions.
Research Groups participating in the project.
Language: English. We will not accept research project summaries in Spanish or other languages.
An abstract of a maximum of 150 words and a list of keywords.
Minimum length: 5 pages.
Maximum length: 6 pages (including references).
In the submission platform you have to choose “Projects and Demos” as main topic.
System demonstration papers. These papers must be related to NLP applications, and they must describe the technical details and the NLP components used or developed. The paper must be written in English, the minimum length of the paper must be 5 A4-size pages and the maximum length is 6 A4-size pages of content with the references included.
In the submission platform you have to choose “Projects and Demos” as main topic.
The research project summaries and the system demonstration papers will be published on the CEUR Workshop Proceedings platform, which is widely known in the computer science research community. Accordingly, the paper format must match the CEUR template. We have adapted the CEUR LaTeX template to SEPLN 2023; it can be downloaded from the conference website.
Submission Information. The papers must be submitted by March 19th, 2023. All submissions must be in PDF format and submitted electronically using the MyReview system available at: http://myreview.sepln.org/myreview-sepln71.
Submitted papers will be subjected to a blind review by at least three members of the SEPLN advisory council.
Important dates
Deadline for the submission of papers, projects and demos: March 19th, 2023.
Notification of acceptance: May 16th, 2023.
Camera Ready: May 31st, 2023.
Workshops: September 26th, 2023.
Conference: September 27th-29th, 2023.
Organizing Committee
L. Alfonso Ureña López (Chairman) University of Jaén (Spain).
M. Teresa Martín Valdivia (Chairwoman) University of Jaén (Spain).
Eugenio Martínez Cámara (Coordinator) University of Granada (Spain).
M. Carlos Díaz Galiano University of Jaén (Spain).
Miguel Ángel García Cumbreras University of Jaén (Spain).
Manuel García Vega University of Jaén (Spain).
Salud María Jiménez Zafra University of Jaén (Spain).
Fernando Martínez Santiago University of Jaén (Spain).
M. Dolores Molina González University of Jaén (Spain).
Arturo Montejo Ráez University of Jaén (Spain).
Flor Miriam Plaza del Arco University of Jaén (Spain).
Collaborators
Alba María Mármol Romero University of Jaén (Spain).
Estrella Vallecillo Rodríguez University of Jaén (Spain).
Mariia Chizhikova University of Jaén (Spain).
Alberto Gutierrez Mejías University of Jaén (Spain).
Jaime Collado University of Jaén (Spain).
Contact
All information related to the conference can be found at http://sepln2023.sepln.org/
For all general enquiries, please contact: sepln2023jaen(a)googlegroups.com.
---
Eugenio Martínez Cámara
Profesor Ayudante Doctor | Junior Lecturer
DaSCI, Instituto Andaluz de Inteligencia Artificial | DaSCI, Andalusian Institute in Artificial Intelligence.
Dpto. Ciencias de la Computación e Inteligencia Artificial | Computer Science and Artificial Intelligence department.
Universidad de Granada
Dear Colleagues
My name is CK Jung and I’m Director of the Institute for Corpus Research at
Incheon National University. I will be President of the Korea Association
of Secondary English Education (KASEE) from 1st March 2023 and I would like
to introduce you to the 2023 Joint International Conference on English
Language Teaching in Korea.
The conference takes place at Konkuk University, Seoul, South Korea from 6
to 8 July and our plenary speakers are:
- Tony McEnery (Lancaster University, UK)
- Joan Kelly Hall (Penn. State University, USA)
- Julio C. Rodriguez (University of Hawai‘i at Mānoa, USA)
- Kazuya Saito (University College London, UK)
- Yuko Goto Butler (University of Pennsylvania, USA)
- Youngju Yi (The Ohio State University, USA)
If you’re interested in presenting a paper or poster (we welcome all
research areas), please visit the following website and fill out the online
registration form by 13 February (we only need your presentation title at
this time):
https://docs.google.com/forms/d/e/1FAIpQLSevA4eQyAN5lnLeQFRQF4d3g5sUllRQdzo…
*Please note that it is very important to choose ‘KASEE’ in the ‘Affiliated
Association (소속학회) (Choose One)’ section while you’re filling out the form.*
If you would like to know more about the conference, please visit
http://jointconference2023.com
Looking forward to seeing you in Seoul in July.
Best regards
CK Jung
---
*CK Jung BEng(Hons) Birmingham MSc Warwick EdD Warwick Cert Oxford*
Department of English Language and Literature, Incheon National University, *South Korea*
Director | Institute for Corpus Research, Incheon National University, *South Korea* (http://icr.or.kr)
Editor | Asia Pacific Journal of Corpus Research, ICR, *International* (http://icr.or.kr/apjcr)
Editorial Board | Corpora, Edinburgh University Press, *UK*
Editorial Board | English Today, Cambridge University Press, *UK*
E: ckjung(a)inu.ac.kr / T: +82 (0)32 835 8129
H(EN): http://ckjung.org
*Apologies for cross-posting*
Deadline Extended: Special Issue on The Role of Context in Neural Machine Translation Systems and its Evaluation in Natural Language Engineering
Submission deadline extended to 15 February 2023.
Guest editors:
- Sheila Castilho (The ADAPT Centre, School of Applied Languages and Intercultural Studies, Dublin City University)
- Rebecca Knowles (National Research Council Canada)
For this special issue, we invite the submission of papers focusing on the variety of novel implementations of context into neural machine translation systems as well as novel approaches to its evaluation. Recent claims that machine translation systems are reaching (near) human parity at the sentence level have been followed by subsequent analyses that indicate remaining gaps in translation quality at the document level. How best to evaluate machine translation at the document level (and what exactly constitutes document level evaluation) remains an open question. At the same time, there is work seeking to add discourse and context into neural machine translation systems. Papers that focus on topics of context in neural machine translation, machine translation evaluation, or both are welcome.
For full details, see: https://sites.google.com/dcu.ie/nlecontextnmt/home
Topics of interest include, but are not limited to:
- Novel language processing techniques for implementing discourse in NMT systems
- Document-level NMT and evaluation
- Use of target and source context
- Context-aware techniques for quality evaluation
- Context-aware automatic and human evaluation metrics
- The size and composition of the training data and its effect on context-aware systems
- The effect of the quality of training data and test sets on context-aware systems
- Translationese and its effect on document-level training
- Lexical diversity and lexical density in discourse NMT
- Discourse NMT for different domains
Publication Timeline:
- Article submission deadline: 15 February 2023 (extended)
- Return of reviews to contributors: 1 April 2023
- Revised article submission deadline: 1 May 2023
- Return of second reviews to contributors (if applicable): 1 July 2023
- Final Submission: 15 September 2023
- Publication: November 2023 / January 2024
Format and Submission:
Typical submissions will be 12-25 pages in length. Authors should follow the "Author Instructions" section on the journal website: https://www.cambridge.org/core/journals/natural-language-engineering/inform…
We highly recommend using the LaTeX template found under "Preparing your materials" at the link above.
All manuscripts must be submitted online via the NLE ScholarOne website: http://mc.manuscriptcentral.com/nle. Under "Special Issue Designation", choose "The Role of Context in Neural Machine Translation Systems and its Evaluation".
Queries:
Any queries related to this special issue should be addressed to sheila.castilho(a)dcu.ie with NLE-ContextNMT in the subject line.
**** English version below ****
Hello,
Following several requests, the deadline for submitting a proposal to
the upcoming Journées de la Linguistique de Corpus has been extended.
Please find the updated call below.
*PLEASE NOTE: new deadline*
* 11th International Conference on Corpus Linguistics (JLC2023) *
3-6 July 2023, Grenoble, France
* Call for Papers *
https://jlc2023.sciencesconf.org/
Launched in 2001 by Geoffrey Williams at the University of
Lorient-Bretagne Sud, the Journées Internationales de Linguistique de
Corpus (JLC) regularly bring together the interdisciplinary community
whose research focuses on linguistic corpora. After seven editions,
followed by a stop in Orléans in 2015, the conference settled in
Grenoble in 2017 and 2019, and returns there for this new edition in
2023. It is co-organized by the LIDILEM laboratory (UGA), other UGA
laboratories (ILCEA4, LIG, Litt&Arts), and laboratories from partner
universities (Lyon, Montpellier, Toulouse): DDL, ICAR, Praxiling,
CLLE.
*Invited speakers*: Florence Mourlhon-Dallies (Université Paris Cité),
Jérôme Jacquin (Université de Lausanne)
JLC2023 aims to bring together a community around varied approaches,
both methodological and disciplinary. The conference seeks to reflect
on corpus linguistics and to contribute to the evolution of scientific
practices in the field. It thus aims to build bridges between
different approaches to digital corpora.
In keeping with previous conferences, JLC2023 will offer three days of
scientific presentations, invited talks, poster sessions and
discussions among participants. Training sessions on corpus tools and
corpus exploitation, particularly for teaching purposes, will also be
offered. Participants are invited to compare their tools and
experiences and to present their results, in every field in which
corpora are used.
This edition of the JLC will place a particular focus on corpora and
didactics. Part of the conference will be specifically dedicated to
this theme. For it, we expect proposals that demonstrate and question
the use of corpora in teaching, whether as reports on practical
experience, presentations of methodological procedures and approaches
for various audiences, or more theoretical points of view.
The conference will not be limited to this theme and remains open to
all kinds of contributions on written, spoken or multimodal corpora,
which may concern, among others:
1. Linguistic approaches and corpora
2. Methods and tools
3. Variations, genres, discourse
4. Applications and uses of corpora: training, translation,
terminology...
Proposals, written in French or English, must not exceed 3 pages
(excluding bibliographic references and figures). Anonymous submissions
are to be uploaded via the SciencesConf system and will be evaluated by
two reviewers. In addition to standard presentations, it is possible
to propose a demonstration (same submission procedure).
An online publication is planned following the conference.
Timetable:
1. Call for papers issued: early November 2022
2. Submission deadline: 17 February 2023 (extended from 3 February 2023)
3. Notification to authors: mid-April 2023
4. Final version of the submission: 19 May 2023
5. Registration: May 2023
---------------------------------------------------------
* The 11th International Conference on Corpus Linguistics *
3-6 July 2023, Grenoble, France
https://jlc2023.sciencesconf.org/
* Call for Papers *
*NEW DEADLINE*
The International Conference on Corpus Linguistics (JLC), founded by
Geoffrey Williams in 2001 at the University of South Brittany, Lorient,
France, regularly draws together an interdisciplinary community whose
research focus is corpus linguistics. After seven gatherings in Lorient
and an interlude in Orleans in 2015 (8th International Conference on
Corpus Linguistics), the conference alighted in Grenoble in early July
2017 and in November 2019, organized by the LIDILEM Laboratory with
contributions from LIG, ILCEA4, Litt&Arts and the MSH-Alpes. Université
Grenoble Alpes is honored to host this international conference again
from July 3rd to July 6th 2023. JLC’23 is organized in
collaboration with other labs from French universities (Lyon,
Montpellier, Toulouse): DDL, ICAR, Praxiling, CLLE.
The objective of JLC'23 is to (re)unite a community that adopts various
approaches, be they methodological or disciplinary, to promote corpus
linguistics, and to contribute to the evolution of practices in the
field by building bridges between different approaches to digital
corpora. The participants are invited to share and compare their
knowledge of tools, experiences, and findings.
In the tradition of previous conferences, the JLC in Grenoble will
offer three days of presentations, guest speakers and discussion
sessions among the participants. Training sessions on tools and methods
will be organized over a half day.
This edition of the JLC will put a particular focus on corpora and
didactics. A part of the conference will be specifically dedicated to
this theme. We expect papers that show and question the use of corpora
in teaching, be they feedback from real uses, presentation of
methodological approaches for various audiences, or more theoretical
points of view...
The conference will not be limited to this theme and will be open to
all kinds of contributions on written, oral or multimodal corpora,
including but not limited to:
1. Linguistic approaches to corpora
2. Methods and tools
3. Variations, genres, and discourse
4. Applications and uses of corpora for teaching and learning,
translation, terminology...
*Guest speakers include*: Florence Mourlhon-Dallies (Université
Paris Cité), Jérôme Jacquin (Université de Lausanne)
Submissions for a presentation or a demonstration in French or English
should not exceed three pages (excluding figures and bibliographic
references) and must be anonymous. Each submission will be reviewed by
two members of the scientific board. JLC2023 will use the SciencesConf
system to manage communication proposals. In addition to classic
presentations, you may also propose a demonstration (identical
submission guidelines).
Publication: following the colloquium, authors are welcome to submit an
article. This collection of articles will be reviewed and published
online.
Timetable:
1. First CFP: November 2022
2. Submission deadline: Friday February 17th 2023 (extended from February 3rd)
3. Notification of acceptance: Mid-April 2023
4. Final submission version: Friday May 19th 2023
5. Registration begins: May 2023
--
Marie-Paule Jacques
/Committed to defending public higher education and research/
Senior Lecturer (Maître de conférences HDR) in Linguistics
INSPE and LIDILEM (Laboratoire de linguistique et didactique des
langues étrangères et maternelles), Université Grenoble Alpes
(Apologies for cross-postings)
*** The GUM Corpus - Release 9.0.0 ***
*** Georgetown University Multilayer corpus ***
Corpling@GU <https://gucorpling.org/corpling/> is happy to announce the first release of series 9 of the Georgetown University Multilayer corpus (GUM V9.0.0):
https://gucorpling.org/gum/
New in this version:
- 20 new documents added including more conversational data (total tokens: 203,879)
- Abstractive summaries for each document
- Annotations for salient/non-salient entities in each document
- Foreign language tags to identify individual source languages where relevant
- New, easier process for reconstructing Reddit text data
- Many corrections to all annotation layers
GUM is an open source corpus of richly annotated English texts from multiple genres: academic, bio, conversation, fiction, interview, news, speeches, textbooks, travel, vlogs, how-to and Reddit forum discussions. The corpus is created by students as part of the Computational Linguistics curriculum at Georgetown University and is available under Creative Commons licenses.
This is the first version of GUM series 9, containing roughly 200K tokens annotated for:
- Multiple POS tags (100% manual gold PTB, extended PTB, converted CLAWS5 and UPOS) and UD morphological features
- Manually corrected lemmatization
- Sentence segmentation and rough speech act (manual)
- Document structure using TEI tags (paragraphs, headings, figures, captions etc., all manual)
- Constituent and dependency syntax (manually corrected Universal Dependencies, and PTB parses from gold tags with function labels)
- Information status (given-active/inactive, accessible-inferable/common ground/aggregate, and new)
- Entity type, salience and coreference annotation (including non-named entities, singletons, appositions, cataphora and several types of bridging)
- Entity linking (Wikification) of all named entities with Wikipedia articles, including their non-named and pronominal mentions
- Discourse parses in Rhetorical Structure Theory and discourse dependencies
- Abstractive summaries
Note on Reddit data: token text is not contained in the release but can be downloaded with an included script.
For more information and to search or download the corpus online, see the corpus website <https://gucorpling.org/gum/> .
Best wishes,
The GUM team
We invite you to participate in our multilingual stance classification shared task, as part of the Touché Lab, which will be held in conjunction with the CLEF'23 conference in Thessaloniki, Greece [1].
Context:
Participatory Democracy at the scale of a continent like Europe brings many difficulties due to the high diversity of languages and cultures. At the same time, Machine Learning is an interesting tool for stance recognition in a large-scale context, in terms of data size, but also regarding the topics and themes addressed or the languages employed by the participants. Public consultations of citizens using Online Participatory Democracy platforms offer this kind of setting and are good use cases for automatic stance recognition systems.
In the context of the Touché Lab at CLEF 2023 [2], we are proposing a shared task on data from the platform used during the Conference on the Future of Europe [3], which was inaugurated in 2021 and where users can submit proposals and comment on them in any of the 24 official EU languages. A particularity of this platform is its use of a Machine Translation system that lets users interact with each other in their native languages, leading to what we call Intra-Multilingual data: pairs of proposal and comment in different languages.
[1] https://clef2023.clef-initiative.eu/
[2] https://touche.webis.de/
[3] https://futureu.europa.eu/
Tasks: Given a proposal on a socially important issue, the task is to classify whether a comment is in favor, against, or neutral towards the proposal.
Subtask 1: Cross-debate Stance Classification
Subtask 2: All-data-available Classification
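For illustration only, the three-way decision described above could be framed as in the following minimal Python sketch. The field names, the label strings and the majority-class baseline are assumptions made for this example; they are not the official data format or an organizer-provided baseline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Hypothetical label strings for the three stances named in the task text.
LABELS = ("in favor", "against", "neutral")


@dataclass(frozen=True)
class StancePair:
    """One proposal/comment pair; the two texts may be in different languages."""
    proposal: str
    proposal_lang: str            # assumed language code, e.g. "en"
    comment: str
    comment_lang: str             # assumed language code, e.g. "fr"
    label: Optional[str] = None   # None for unlabeled (test) items


def majority_baseline(train: Iterable[StancePair]) -> Callable[[StancePair], str]:
    """Return a predictor that always outputs the most frequent training label."""
    counts = Counter(pair.label for pair in train if pair.label is not None)
    if not counts:
        raise ValueError("no labeled training pairs were provided")
    majority_label = counts.most_common(1)[0][0]
    return lambda pair: majority_label


# Made-up toy pairs, not taken from the shared-task data.
example_train = [
    StancePair("Proposal A", "en", "Comment 1", "fr", "in favor"),
    StancePair("Proposal A", "en", "Comment 2", "de", "in favor"),
    StancePair("Proposal B", "it", "Comment 3", "en", "against"),
]
predict = majority_baseline(example_train)
example_prediction = predict(StancePair("Proposal C", "es", "Comment 4", "pl"))
```

A baseline of this kind only gives a lower reference point against which a real multilingual classifier can be compared.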
Learn more about this and other argumentation- and causality-related tasks at https://touche.webis.de/
Data available at https://touche.webis.de/clef23/touche23-web/multilingual-stance-classificat…
Register via the CLEF website: https://clef2023-labs-registration.dei.unipd.it/
-------------------------------------------------------------------------------
Important Dates
-------------------------------------------------------------------------------
Now open: Registration
Jan. 15, 2023: Development data available
April 30, 2023: Test data available
May 2, 2023: Approaches submission on the test data
June 5, 2023: Participant paper submission
July 7, 2023: Camera-ready participant papers submission
Sep. 18-21, 2023: Conference
One of the conference days: Touché Workshop on Argument and Causal Retrieval
-------------------------------------------------------------------------------
Special Announcements
-------------------------------------------------------------------------------
Touché Open Source Proceedings
Touché will host a collection of software developed by participants at GitHub.
The Touché team invites you to publish your software too and welcomes software submissions via TIRA [ https://www.tira.io/ ].
In case of questions / suggestions / etc., please reach us at touche(a)webis.de.
Best regards,
CoFE Team @ Touché
Dear colleagues,
The Fourth Workshop on Insights from Negative Results in NLP
Co-located with EACL, May 5 or 6, 2023
First Call for Participation
Insights Website: <https://insights-workshop.github.io/>
Contact email: insights-workshop-organizers(a)googlegroups.com
* Overview
Publication of negative results is difficult in most fields, but in NLP the
problem is exacerbated by the near-universal focus on improvements in
benchmarks. This situation implicitly discourages hypothesis-driven
research, and it turns creation and fine-tuning of NLP models into art
rather than science. Furthermore, it increases the time, effort, and carbon
emissions spent on developing and tuning models, as the researchers have no
opportunity to learn what has already been tried and failed.
This workshop invites both practical and theoretical unexpected or negative
results that have important implications for future research, highlight
methodological issues with existing approaches, and/or point out pervasive
misunderstandings or bad practices. In particular, the most successful NLP
models currently rely on different kinds of pretrained meaning
representations (from word embeddings to Transformer-based models like BERT
and GPT-3). To complement all the success stories, it would be insightful
to see where and possibly why they fail. Any NLP tasks are welcome:
sequence labeling, question answering, inference, dialogue, machine
translation - you name it.
A successful negative results paper would contribute one of the following:
** broadly applicable recommendations for training/fine-tuning, especially
if X that didn’t work is something that many practitioners would think
reasonable to try, and if the demonstration of X’s failure is accompanied
by some explanation/hypothesis;
** ablation studies of components in previously proposed models, showing
that their contributions are different from what was initially reported;
** datasets or probing tasks showing that previous approaches do not
generalize to other domains or language phenomena;
** trivial baselines that work suspiciously well for a given task/dataset;
** cross-lingual studies showing that a technique X is only successful for
a certain language or language family;
** experiments on (in)stability of the previously published results due to
hardware, random initializations, preprocessing pipeline components, etc;
** theoretical arguments and/or proofs for why X should not be expected to
work;
** demonstration of issues with data processing/collection/annotation
pipelines, especially if they are widely used;
** demonstration of issues with evaluation metrics (e.g. accuracy, F1 or
BLEU), which prevent their usage for fair comparison of methods.
* Important Dates
** Submission due: February 13, 2023
** Submission due for papers reviewed through ACL Rolling Review: March 17,
2023
** Notification of acceptance: March 13, 2023
** Camera-ready papers due: March 27, 2023
** Workshop: May 5 or 6, 2023
* Submission
Submission is electronic, using the Softconf START conference management
system.
Submission link: <https://softconf.com/eacl2023/insights2023/>
The workshop will accept short papers (up to 4 pages, excluding
references), as well as 1-2 page non-archival abstract submissions for
papers published elsewhere (e.g. in one of the main conferences or in
non-NLP venues). The goal of this event is to stimulate a meaningful
community-wide discussion of the deep issues in NLP methodology, and the
authors of both types of submissions will be welcome to take part in our
get-togethers.
The workshop will run its own review process, and papers can be submitted
directly to the workshop by Feb 13, 2023. It is also possible to submit a
paper accompanied with reviews from the ACL Rolling Review system by March
17, 2023. The submission deadline for ARR papers follows the ACL RR
calendar. Both research papers and abstracts must follow the ACL two-column
format. Official style sheets:
<https://www.overleaf.com/read/crtcwgxzjskr>
<https://github.com/acl-org/ACLPUB/tree/master/templates>
Please do not modify these style files, nor should you use templates
designed for other conferences. Submissions that do not conform to the
required styles, including paper size, margin width, and font size
restrictions, will be rejected without review.
* Multiple Submission Policy
The workshop cannot accept work for publication or presentation that will
be (or has been) published elsewhere, or that has been or will be
submitted to other meetings or publications whose review periods overlap
with that of Insights. Any questions regarding submissions can be sent to
insights-workshop-organizers(a)googlegroups.com.
If the paper has been rejected from another venue, the authors will have
the option to provide the original reviews and the author response. The new
reviewers will not have access to this information, but the organizers will
be able to take into account the fact that the paper has already been
revised and improved.
* Anonymity Period
We are not enforcing any anonymity period.
* Presentation
All accepted papers must be presented at the workshop to appear in the
proceedings. Authors of accepted papers must notify the program chairs by
the camera-ready deadline if they wish to withdraw the paper. At least one
author of each accepted paper must register for the workshop.
Previous presentations of the work (e.g. preprints on arXiv.org) should be
noted in a footnote in the camera-ready version (but not in the anonymized
version of the paper).
The workshop will take place on May 5 or 6, 2023. The workshop will be
hybrid, with both in-person and virtual presentations.
* Organization Committee
** Shabnam Tafreshi, University of Maryland: ARLIS
** Arjun Reddy Akula, Google
** João Sedoc, New York University
** Anna Rogers, University of Copenhagen
** Aleksandr Drozd, RIKEN
** Anna Rumshisky, University of Massachusetts Lowell / Amazon Alexa
* Contact info
Any questions regarding the workshop can be sent to
insights-workshop-organizers(a)googlegroups.com.
Please continue reading about Authorship, Citation and Comparison, Ethics
Policy, Reproducibility, Anonymity Period, and Presentation on the call for
papers page on our website: https://insights-workshop.github.io/2023/cfp/
Regards,
Insights 2023 Organizers
--
*Shabnam Tafreshi, PhD*
*Assistant Research Scientist*
*Computational Linguistics, NLP*
*UMD: ARLIS @ College Park*
*"All the problems of the world could be settled easily, if people only
willing to think."*
*-Thomas J. Watson*
Fully funded 4-year PhD position on NLP for video subtitling at the University of Amsterdam, Language Technology Lab. This is a collaboration with RTL and part of the LTP ROBUST program. The call text is below my signature, mirrored from the official listing:
https://vacatures.uva.nl/UvA/job/PhD-Candidate-in-Natural-Language-Processi…
Apply, only through the link above, before Feb 24. For more context, see also my web site https://vene.ro/jobs.html. For further questions, don’t hesitate to e-mail me—please include [PhD 11053] in the subject line so my filters can catch your email.
Vlad Niculae [he/him]
Asst. Prof. @ LTL, IvI, University of Amsterdam
https://vene.ro
---
PhD Candidate in Natural Language Processing for Video Subtitling
Faculty/Department: Faculty of Science
Educational level: Master
Function type: PhD position
Closing date: 24 February 2023
Vacancy number: 11053
We are inviting applications for a fully-funded, four-year PhD position in natural language processing for video subtitling. This is a collaboration between core Computer Science, Science, Technology, and Social Studies. Are you eager to work on applied research models for accessibility language technologies? Do you want to research controllability of language generation for generating adequate, appropriate, and faithful subtitles? This position might be the one for you!
What are you going to do?
You will be embedded in the Language Technology Lab (LTL) under the supervision of Dr. Vlad Niculae and lead a project to investigate and improve NLP generative models for semi-automatic subtitling for Dutch and English television and video-on-demand. As captions provide access to information to many, high quality and unbiased performance are of critical societal importance. Powerful speech recognition systems are available today and provide a solid basis, but do not solve subtitling. We aim towards a subtitling system that is:
- contextualized: it uses speaker identity, available scripts, and visual cues for improved accuracy;
- machine-in-the-loop: it quantifies its own uncertainty, giving control to expert human operators;
- faithful: it maintains good performance across languages, topics, and speaker identities (such as gender, age, region).
The PhD position will be part of the large LTP ROBUST program “Trustworthy AI-based Systems for Sustainable Growth” consortium, comprising 17 universities, 19 industry partners, and 15 collaborating partners representing diverse stakeholder groups. You will gain valuable experience working with an industry partner and will be able to tap into a wealth of networking, career development, and training opportunities in conjunction with ICAI, the Innovation Center for Artificial Intelligence at the University of Amsterdam. You will be part of one of the 17 new ICAI labs, named TAIM (Trustworthy AI for Media Lab) consisting of 5 PhD students, who will collaborate on developing methods, metrics and tools to evaluate and improve diversity and inclusion in media.
Tasks and responsibilities
With our help and support, you will:
- innovate in research on contextual, uncertainty-aware, faithful generative models of language for subtitling;
- deploy prototypes and evaluate subtitling in the applied setting of RTL;
- complete and defend your PhD thesis;
- become an active participant in the research community and collaborate within and outside the TAIM lab and the Language Technology Lab;
- publish and present work regularly at international conferences, workshops, and journals;
- assist in educational tasks (labs/tutorials, supervising Bachelor’s and Master’s projects).
Additionally, you will have the opportunity to closely collaborate with a leading entertainment brand, RTL. You are expected to work at their premises one day per week in Hilversum and one day remotely, making use of their resources and deployment context.
We care strongly about respecting work-life balance and contractual hours.
What do you have to offer?
Your experience and profile:
- A Master’s degree (completed or near completion) with a thesis in Natural Language Processing, Machine Learning, Computer Science, or similar relevant areas;
- Serious interest in pursuing fundamental research with concrete applications;
- A good background in Natural Language Processing, Machine Learning, and Deep Learning;
- Advanced programming skills;
- Professional command of the English language;
- A commitment to maintaining an inclusive, collaborative, diverse, and supportive work environment.
Interdisciplinary collaborations and backgrounds are appreciated, especially in fields related to linguistics and communication science. Experience with using subtitles or similar accessibility language technologies is a plus. If this describes you, we encourage your application.
If you are interested but unsure if you are qualified, please contact Dr. Vlad Niculae before applying. If your Master’s degree is near completion, it must be completed before the start date. Knowledge of the Dutch language is not required for this position, but can help both for living in Amsterdam and for a good understanding of the video content. The UvA provides the opportunity to attend Dutch language classes.
Our offer
A temporary contract for 38 hours per week for the duration of 4 years (the initial contract will be for a period of 18 months and, after satisfactory evaluation, it will be extended for a total duration of 4 years). The preferred starting date is April 2023. Your work should lead to a dissertation (PhD thesis). We will draft an educational plan that includes attendance of courses and (international) meetings. We also expect you to assist in teaching undergraduate and Master’s students.
The gross monthly salary, based on 38 hours per week and dependent on relevant experience, ranges from € 2,541 in the first year to € 3,247 in the last year (scale P). UvA additionally offers an extensive package of secondary benefits, including 8% holiday allowance and a year-end bonus of 8.3%. The UFO profile PhD Candidate is applicable. A favourable tax agreement, the ‘30% ruling’, may apply to non-Dutch applicants. The Collective Labour Agreement of Universities of the Netherlands is applicable.
Besides the salary and a vibrant and challenging environment at Science Park we offer you multiple fringe benefits:
- 232 holiday hours per year (based on full-time employment) and extra holidays between Christmas and 1 January;
- Multiple courses to follow from our Teaching and Learning Centre;
- A complete educational program for PhD students;
- Multiple courses on topics such as leadership for academic staff;
- Multiple courses on topics such as time management and handling stress, and an online learning platform with 100+ different courses;
- 7 weeks birth leave (partner leave) with 100% salary;
- Partly paid parental leave;
- The possibility to set up a workplace at home;
- A pension at ABP for which UvA pays two-thirds of the contribution;
- The possibility to follow courses to learn Dutch;
- Help with housing for a studio or small apartment when you’re moving from abroad.
Are you curious to read more about our extensive package of secondary employment benefits? Details are available through the official listing.
About us
The University of Amsterdam is the Netherlands' largest university, offering the widest range of academic programmes. At the UvA, 42,000 students, 6,000 staff members and 3,000 PhD candidates study and work in a diverse range of fields, connected by a culture of curiosity.
The Faculty of Science has a student body of around 8,000, as well as 1,800 members of staff working in education, research or support services. Researchers and students at the Faculty of Science are fascinated by every aspect of how the world works, be it elementary particles, the birth of the universe or the functioning of the brain.
The mission of the Informatics Institute (IvI) is to perform curiosity-driven and use-inspired fundamental research in Computer Science. The main research themes are Artificial Intelligence, Computational Science and Systems and Network Engineering. Our research involves complex information systems at large, with a focus on collaborative, data driven, computational and intelligent systems, all with a strong interactive component.
The Language Technology Lab (LTL) is a research group focusing on information access from natural language data. Our work ranges from basic research in natural language processing to key applications in human language technology, and covers areas such as machine translation, summarization, question answering, language modeling, and image captioning. LTL positions itself primarily in the AI research theme, with some links to the Data Science theme of the Informatics Institute.
You will be part of one of the 17 new ICAI labs, named TAIM (Trustworthy AI for Media Lab) consisting of 5 PhD students, who will collaborate on developing methods, metrics and tools to evaluate and improve diversity and inclusion in media. You are joining a unique team also including the Department of Advanced Computing Sciences at Maastricht University (UM) and media and entertainment company RTL Nederland.
The TAIM lab will bring together two of the strongest groups on personalization and recommender systems in the Netherlands (UM and UvA), with a leading media organization (RTL), to develop trustworthy and personalized media. The lab will focus on the development of media that is inclusive, informed by democratic norms, and in line with RTLs values to represent, and give a voice to, all of the Netherlands in the design of their personalization algorithms.
Want to know more about our organisation? Read more about working at the University of Amsterdam.
Any questions?
Do you have any questions or do you require additional information? Please contact:
Dr. Vlad Niculae, Assistant Professor.
Job application
If you feel the profile fits you, and you are interested in the job, we look forward to receiving your application. You can apply online via the official listing. We accept applications until and including 24 February 2023.
Applications should include the following information (all files besides your CV should be submitted in one single pdf file):
- a letter of motivation (max 2 pages) in which you:
  - motivate your choice for this position and your interest in the proposed project;
  - indicate your preferred starting date and availability;
  - sketch out some thoughts and ideas about tackling the project (not a fully-detailed or binding proposal);
- a Curriculum Vitae (including start/end months of education and work experience);
- a summary of, or a copy of, your Master’s thesis;
- a copy of your Master’s and Bachelor’s transcript/diploma.
If your MSc thesis is not finished or not in English, submit a brief summary in 1-4 pages. If your transcripts or diplomas are not available yet, please attach a note clearly stating which documents are not available, and when they will be available. This note can be in your own words.
Before submitting, please make sure to provide ALL requested documents mentioned above.
You can use the CV field to upload your resume as a separate pdf document. Use the Cover Letter field to upload the other requested documents, including the motivation letter, as one single pdf file.
Please do not submit applications by e-mail.
Only complete applications received within the response period via the official listing will be considered.
The interviews will be held in March 2023.
The UvA is an equal-opportunity employer. We prioritize diversity and are committed to creating an inclusive environment for everyone. We value a spirit of enquiry and perseverance, provide the space to keep asking questions, and promote a culture of curiosity and creativity.
If you encounter Error GBB451/GBC451, please try using a VPN connection when outside of the European Union. Please reach out to our HR Department directly. They will gladly help you continue your application.
No agencies please.