LaTeCH-CLfL 2025:
The 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
to be held on May 3rd or 4th, 2025 in conjunction with NAACL 2025 <https://2025.naacl.org/> in Albuquerque, NM.
https://sighum.wordpress.com/latech-clfl-2025/
First Call for Papers (with apologies for cross-posting)
Organisers: Diego Alves, Yuri Bizzoni, Stefania Degaetano-Ortlieb, Anna Kazantseva, Janis Pagel, Stan Szpakowicz
LaTeCH-CLfL 2025 is the ninth in a series of meetings for NLP researchers who work with data from the broadly understood arts, humanities and social sciences, and for specialists in those disciplines who apply NLP techniques in their work. The workshop continues a long tradition of annual meetings. The SIGHUM Workshops on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH) ran ten times in 2007-2016. The five Workshops on Computational Linguistics for Literature (CLfL) took place in 2012-2016. The first eight joint workshops (LaTeCH-CLfL) were held in 2017-2024.
Topics and content
In the Humanities, Social Sciences, Cultural Heritage and literary communities, there is increasing interest in, and demand for, NLP methods for semantic and structural annotation, intelligent linking, discovery, querying, cleaning and visualization of both primary and secondary data. This is even true of primarily non-textual collections, given that text is also the pervasive medium for metadata. Such applications pose new challenges for NLP research: noisy, non-standard textual or multi-modal input, historical languages, vague research concepts, multilingual parts within one document, and so on. Digital resources often have insufficient coverage; resource-intensive methods require (semi-)automatic processing tools and domain adaptation, or intense manual effort (e.g., annotation).
Literary texts bring their own problems, because navigating this form of creative expression requires more than the typical information-seeking tools. Examples of advanced tasks include the study of literature of a certain period, author or sub-genre, recognition of certain literary devices, or quantitative analysis of poetry.
Topics of interest include, but are not limited to, the following:
• adaptation of NLP tools to Cultural Heritage, Social Sciences, Humanities and literature;
• automatic error detection and cleaning of textual data;
• complex annotation schemas, tools and interfaces;
• creation (fully- or semi-automatic) of semantic resources;
• creation and analysis of social networks of literary characters;
• discourse and narrative analysis/modelling, notably in literature;
• emotion analysis for the humanities and for literature;
• generation of literary narrative, dialogue or poetry;
• identification and analysis of literary genres;
• interpretability of large language model output for DH-related tasks (explainable AI);
• linking and retrieving information from different sources, media, and domains;
• low-resource and historical language processing;
• modelling dialogue literary style for generation;
• modelling of information and knowledge in the Humanities, Social Sciences, and Cultural Heritage;
• profiling and authorship attribution;
• search for scientific and/or scholarly literature;
• work with linguistic variation and non-standard or historical use of language.
Information for authors
We invite papers on original, unpublished work in the topic areas of the workshop. In addition to long papers, we will consider short papers and system descriptions (demos). We also welcome position papers. Please find submission requirements on the website https://sighum.wordpress.com/latech-clfl-2025/.
Important dates (tentative)
Workshop paper due: January 30, 2025
Notification of acceptance: March 1, 2025
Camera-ready papers due: March 10, 2025
Workshop date: May 3rd or 4th, 2025
More on the organizers
Diego Alves, Language Science and Technology, Saarland University
Yuri Bizzoni, Center for Humanities Computing / School for Communication and Culture, Århus University
Stefania Degaetano-Ortlieb, Language Science and Technology, Saarland University
Anna Kazantseva, National Research Council Canada
Janis Pagel, Department of Digital Humanities, University of Cologne
Stan Szpakowicz, School of Electrical Engineering and Computer Science, University of Ottawa
Contact
latech-clfl(a)googlegroups.com <mailto:latech-clfl@googlegroups.com>
AbjadNLP 2025 [1]
The 1st Workshop on NLP for Languages Using Arabic Script
https://wp.lancs.ac.uk/abjad/cfp/
CALL FOR PAPERS
Co-located with COLING 2025 Conference, Abu Dhabi, UAE (19-20 January
2025)
Submission URL [2]
AbjadNLP is dedicated to advancing innovation and gaining deeper
insights into Natural Language Processing (NLP) for languages that use
the Arabic script. Our primary focus is on Abjad and Ajami languages
that utilise the Arabic script or its variations. Traditionally
associated with Semitic languages, Abjad scripts represent consonants in
every syllable. In contrast, Ajami scripts denote the alphabetic use of
the Arabic script in various African contexts, representing non-Arabic
languages. We are interested in research on languages that fall under
the Abjad or Ajami categories that use the Arabic script or any
variations of it.
We invite contributions, discussions, and explorations that delve deep
into the unique linguistic structures, resources, challenges, and
untapped potential presented by Abjad and Ajami languages within the
realm of NLP and language resources. Our goal is to create synergies
among researchers by addressing the diverse phenomena and challenges
inherent in these rich linguistic traditions.
The workshop is proud to highlight our connections with the Masakhane
NLP community and collaborations with institutions worldwide, such as
COMSATS on Urdu, and the long-standing UCREL NLP Group at Lancaster
University, whose work encompasses over 20 languages worldwide,
including Abjad and Ajami languages.
Note: We chose the name Abjad for simplicity, but our focus includes
Abjad and other languages that have adopted the Arabic and Perso-Arabic
scripts, as well as Ajami languages. We acknowledge that Sorani Kurdish,
when written in Arabic script, follows an alphabet style rather than an
Abjad style.
TOPICS OF INTEREST:
* Core Technologies: morphological analysis, disambiguation,
tokenisation, POS tagging, named entity detection, chunking, parsing,
semantic role labelling, sentiment analysis, language modelling, etc.
* Applications: machine translation, speech recognition, speech
synthesis, optical character recognition, assistive technologies, social
media, etc.
* Resources and Tools: dictionaries, annotated data, corpora,
orthography descriptions, font technology, glyph rendering, text input
methodologies, spell-checking, speech-to-text solutions, BLARK
descriptions, open access corpora.
* Cultural and Sociolinguistic Considerations: text processing,
transliteration challenges and solutions, cultural contexts in NLP
applications.
SUBMISSION GUIDELINES:
We follow the COLING 2025 standards for submission format and
guidelines. Submissions should conform to the following types:
* Long papers: Up to eight (8) pages, presenting substantial,
original, completed, and unpublished work.
* Short papers: Up to four (4) pages, describing a small focused
contribution, negative results, system demonstrations, etc.
KEY DATES:
* 1st Call for Papers Announcement: 16 July 2024
* 2nd Call for Papers Announcement: 16 August 2024
* Paper Submission Deadline: 2 December 2024
* Notification of Paper Acceptance: 6 December 2024
* Camera-ready Paper Deadline: 13 December 2024
* Workshop Date: 19 or 20 January 2025
ORGANISING COMMITTEE:
General Chair: Mo El-Haj, Lancaster University
Programme Chairs:
* Hugh Paterson III, Collaborative Scholar
* Saad Ezzini, Lancaster University
* Ignatius Ezeani, Lancaster University
Review Committee:
* Mahum Hayat Khan, University of La Rioja
* Muhammad Sharjeel, COMSATS University Islamabad
Publication Chair: Sina Ahmadi, University of Zurich
Publicity Chairs:
* Cynthia Amol, Maseno University
* Amal Haddad Haddad, University of Granada
* Jaleh Delfani, University of Surrey
Advisory Committee:
* Ruslan Mitkov, Lancaster University
* Paul Rayson, Lancaster University
--
Amal Haddad Haddad (She/her)
Facultad de Traducción e Interpretación
Universidad de Granada |https://www.ugr.es/personal/amal-haddad-haddad
Lexicon Research Group |http://lexicon.ugr.es/haddad
Co-Convenor, BAAL SIG 'Humans, Machines,
Language'|https://r.jyu.fi/humala
Event Coordinator, BAAL SIG 'Language, Learning and Teaching'
Links:
------
[1] https://wp.lancs.ac.uk/abjad/
[2] https://softconf.com/coling2025/AbjadNLP25/
NAKBA-NLP 2025
The 1st International Workshop on Nakba Narratives as Language Resources
Part of the COLING-2025 [1] Conference
Abu Dhabi, UAE (Fully Virtual)
January 20, 2025
CALL FOR PAPERS
We invite submissions for Nakba-NLP 2025, a workshop dedicated to the
exploration and preservation of Nakba narratives through the application
of artificial intelligence, natural language processing, and corpus
linguistics. All submitted papers should explain their relevance to the
topic of 'Nakba Narratives as Language Resources'. The organisers
reserve the right to reject any papers that incite hatred, refute
established facts, or undermine the suffering of individuals.
We seek contributions on the following issues of interest:
* Digitisation of oral and written narratives
* Creation and labeling of language corpora and datasets
* Digital archives, metadata, and semantic/content mark-up
* Annotation tools and annotation guidelines
* Document classification, topic modeling, and information retrieval
* Named entity recognition for identifying people, places,
organizations, and events
* Entity linking and relationship extraction
* Event detection and event argument extraction
* Knowledge Graphs and Linked Data
* Vocabularies, dictionaries, and ontologies
* Data visualisation
* Knowledge representation
* Machine translation, summarisation, and paraphrasing
* Natural Language Generation
* Large Language Models
* Sentiment analysis and emotional content extraction
* Discourse analysis (e.g., bias, offensive language, and
misinformation) related to Nakba narratives
* Voice & dialogue-based systems; ASR
* Palestinian dialects (written and spoken)
Participants are invited to use the following archives: Institute for
Palestine Studies [2], The Palestinian Museum [3], Nakba-Archive [4],
POHA [5], Alhaq [6], ICHR [7], as well as Wikipedia and the Wikidata
Knowledge Graph.
SUBMISSION DETAILS
All submitted papers must clearly state and explain their relevance to
the topic of 'Nakba Narratives as Language Resources'. The organisers
reserve the right to reject any papers that incite hatred, refute
established facts, or undermine the suffering of individuals.
Submissions may be of two types:
* Long papers - up to eight (8) pages maximum, presenting substantial,
original, completed, and unpublished work.
* Short papers - up to four (4) pages, describing a small focused
contribution, negative results, system demonstrations, etc.
The workshop supports the COLING anti-harassment policy: Policy [8].
COLING 2025 submission templates: Template [9].
Submission URL: Please submit here [10].
IMPORTANT DATES
* Submission Deadline: 25 November 2024
* Notifications of Acceptance: 5 December 2024
* Camera Ready Deadline: 13 December 2024 (cannot be changed).
Links:
------
[1] https://coling2025.org/
[2] https://www.palestine-studies.org/
[3] https://palmuseum.org/en
[4] https://www.nakba-archive.org/
[5] https://libraries.aub.edu.lb/poha/
[6] https://www.alhaq.org/
[7] https://www.ichr.ps/en
[8] https://coling2022.org/policy
[9] https://coling2025.org/calls/main_conference_papers/
[10] https://softconf.com/coling2025/Nakba-NLP25/
We are pleased to announce the Call for Abstracts for the 2025 IC2S2 conference, taking place July 22-24 in Norrköping, Sweden.
Conference website: https://www.ic2s2-2025.org
Important dates
Abstract submission deadline: February 24, 2025
Notification of acceptance: April 18, 2025
Early-bird registration deadline: May 9, 2025
Conference days: July 22-24, 2025
Call for abstracts
The International Conference on Computational Social Science (IC2S2) is the premier conference bringing together researchers from different disciplines interested in using computational and data-intensive methods to address relevant societal problems. IC2S2 hosts academics and practitioners in computational science, social science, complexity, and network science, and provides a platform for new research in the field of computational social science.
Submission instructions
Submissions are in the form of extended abstracts (max 2 pages) in PDF format, formatted according to the official LaTeX <https://www.ic2s2-2025.org/files/ic2s2_2025_latex_template.zip> or MS Word <https://www.ic2s2-2025.org/files/ic2s2_2025_word_template.docx> templates that can be downloaded from the submission website. The submission should include a title, a list of 5 keywords, and an extended abstract (serving as the main text of the submission). The abstract should outline the impact of the work, along with (if relevant) the main theoretical contribution, data and methods used, and findings. Authors are strongly encouraged to include figures and/or tables in their submission (note that figures will not count towards the page limit). Submitted abstracts will undergo a double-blind review process. Therefore, abstracts must be anonymized: do not include author names or affiliations in the paper, and do not include funding or other acknowledgments. When submitting, authors will also be asked to provide a short summary paragraph that will be used during the review bidding phase. Submissions that violate these guidelines will be automatically rejected.
Submissions will be non-archival, and thus the presented work can be already published, in preparation for publication elsewhere, or ongoing research. Abstracts will be reviewed by multiple members of a Program Committee composed of experts in computational social science. The accepted contributions will be selected for one of the following presentations: (i) a lightning talk (~6 mins) in a plenary session, (ii) an oral presentation in parallel tracks (~15 mins), or (iii) a poster presentation session. Lightning talks will be preferentially assigned to those requesting this form of presentation at submission and to early career researchers. In order to be included in the program, at least one of the authors must register for the conference by the early-bird registration deadline.
Topics
We welcome submissions on any topic in the field of computational social science, including (a) work that advances methods and approaches for computational social science, (b) data-driven work that describes and discovers social and cultural phenomena or explains and estimates relations between them and other things, and (c) theoretical work that generates new insights, connections and frameworks for computational social science research. Researchers across disciplines, faculty, graduate students, industry researchers, policy makers, and nonprofit workers are all encouraged to submit computational data-driven research and innovative computational methodological or theoretical contributions on social phenomena for consideration. Topics include but are not limited to:
* Network analysis of social systems
* Large-scale social experiments
* Empirically calibrated simulation models
* Large language models for social research
* Text analysis and natural language processing (NLP) of social phenomena
* Analysis of meaning through computational analysis of text, images, audio, video, etc.
* Computational methods to map and study cultural patterns and dynamics
* Agent-based or other simulation of social phenomena
* Methods and issues of social data collection
* Images as social data
* Causal inference and machine learning
* Methods and analyses of biased, selective, or incomplete observational social data
* Integration and triangulation of multi-modal social and cultural data
* Methods and analyses for social information / digital communication dynamics
* Neural network methods for social analysis and policy exploration
* Reproducibility in computational social science research
* Theoretical discussions/concepts in computational social science
* Ethics of computational research on human behavior
* Issues of inclusivity in computational social science
* Methods and analyses of algorithmic accountability and trustworthiness
* Novel digital data and/or computational analyses for addressing societal challenges
* Social news curation and collaborative filtering
* Building and evaluating socio-technical systems
* Methods and analyses of integrated human-machine decision-making
* Science and technology studies approaches to computational science work
* Infrastructure to facilitate industry/academic cooperation in computational social science
* Computational social science research in industry, government, and philanthropy
* Practical problems in computational social science
Enquiries
For any questions regarding abstract submissions, please write to: ic2s2-2025(a)liu.se <mailto:ic2s2-2025@liu.se>
Please find attached a call for tutorials, which will be held on July 21, the day before the conference.
We look forward to receiving your submissions.
The organizers,
Marc Keuschnigg, Peter Hedström, Sonia Yeh, Nina Tahmasebi, Yuan Liao, and Martin Arvidsson
Nina N. Tahmasebi, Associate Professor
Change is Key! • University of Gothenburg
nina.tahmasebi(a)gu.se
https://changeiskey.org/ https://languagechange.org/ http://tahmasebi.se/ https://gu-se.zoom.us/my/ninatahmasebi
Dear Colleagues,
We're delighted to announce that registrations for the 22nd Australasian Language Technology Association Workshop - #ALTA2024 - are now open via Humanitix<https://events.humanitix.com/alta-2024?hxchl=mkt-sch>.
ALTA, aligned with the Association for Computational Linguistics (ACL), is the premier venue for research on natural language processing (NLP), information retrieval and extraction and related topics in Australasia. This year, we're delighted to host outstanding keynotes and panellists including Professor Eduard Hovy of the University of Melbourne, Professor Steven Bird of Charles Darwin University and Professor Hannah Suominen of the Australian National University, with others to be announced.
Our accepted papers this year were of a very high calibre and oral presentations will be delivered for long and short papers and for abstracts.
ALTA 2024 will take place 2nd-4th December in the award-winning Birch building at Australian National University's Acton campus, Canberra, Australian Capital Territory.
Website: https://alta2024.alta.asn.au
Schedule to be announced shortly.
With kind regards, on behalf of the ALTA 2024 Team:
Dr Gabriela Ferraro, General Chair
Professor Tim Baldwin, Program Chair
Dr Sergio José Rodríguez Méndez, Program Chair
Dr Nicholas Kuo, Program Chair
Dr Anton Malko, Publication Chair
Dr Dawei Chen, Technology Chair
A/Prof Shunichi Ishihara, Finance Chair
Charbel El-Khaissi, PhD candidate, Sponsorship Chair
Ned Cooper, PhD candidate, Local Chair
Kathy Reid, PhD candidate, Publicity Chair
Call for Papers
Slav-NLP: The 10th Workshop on NLP for Slavic languages
<http://bsnlp.cs.helsinki.fi/>
co-located with ACL 2025, Vienna, Austria
31 July or 1 August 2025
http://bsnlp.cs.helsinki.fi/ <http://bsnlp.cs.helsinki.fi/>
Submission Deadline: 27 April 2025
WORKSHOP DESCRIPTION
The 10th edition of the Slav-NLP Workshop at ACL 2025
Sponsored by SIGSLAV: The ACL Special Interest Group on Slavic NLP
Slavic languages play an important role due to their diverse cultural
heritage and wide use — over 400M speakers worldwide. Current political
and economic developments in Central/Eastern Europe thrust the
Slavic-speaking societies — and their languages — into sharp focus,
especially in light of rapid technological advancements and expanding
consumer markets.
Research on theoretical and applied topics in the context of Slavic
languages is still lagging in the community. Linguistic phenomena that
are common to the Slavic languages — rich morphology, free word order,
etc. — make NLP for these languages a challenging task. The Slav-NLP
Workshop gathers researchers from academia and industry. It aims to
stimulate research in Slavic NLP, and foster the creation of tools and
resources. The Workshop provides a forum for exchange of ideas and
experience, discussing current challenges, and making the available
resources widely known. The structural similarity, as well as the easily
recognizable core vocabulary and inflectional inventory spanning this
large language group, creates a special environment, where researchers
can appreciate the shared problems and communicate naturally, despite
the lack of mutual intelligibility. We are glad to have an opportunity to
organize Slav-NLP again in Central Europe.
This Workshop addresses Natural Language Processing (NLP) for the Slavic
languages. The NLP tasks in urgent need of attention include:
* language modeling,
* morphological, syntactic and semantic analysis,
* lexical semantics,
* named-entity recognition,
* text normalization and processing non-standard language,
* coreference resolution,
* information extraction,
* question answering,
* text summarization,
* machine translation,
* development of linguistic resources,
* development and assessment of large language models,
* text classification,
* text generation,
* disinformation detection,
* fact verification,
* sentiment analysis.
This Workshop continues the proud tradition established by the 9
previous (B)SNLP Workshops.
IMPORTANT DATES
* Submission deadline: 27 April 2025
* Pre-reviewed ARR commitment: 20 May 2025
* Notification of acceptance: 27 May 2025
* Camera-ready papers due: 3 June 2025
* Workshop: 31 July or 1 August 2025
SHARED TASK
This year's Slav-NLP features a Shared Task on Detection and
Classification of Persuasion Techniques in Slavic languages in two types
of texts: (a) parliamentary debates on highly-contested topics, and (b)
social media posts related to the spread of disinformation.
Information about the Shared Task is available on the Workshop’s Web page.
SUBMISSION
At the Workshop’s Web page: bsnlp.cs.helsinki.fi
<http://bsnlp.cs.helsinki.fi/call-for-papers.html>
Workshop contact: bsnlp(a)cs.helsinki.fi
--
Roman Yangarber
Professor, University of Helsinki, Finland
Digital Humanities
INEQ: Helsinki Inequality Initiative
<https://helsinki.fi/en/ineq-helsinki-inequality-initiative> —
Linguistic Inequalities and Translation Technologies
------------------------------------------------------------------------
e-Learning & language learning
Language Learning Lab
Unioninkatu 40, Metsätalo A214
helsinki.fi/revita <https://www.helsinki.fi/revita>
helsinki.fi/language-learning-lab
<https://www.helsinki.fi/language-learning-lab>
mobile: +358 50 41 51 71 3
Utrecht University, The Netherlands
In NLP, there is a growing recognition that data quality is key to better language models, yet we still know very little about the link between data and model behavior. In this project, we will develop methods to measure the diversity of NLP datasets, assess the impact of diversity on NLP models, and improve data collection and model training.
As a PhD student, you will develop innovative methods to measure the diversity of NLP datasets. A major focus will be on measuring dataset diversity from a sociolinguistic perspective, considering language variation (such as styles and dialects) and combining (socio)linguistic insights with neural language modeling. You will also draw from relevant disciplines, particularly the social sciences, that have developed measurement approaches for diversity. Furthermore, you will carry out experiments to assess the impact of data diversity on NLP models, with a focus on fairness and robustness, and investigate ways to leverage data diversity to improve NLP models.
You will join the NLP & Society Lab, headed by Dong Nguyen, where we work on a variety of topics, including computational sociolinguistics, analysis of online conversations, data-centered NLP, and evaluation of NLP models. We are part of the wider NLP group within the Department of Information and Computing Sciences of Utrecht University (UU), the Netherlands.
For more details and to apply, visit the link below:
https://www.uu.nl/en/organisation/working-at-utrecht-university/jobs/phd-po… (Deadline: Jan 5)
Contact: Dong Nguyen (d.p.nguyen(a)uu.nl)
ICLC-11
11TH INTERNATIONAL CONTRASTIVE LINGUISTICS CONFERENCE
Second Call for Abstracts
September 17–19, 2025
Prague, Czech Republic
The Faculty of Arts at Charles University in Prague is pleased to announce the 11th International Contrastive Linguistics Conference. The ICLC conference series, running since 1998, aims to promote fine-grained cross-linguistic research comprising two or more languages from a broad range of theoretical and methodological perspectives. Following the success of ICLC-10 in Mannheim 2023, ICLC-11 wants to bring together researchers from different linguistic subfields and neighbouring disciplines to continue the interdisciplinary dialog on comparing languages, to foster the development of an international community and to advance possible new areas of cross-linguistic research. See https://iclc11.ff.cuni.cz/ for more and note the submission deadline of February 24, 2025.
We invite abstracts on a broad range of topics, including but not limited to:
(1) Comparison of phenomena in two or more languages focused on any area and level of linguistic analysis:
* lexicon
* phonetics and phonology
* morphology, syntax and morphosyntax, linguistic complexity
* semantics, pragmatics, register and socio-cultural context
(2) Methodological challenges and solutions in cross-linguistic research:
* language corpora (multilingual, learner, and multimodal) and issues of linguistic annotation (e.g., Universal Dependencies)
* comparability issues, tertia comparationis, language universals; experimental and naturalistic interaction data
* AI and new digital tools in linguistic analysis
* low-resourced languages
(3) Contrastive linguistics in touch with related disciplines:
* generative, model-theoretic, functional or cognitive (e.g., constructional) approaches
* historical, sociolinguistic and variationist perspectives; registers, multimodality, pragmatics, interculturality; language contact; language policy
* cognitive and psycholinguistic approaches to bilingualism and multilingualism; language acquisition, language teaching and learning
* translation studies
The abstracts should present empirical research, well-defined research questions or hypotheses, details of the research approach and methods, theoretical insights, and (preliminary or expected) results. For details see https://iclc11.ff.cuni.cz/calls-and-circulars/call-for-papers/.
PRELIMINARY PROGRAM
* Parallel Oral Sessions
* Poster Sessions
* Keynote Speakers:
Sabine De Knop (Université Saint-Louis, Bruxelles, Belgium)
Volker Gast (Friedrich-Schiller-University, Jena, Germany)
Dan Zeman (Charles University, Prague)
* Panel Discussion
IMPORTANT DATES
24.02.2025: Deadline for abstract submission
26.05.2025: Notification of acceptance
02.06.2025: Registration opens
16.06.2025: Deadline for revised abstract submission
30.06.2025: Last day for early bird registration
01.09.2025: Online registration closes
16.09.2025: Arrival, Registration, Get-together
17–19.09.2025: Conference
ORGANIZING COMMITTEE
* Mirjam Fried (chair) 1)
* Viktor Elšík 1)
* Jana Kocková 2)
* Michal Křen 1)
* Olga Nádvorníková 1)
* Alexandr Rosen 1)
1) Charles University, Faculty of Arts
2) Czech Academy of Sciences, Institute of Slavonic Studies
PROGRAM COMMITTEE: tba
CONTACT INFORMATION
Website: https://iclc11.ff.cuni.cz/
Email: iclc11(a)ff.cuni.cz
MultiGEC-2025 shared task: test phase officially open
We invite you to participate in the shared task on Multilingual Grammatical Error Correction, MultiGEC-2025, covering 12 languages: Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish and Ukrainian.
The task has just entered its test phase, during which participants are invited to submit their system output to CodaLab: https://codalab.lisn.upsaclay.fr/competitions/20500 . The system submission deadline is November 20.
The results will be presented on March 5th, 2025, at the NLP4CALL workshop, co-located with the NoDaLiDa conference to be held in Tallinn, Estonia, on 2-5 March 2025. https://spraakbanken.gu.se/en/research/themes/icall/nlp4call-workshop-serie…
The publication venue for system descriptions will be the proceedings of the NLP4CALL workshop, also co-published in the ACL Anthology.
Official system evaluation will be carried out on CodaLab (link comes later).
To register for/express interest in the shared task, please fill in this form (https://forms.gle/nTPfARVqy1XmqT4t6).
Note that you will be prompted to sign Terms of Use for the data at https://forms.gle/VLJ18WbwsxitEBYi7. Data access is personal, so please do not forget to fill in the form.
The GitHub page for the shared task is https://github.com/spraakbanken/multigec-2025/; task description and general information are also available on https://spraakbanken.gu.se/en/compsla/multigec-2025
* TASK DESCRIPTION
In this shared task, your goal is to rewrite learner-written texts to make them grammatically correct or both grammatically correct and idiomatic, that is, either adhering to the "minimal correction" principle or applying fluency edits.
For instance, the text
> My mother became very sad, no food. But my sister better five months later.
can be corrected minimally as
> My mother became very sad, and ate no food. But my sister felt better five months later.
or with fluency edits as
> My mother was very distressed and refused to eat. Luckily, my sister recovered five months later.
For fair evaluation of both approaches to the correction task, we will provide two evaluation metrics, one favoring minimal correction, one suited for fluency-edited output (read more under Evaluation).
We particularly encourage development of multilingual systems that can process all (or several) languages using a single model, but this is not a mandatory requirement to participate in the task.
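As a rough illustration of the two correction styles, the sketch below counts how many whitespace-separated tokens change between the learner text and each of the two corrections quoted in the example. It relies only on Python's standard difflib module. The helper name changed_tokens and the whitespace tokenization are assumptions made for this illustration; they are not part of the shared task's tooling or its evaluation metrics.

```python
from difflib import SequenceMatcher


def changed_tokens(source: str, corrected: str) -> int:
    """Count tokens touched by insert, delete, or replace operations.

    Hypothetical helper for illustration only: tokens are split on
    whitespace, which is cruder than a real tokenizer.
    """
    src, cor = source.split(), corrected.split()
    matcher = SequenceMatcher(a=src, b=cor, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Size of a changed span: the longer of its two sides.
            total += max(i2 - i1, j2 - j1)
    return total


learner = ("My mother became very sad, no food. "
           "But my sister better five months later.")
minimal = ("My mother became very sad, and ate no food. "
           "But my sister felt better five months later.")
fluency = ("My mother was very distressed and refused to eat. "
           "Luckily, my sister recovered five months later.")

minimal_changes = changed_tokens(learner, minimal)
fluency_changes = changed_tokens(learner, fluency)
print(minimal_changes, fluency_changes)
```

The minimal correction touches fewer tokens than the fluency rewrite, which gives some intuition for why a single reference-based metric may not treat both styles fairly.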
* DATA
We provide training, development and test data for each of the languages. The training and development dataset splits are available through GitHub. Evaluation will be performed on a separate test set.
See website for more detailed information: https://github.com/spraakbanken/multigec-2025/
Note: The English data is expected a bit later.
* EVALUATION
During the shared task, evaluation will be based on cross-lingually applicable automatic metrics:
- reference-based:
- GLEU score
- Precision, Recall, F0.5 score
- reference-free: Scribendi score
After the shared task, we also plan on carrying out a human evaluation experiment on a subset of the submitted results.
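For readers unfamiliar with the F0.5 score listed above, here is a minimal sketch of how precision, recall and F-beta are conventionally computed from counts of correct, spurious and missed edits. The function name precision_recall_fbeta, the example counts, and the choice to return 0.0 for undefined ratios are assumptions made for illustration; the official scorer may extract edits and handle edge cases differently.

```python
def precision_recall_fbeta(tp: int, fp: int, fn: int, beta: float = 0.5):
    """Return (precision, recall, F-beta) from edit counts.

    tp: system edits that match a reference edit
    fp: system edits with no matching reference edit
    fn: reference edits the system missed
    Undefined ratios are reported as 0.0 (an assumption of this sketch).
    """
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


# Made-up counts: 6 correct edits, 2 spurious edits, 6 missed edits.
p, r, f = precision_recall_fbeta(tp=6, fp=2, fn=6)
print(f"P={p:.3f} R={r:.3f} F0.5={f:.3f}")
```

With beta set to 0.5, precision is weighted more heavily than recall, so a system that proposes few but reliable corrections scores higher than one that proposes many uncertain ones.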
* TIMELINE
- June 18, 2024 - first call for participation ✓
- September 20, 2024 - second call for participation ✓
- October 20, 2024 - third call for participation. Training and validation data released ✓
- October 31, 2024 - reminder. CodaLab opens for team registrations, validation phase starts ✓
- November 13, 2024 - test phase starts ✓
- November 20, 2024 - system submission deadline (system output)
- November 29, 2024 - results announced
- December 16, 2024 - paper submission deadline with system descriptions
- January 20, 2025 - paper reviews sent to the authors
- February 3, 2025 - camera-ready deadline
- March 5, 2025 - presentations of the systems at the NLP4CALL workshop
* PUBLICATION
We encourage you to submit a paper with your system description to the NLP4CALL workshop special track. We follow the same requirements for paper submissions as the NLP4CALL workshop, i.e. we use the same template and apply the same page limit. All papers will be reviewed by the organizing committee. Upon paper publication, we encourage you to share models, code, fact sheets, extra data, etc. with the community through GitHub or other repositories.
* ORGANIZERS
- Arianna Masciolini, University of Gothenburg, Sweden
- Andrew Caines, University of Cambridge, UK
- Orphée De Clercq, Ghent University, Belgium
- Joni Kruijsbergen, Ghent University, Belgium
- Murathan Kurfali, Stockholm University, Sweden
- Ricardo Muñoz Sánchez, University of Gothenburg, Sweden
- Elena Volodina, University of Gothenburg, Sweden
- Robert Östling, Stockholm University, Sweden
* DATA PROVIDERS
- Czech:
-- Alexandr Rosen, Charles University, Prague
- English:
-- Andrew Caines, University of Cambridge
- Estonian:
-- Mark Fishel, University of Tartu, Estonia
-- Kais Allkivi-Metsoja, Tallinn University, Estonia
-- Kristjan Suluste, Eesti Keele Instituut, Estonia
- German:
-- Andrea Horbach, IPN / CAU Kiel, Germany
-- Josef Ruppenhofer, FernUniversität in Hagen, Germany
-- Katrin Wisniewski, Universität Leipzig
-- Torsten Zesch, FernUniversität in Hagen, Germany
- Greek:
-- Alex Tantos, Aristotle University of Thessaloniki
-- Konstantinos Tsiotskas, Aristotle University of Thessaloniki
-- Vassilis Varsamopoulos, Aristotle University of Thessaloniki
-- Pinelopi Kikilintza, Aristotle University of Thessaloniki
-- Elena Drakonaki, Aristotle University of Thessaloniki
-- Eleni Tsourilla, Aristotle University of Thessaloniki
-- Despoina-Ourania Touriki, Aristotle University of Thessaloniki
- Icelandic:
-- Isidora Glisič, University of Iceland
- Italian:
-- Jennifer-Carmen Frey, Eurac Research Bolzano, Italy
-- Lionel Nicolas, Eurac Research Bolzano, Italy
- Latvian:
-- Roberts Darģis, University of Latvia
-- Ilze Auzina, University of Latvia
- Russian:
-- Alla Rozovskaya, City University of New York (CUNY), USA
- Slovene:
-- Špela Arhar Holdt, University of Ljubljana, Slovenia
-- Aleš Žagar, University of Ljubljana, Slovenia
- Swedish:
-- Arianna Masciolini, University of Gothenburg, Sweden
- Ukrainian:
-- Oleksiy Syvokon, Microsoft
-- Mariana Romanyshyn, Grammarly
* CONTACT
Please join the MultiGEC-2025 Google group (https://groups.google.com/g/multigec-2025) in order to ask questions, hold discussions and browse for already answered questions.