CoNLL 2026: 1st Call for Papers
San Diego, California, United States, July 3-4, 2026 (co-located with ACL)
https://www.conll.org/
SIGNLL invites submissions to the 30th Conference on Computational Natural Language Learning (CoNLL 2026). The focus of CoNLL is on theoretically, cognitively and scientifically motivated approaches to computational linguistics. We welcome work targeting any aspect of language and its computational modeling, including:
Computational Psycholinguistics, Cognition and Linguistics
Computational Usage-Based Grammars (e.g., Construction Grammars)
Computational Social Science and Sociolinguistics
Interaction and Dialogue
Language Acquisition, Learning, Emergence, and Evolution
Multimodality and Grounding
Typology and Multilinguality
Speech and Phonology
Syntax and Morphology
Lexical, Compositional and Discourse Semantics
Theoretical Analysis and Interpretation of ML Models for NLP
Resources and Tools for Scientifically Motivated Research
We do not restrict the topic of submissions to fall into this list. However, the submissions’ relevance to the conference’s focus on theoretically, cognitively and scientifically motivated approaches will play an important role in the review process.
Submissions
CoNLL will accept only direct submissions this year. Submission will be via OpenReview. We accept two types of submission: archival, and non-archival.
Non-archival submissions are not anonymous. We will accept submissions that fit into CoNLL’s scope (see above for a description) and have been published in 2024, 2025, and 2026 in relevant conferences (*ACL, COLING, NeurIPS, ICLR, CogSci, …) and journals (TACL, Computational Linguistics, other journals in the areas of interest for CoNLL).
Archival submissions must be anonymous and use the same template as ACL 2026. Submitted papers may consist of up to 8 pages of content plus unlimited space for references. Authors of accepted papers will have an additional page to address reviewers’ comments in the camera-ready version (9 pages of content in total, excluding references). Optional anonymized supplementary materials and a PDF appendix are allowed. Please refer to the ACL website for more details on the submission format. Note that, unlike ACL, we do not mandate that papers include a section discussing the limitations of the work. However, we strongly encourage authors to include such a section in the appendix.
Timeline
(All deadlines are 11:59pm UTC-12h, AoE)
Submission deadline (archival and non-archival): February 19 2026
Notification of acceptance: April 21 2026
Camera-ready papers due: May 12 2026
Conference: July 3-4, 2026
Further information
Further information will be announced in the 2nd Call for Papers.
CoNLL 2026 Co-Chairs
Claire Bonial, DEVCOM U.S. Army Research Laboratory
Yevgeni Berzak, Technion
Contact
Questions? E-mail conll.chairs(a)gmail.com
*apologies for cross-postings*
Joint CODI-CRAC 2026 Workshop: Call for Papers
July 2026 - ACL 2026 - San Diego, USA
We are pleased to announce that we are organizing the second joint CODI-CRAC workshop, which will be held during ACL 2026! More information at:
https://sites.google.com/view/codi-crac2026/home
CODI-CRAC is officially endorsed by SIGDial, the ACL Special Interest Group on Discourse and Dialogue.
Deadline for CODI-CRAC papers: March 20, 2026
The workshop will also host the CRAC shared task. More information at:
- CRAC shared task: https://ufal.mff.cuni.cz/corefud/crac26
Aims and scope
Recent breakthroughs in NLP and Large Language Models have dramatically expanded our systems’ abilities to interpret and generate not just sentences, but whole documents and conversations. This shift has renewed interest in discourse-level challenges, driving new work on inter-sentential phenomena, coherence modeling, long-form summarization, discourse-aware representation learning, and large-scale resources for discourse understanding and parsing.
Discourse sits at the intersection of many NLP subfields, as it is where context, structure, and meaning come together beyond single sentences. Discourse shapes how we capture coherence, cohesion, and inference across long texts, and brings together researchers tackling the shared challenges of document structure, long-range dependencies, and the requirements of extended context.
In 2025, we organized the first joint CODI-CRAC workshop. The CODI workshop on Computational Approaches to Discourse has been a forum for a broad range of work at the discourse level. The CRAC workshop on Computational Models of Reference, Anaphora and Coreference has been a primary venue for researchers interested in the computational modeling of reference phenomena. Together, these workshops have catalyzed work to advance research on discourse-level problems and have served as a forum for discussing suitable datasets and reliable evaluation methods.
This joint edition corresponds to the 7th CODI workshop and the 9th CRAC workshop. It will welcome contributions from all the areas below, including state-of-the-art textual NLU and NLG work using LLMs, as well as classic structured work on automatic discourse analysis -- corresponding to challenging tasks such as coreference resolution or discourse parsing -- to encourage interaction between communities. The workshop is set to host the 5th edition of the CRAC shared task on Multilingual Coreference Resolution.
The workshop is planned as a 1-day event that brings together different subcommunities. It will feature regular papers and invited talks by Ruihong Huang (Texas A&M University) and Philippe Laban (Microsoft Research). We also accept papers accepted at other major conferences for non-archival presentation, including Findings papers.
Topics of interest
We welcome papers on symbolic and probabilistic approaches, corpus development and analysis, as well as machine and deep learning approaches to discourse. We appreciate theoretical contributions as well as practical applications, including demos of systems and tools. The goal of the workshop is to provide a forum for the community of NLP researchers working on all aspects of discourse.
Topics of interest include, but are not limited to:
- discourse structure
- discourse connectives
- discourse relations
- annotation tools and schemes for discourse phenomena
- corpora annotated with discourse phenomena
- discourse parsing
- cross-lingual discourse processing
- cross-domain discourse processing
- anaphora and coreference resolution
- event coreference
- argument mining
- coherence modeling
- discourse and semantics
- discourse in applications such as machine translation, summarization, etc.
- evaluation methodology for discourse processing
- discourse pretraining tasks
- long-text modeling and generation
Submissions
Double submission of papers is allowed, but it must be disclosed at submission time.
We solicit three categories of papers:
* (1) Regular workshop papers
* (2) Demos
* (3) Extended abstracts
Only regular workshop papers and demos will be included in the proceedings as archival publications. Extended abstracts are non-archival: they will be included in the workshop program and handbook, but will not appear in the workshop proceedings.
1- Regular papers must describe original unpublished research.
* Long papers may consist of up to 8 pages of content, plus unlimited pages for references.
* Short papers can be up to 4 pages, plus unlimited pages for references.
2- Demo submissions may describe systems, tools, visualizations, etc., and may consist of up to 4 pages, plus unlimited pages for references.
3- Extended abstracts can describe work in progress and may be up to two pages long (excluding references).
Each submission may include unlimited pages of appendices, but papers must remain fully self-contained: appendices are optional supplementary material, and reviewers are not required to review them.
Final versions of all types of papers will be given one additional page of content.
Papers accepted or rejected at one of the main conferences
We also invite presentations of papers accepted at other main conferences. They will be included in the workshop program and handbook, but will not appear in the workshop proceedings.
We also fast-track ARR papers with existing reviews.
Submission website
All submissions must be anonymous and follow the ACL 2026 formatting instructions described here: https://aclrollingreview.org/cfp
Submission website:
* CODI-CRAC: https://softconf.com/acl2026/codi-crac2026/
Schedule
Important dates for the workshop are listed below:
* CODI-CRAC papers due: March 20, 2026
* Pre-reviewed ARR fast-track (with reviews, can be accepted or rejected): April 5, 2026
* Notification of acceptance: April 28, 2026
* Grant application: May 5, 2026
* Camera-ready paper due: May 12, 2026
* Pre-recorded video due: June 4, 2026
* Workshop dates: July 3 or 4, 2026
All deadlines are 11:59 pm UTC-12h ("anywhere on Earth").
Invited Speakers
- Ruihong Huang, Texas A&M University
- Philippe Laban, Microsoft Research
Organizers
- Chloé Braud, CNRS-IRIT
- Christian Hardmeier, IT University of Copenhagen
- Chuyuan (Lisa) Li, University of British Columbia
- Jessy Li, University of Texas, Austin
- Sharid Loáiciga, University of Gothenburg
- Vincent Ng, University of Texas at Dallas
- Michal Novák, Charles University, Prague
- Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences
- Massimo Poesio, Queen Mary University of London and University of Utrecht
- Michael Strube, Heidelberg Institute for Theoretical Studies
- Amir Zeldes, Georgetown University, Washington DC
To contact the organizers, please send an email to: codi-crac-workshop(a)googlegroups.com
*Call for Papers*
*CHiPSAL 2026*
*Second Workshop on Challenges in Processing South Asian Languages @ LREC 2026*
*16 May 2026*
We are pleased to announce the *Second Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2026)*, to be held in *hybrid mode on 16 May 2026*, co-located with *LREC 2026*.
CHiPSAL 2026 invites *substantial, original, and unpublished research* on
all areas of natural language processing, language resources, and
evaluation—covering spoken, signed, and multimodal language—as well as
system demonstrations. We welcome long and short papers addressing
challenges, resources, tools, and innovations for *South Asian languages*.
Topics include, but are not limited to:
- Encoding and Unicode issues
- Orthographic complexities
- Morphology and generation
- Dialectal variation and standardisation
- Code-mixing and multilingualism
- Building linguistic resources
- Speech recognition and synthesis
- Technology for linguistic heritage preservation
- Benchmarking models
- Large language models for South Asian languages
------------------------------
*Important Dates (AoE)*
- Submission Deadline: *20 February 2026*
- Notification of Acceptance: *20 March 2026*
- Camera-ready Papers: *30 March 2026*
- Workshop (Hybrid): *16 May 2026*
------------------------------
*Submission Guidelines*
CHiPSAL 2026 accepts *oral*, *poster*, and *poster+demo* papers.
- Short papers: 4 pages
- Long papers: 8 pages
(Excluding ethics/limitations, references, acknowledgements, and
data/code availability statements)
All submissions must:
- Follow the *LREC 2026 stylesheet*: https://lrec2026.info/authors-kit/
- Be *fully anonymised* for double-blind review
- Include required ethics/limitations and data/code availability
statements
- Be self-contained (no appendices or supplementary files at submission)
- Be relevant to South Asian language processing
Papers must report *original, unpublished work*. Concurrent submissions
must be declared. Accepted papers will appear in the workshop proceedings.
Authors are encouraged to submit related resources to the *LRE Map*:
https://lremap.elra.info/
------------------------------
*Shared Tasks*
CHiPSAL 2026 also hosts two shared tasks:
1. *Multimodal Hate and Sentiment Understanding in Low-Resource Memes*
https://sites.google.com/view/chipsal/shared-tasks_1/shared-task-1
2. *Multilingual ASR for South Asian Languages*
https://sites.google.com/view/chipsal/shared-tasks_1/shared-task-2
We warmly encourage participation in these shared tasks.
------------------------------
*More Information*
Workshop website: https://sites.google.com/view/chipsal/
*Organising Committee*
- Kengatharaiyer Sarveswaran, University of Jaffna, Jaffna, Sri Lanka.
- Ashwini Vaidya, Indian Institute of Technology, Delhi, India.
- Bal Krishna Bal, Kathmandu University, Kathmandu, Nepal.
- Surendrabikram Thapa, Virginia Tech, USA.
- Tafseer Ahmed, Mohammad Ali Jinnah University, Karachi, Pakistan.
Do not miss the opportunity to submit your work, strengthen the South Asian
NLP community, and support the development of language technology in one of
the world’s most populous and linguistically diverse regions.
We look forward to your contributions.
Best regards,
*The CHiPSAL 2026 Organising Committee*
--
*Dr Kengatharaiyer Sarveswaran (Sarves)*
Senior Lecturer (Grade-I) in Computer Science
Department of Computer Science
Faculty of Science
University of Jaffna
Sri Lanka
sarves.github.io
International Conference
'LAnguage TEchnologies for Low-resource Languages' (LaTeLL '2026)
Fes, Morocco
30 September, 1 and 2 October 2026
www.latell.org/2026/
Second Call for Papers
The conference
Natural Language Processing (NLP) has witnessed remarkable progress in
recent years, largely driven by the emergence of deep learning
architectures and, more recently, large language models (LLMs).
Nevertheless, these advances have disproportionately benefited
high-resource languages that possess abundant data for model training.
By contrast, low-resource languages, which account for at least 85% of
the world's linguistic diversity and are often spoken by smaller or
marginalised communities, have not yet reaped the full benefits of
contemporary NLP technologies.
This imbalance can be attributed to several interrelated factors,
including the scarcity of high-quality training data, limited
computational and financial resources, and insufficient community
engagement in data collection and model development. Developing NLP
applications for low-resource languages poses major challenges,
particularly the need for large, well-annotated datasets, standardised
tools, and robust linguistic resources.
Although several workshops have previously addressed NLP for
low-resource languages, _LaTeLL_ represents the first international
conference dedicated specifically to the automatic processing of such
languages. The event aims to provide a forum for researchers to present
and discuss their latest work in NLP in general, and in the development
and evaluation of language models for low-resource languages in
particular.
Conference topics
We invite submissions on a broad range of themes concerning linguistic
and computational studies focusing on low-resource languages, including
but not limited to the following topics:
Language resources for low-resource languages
* Dataset creation and annotation
* Evaluation methodologies and benchmarks for low-resource settings
* Lexical resources, corpora, and linguistic databases
* Crowdsourcing and community-driven data collection
* Tools and frameworks for low-resource language processing
Core language technologies for low-resource languages
* Language modelling and pre-training for low-resource languages
* Speech recognition, text-to-speech, and spoken language
understanding
* Phonology, morphology, word segmentation, and tokenisation
* Syntax: tagging, chunking, and parsing
* Semantics: lexical and sentence-level representation
NLP Applications for low-resource languages
* Information extraction and named entity recognition
* Question answering systems
* Dialogue and interactive systems
* Summarisation
* Machine translation
* Sentiment analysis, stylistic analysis, and argument mining
* Content moderation
* Information retrieval and text mining
Multimodality and Grounding for low-resource languages
* Vision and language for low-resource contexts
* Speech and text multimodal systems
* Low-resource sign language processing
Ethics, Equity, and Social Impact for low-resource languages
* Bias and fairness in low-resource language technologies
* Sociolinguistic considerations in technology development
* Cultural appropriateness and sensitivity
Human-Centred Approaches in low-resource languages
* Usability and accessibility of low-resource language technologies
* Educational applications and language learning
* Community needs assessment and technology adoption
* User experience research in low-resource contexts
Multilinguality and Cross-Lingual Methods for low-resource languages
* Multilingual language models and their adaptation
* Code-switching and code-mixing
* Cross-lingual transfer learning in low-resource languages.
Special Theme Track 1 -- Building Applications Based on Large Language
Models for Low-Resource Languages
_LaTeLL'2026_ will feature a Special Theme Track dedicated to the
development of applications based on Large Language Models (LLMs) for
low-resource languages.
This track aims to explore innovative methodologies, architectures, and
tools that leverage the power of LLMs to enhance linguistic processing,
accessibility, and inclusivity for underrepresented languages.
Contributions are encouraged on topics such as model adaptation and
fine-tuning, multilingual and cross-lingual transfer, ethical and
fairness considerations, and the creation of datasets and benchmarks
that facilitate the integration of LLM-based solutions in low-resource
settings.
Special Theme Track 2 -- Modern Standard Arabic (MSA) and Arabic
Dialects
This special track addresses the unique challenges and opportunities in
processing Modern Standard Arabic (MSA) and the rich landscape of Arabic
dialects. The diglossic nature of Arabic, where the formal MSA coexists
with numerous, widely used spoken dialects, presents a significant
hurdle for NLP. While MSA is relatively well-resourced, Arabic dialects
are quintessential examples of low-resource languages, often lacking
standardised orthographies, annotated corpora, and dedicated processing
tools. This track invites submissions on novel research and resources
aimed at bridging this gap and advancing the state of the art in Arabic
language technology. Topics of interest include, but are not limited to:
* Dialect identification and classification
* Creation of corpora and lexical resources for Arabic dialects
* Machine translation between MSA and dialects, and across different
dialects
* Speech recognition and synthesis for dialectal Arabic
* Computational modelling of morphology, syntax, and semantics for
dialects
* NLP applications (e.g., sentiment analysis, NER) for dialectal
user-generated content
* Code-switching between Arabic dialects, MSA, and other languages
Submissions and Publication
_LaTeLL'2026_ welcomes high-quality submissions in English, which may
take one of the following two forms:
* Regular (long) papers: Up to eight (8) pages in length, presenting
substantial, original, completed, and unpublished research.
* Short (poster) papers: Up to four (4) pages in length, suitable for
concise or focused contributions, ongoing research, negative results,
system demonstrations, and similar work. Short papers will be presented
during a dedicated poster session.
The conference will not consider submissions consisting of abstracts
only.
All accepted papers (both long and short) will be published as
electronic proceedings (with ISBN) and made available on the conference
website at the time of the event. The organisers intend to submit the
proceedings for inclusion in the ACL Anthology.
Authors of papers receiving exceptionally positive reviews will be
invited to prepare extended and substantially revised versions for
submission to a leading journal in the field of Natural Language
Processing (NLP).
Further details regarding the submission process will be provided in the
follow-up Calls for Papers.
The conference will also feature a Student Workshop, and awards will be
presented to the authors of outstanding papers.
Important dates
* Submissions due: 1 May 2026
* Reviewing process: 20 May - 20 June 2026
* Notification of acceptance: 25 June 2026
* Camera-ready due: 10 July 2026
* Conference proceedings ready: 10 July 2026
* Conference: 30 September, 1 October and 2 October 2026
Organisation
Conference Chair
Ruslan Mitkov (Lancaster University and University of Alicante)
Programme Committee Chairs
Saad Ezzini (King Fahd University of Petroleum & Minerals)
Salima Lamsiyah (University of Luxembourg)
Tharindu Ranasinghe (Lancaster University)
Organising Committee
Maram Alharbi (Lancaster University)
Salmane Chafik (Mohammed VI Polytechnic University)
Ernesto Estevanell (University of Alicante)
Further information and contact details
The follow-up calls will provide more details on the conference venue
and list keynote speakers and members of the programme committee once
confirmed.
The conference website is www.latell.org/2026/ and will be updated
on a regular basis. For further information, please email
2026(a)latell.org
Registration will open in March 2026.
--
Amal Haddad Haddad (She/her)
Facultad de Traducción e Interpretación
Universidad de Granada |https://www.ugr.es/personal/amal-haddad-haddad
Lexicon Research Group |http://lexicon.ugr.es/haddad
Co-Convenor, BAAL SIG 'Humans, Machines,
Language'|https://r.jyu.fi/humala
Event Coordinator, BAAL SIG 'Language, Learning and Teaching'
There are two open PhD positions in Natural Language Processing available at the Institute for Computer Science at Leipzig University, in the group of Leonie Weissweiler.
Potential research topics include, but are not limited to:
- Linguistic Interpretability
- Multilingual Evaluation
- Computational Typology
Positions are fully funded for at least three years and will be affiliated with the ScaDS.AI graduate school. Ideal PhD candidates have a master's degree in computational linguistics, computer science or a related discipline.
Positions: Full-time (TV-L E13) for 3 years
Preferred start date: 1st of April 2026
More information: https://leonieweissweiler.github.io/phd_leipzig.pdf
Application deadline: 15th of January 2026
-----------------------------------------------------------------------------
Call for submissions
1st International Workshop on Quality in Large Language Models and
Knowledge Graphs
In conjunction with EDBT/ICDT 2026
QuaLLM-KG @ EDBT/ICDT 2026
24 March 2026, Tampere, Finland
Website: https://quallmkg2026.github.io/
-----------------------------------------------------------------------------
**** Goal ****
QuaLLM-KG aims to bring together researchers and practitioners working
on quality issues at the intersection of large language models and
knowledge graphs. The workshop focuses on theories, methods, and
applications for assessing, improving, and monitoring the quality of
LLMs and KGs.
**** Important Dates ****
- Submission deadline: January 15th, 2026
- Notification: February 8th, 2026
- Camera-ready: February 20th, 2026
**** Topics ****
* Quality in Knowledge Graphs
- Accuracy, consistency, completeness, freshness
- Schema validation, constraint checking, error detection
- Entity resolution, link prediction, ontology alignment
- Provenance, explainability, trust in KG data
- KG quality in dynamic and large-scale settings
* Quality in Large Language Models
- Hallucination reduction & factual grounding
- Bias detection and mitigation
- Metrics & benchmarks for quality assessment
- Uncertainty estimation, calibration, interpretability
* Synergies Between KGs and LLMs
- KG-based grounding and fact-checking for LLMs
- LLM-based KG enrichment, extraction, entity linking
- Quality-driven prompting and fine-tuning
- Hybrid KG–LLM architectures for quality assurance
- Evaluation frameworks for integration and consistency
* Benchmarks and Evaluation Frameworks
- Datasets and metrics for KG & LLM quality
- Tools for monitoring, validation, maintenance
- Reproducibility, transparency, responsible AI
* Applications and Case Studies
- Scientific, industrial, enterprise use cases
- Quality at scale
- Human-in-the-loop quality control
**** Submissions ****
We invite submissions of full papers (up to 8 pages, excluding
references) and short papers (up to 4 pages, excluding references)
describing work in progress, systems, demos, applications, or
vision/innovative ideas.
Submissions should be in the CEUR-WS proceedings template.
Accepted papers will be published in the CEUR Workshop proceedings
(CEUR-WS.org).
**** Workshop Organizers ****
- Soror Sahri, Université Paris Cité, France
- Sven Groppe, University of Lübeck, Germany
- Farah Benamara, University of Toulouse, France & IPAL-CNRS, Singapore
--
========================
Farah Benamara Zitoune
Professor in Computer Science, Université de Toulouse
IRIT and IPAL-CNRS Singapore
118 Route de Narbonne, 31062, Toulouse.
Tel : +33 5 61 55 77 06
http://www.irit.fr/~Farah.Benamara
==================================
The next meeting of the Edge Hill Corpus Research Group will take place online (MS Teams) on Friday 19 December 2025, 2:00-3:30 pm (GMT).
Topic: Philosophies of Language and Corpus Linguistics
Speaker: Alan Partington (SiBol Group / CoLiTec)
Title: Language Distrusted, Language Ignored, Language Recovered: From Plato to Corpus Linguistics and Beyond
The abstract and registration link are here: https://sites.edgehill.ac.uk/crg/next
Attendance is free. Registration closes on Wednesday 17 December.
If you have problems registering, or have any questions, please email the organiser, Costas Gabrielatos (gabrielc(a)edgehill.ac.uk).
CFP: LT4HALA 2026 - The Fourth Workshop on Language Technologies for Historical and Ancient Languages
Website: https://circse.github.io/LT4HALA/2026/
Date: Monday, May 11 2026
Place: co-located with LREC 2026, May 11-16, Palma, Mallorca (Spain)
Submission page: TBA
DESCRIPTION
LT4HALA 2026 is a one-day workshop that seeks to bring together scholars who are developing and/or using Language Technologies (LTs) for historically attested languages, so as to foster cross-fertilization between the Computational Linguistics community and the areas of the Humanities dealing with historical linguistic data, e.g. historians, philologists, linguists, archaeologists and literary scholars. LT4HALA 2026 follows LT4HALA 2020, 2022, and 2024, which were organized in the context of LREC 2020, LREC 2022 and LREC-COLING 2024, respectively. Despite the current availability of large collections of digitized texts written in historical languages, such interdisciplinary collaboration is still hampered by the limited availability of annotated linguistic resources for most historical languages. Creating such resources is both a challenge and an obligation for LTs: to support historical linguistic research with the most up-to-date technologies and to preserve the precious linguistic data that survive from past times.
Relevant topics for the workshop include, but are not limited to:
* creation and annotation of linguistic resources (both lexical and textual);
* role of digital infrastructures, such as CLARIN, in supporting research based on language resources for historical and ancient languages;
* handling spelling variation;
* detection and correction of OCR errors;
* deciphering;
* morphological/syntactic/semantic analysis of textual data;
* adaptation of tools to address diachronic/diatopic/diastratic variation in texts;
* teaching ancient languages with LTs;
* NLP-driven theoretical studies in historical linguistics;
* NLP-driven analysis of literary ancient texts;
* evaluation of LTs designed for historical and ancient languages;
* LLMs for the automatic analysis of ancient texts.
SHARED TASKS
LT4HALA 2026 will also host:
* the 4th edition of EvaLatin (https://circse.github.io/LT4HALA/2026/EvaLatin), a campaign entirely devoted to the evaluation of NLP tools for Latin. This new edition will focus on two tasks: dependency parsing and Named Entity Recognition. Dependency parsing will be based on the Universal Dependencies framework.
* the 5th edition of EvaHan (https://circse.github.io/LT4HALA/2026/EvaHan), the campaign for the evaluation of NLP tools for Ancient Chinese. EvaHan 2026 will focus on Ancient Chinese OCR (Optical Character Recognition) evaluation.
* the 2nd edition of EvaCun (https://circse.github.io/LT4HALA/2026/EvaCun), the campaign for the evaluation of NLP tools for ancient cuneiform languages, with shared tasks on transliteration normalization, morphological analysis and lemmatization, and Named Entity Recognition for Akkadian and/or Sumerian.
SUBMISSIONS
Submissions should be 4 to 8 pages in length and follow the LREC 2026 stylesheet (see below). The maximum number of pages excludes potential Ethics Statements and discussion on Limitations, acknowledgements and references, as well as data and code availability statements. Appendices or supplementary material are not permitted during the initial submission phase, as papers should be self-contained and reviewable on their own.
Papers must be of original, previously unpublished work. Papers must be anonymized to support double-blind reviewing. Submissions thus must not include authors’ names and affiliations. The submissions should also avoid links to non-anonymized repositories: the code should be either submitted as supplementary material in the final version of the paper, or as a link to an anonymized repository (e.g., Anonymous GitHub or Anonym Share). Papers that do not conform to these requirements will be rejected without review.
Submissions should follow the LREC stylesheet, which is available on the LREC 2026 website on the Author’s kit page (https://lrec2026.info/authors-kit/).
Each paper will be reviewed by three independent reviewers.
Accepted papers will appear in the workshop proceedings; oral and poster papers are published in the same format. The presentation format (oral vs. poster) will be determined solely by an assessment of the optimal method of communication (more or less interactive), given the paper content.
As for the shared tasks, participants will be required to submit a technical report for each task (including all related sub-tasks) they took part in. Technical reports will be included in the proceedings as short papers: the maximum length is 4 pages (excluding references), and they should follow the LREC 2026 official format. Reports will receive a light review (we will check the correctness of the format, the exactness of results and rankings, and the overall exposition). All participants will have the opportunity to present their results at the workshop. Shared-task reports are not anonymous.
WORKSHOP IMPORTANT DATES
17 February 2026: submissions due
13 March 2026: reviews due
16 March 2026: notifications to authors
27 March 2026: camera-ready due
Shared tasks deadlines are available in the specific web pages: EvaLatin<https://circse.github.io/LT4HALA/2026/EvaLatin>, EvaHan<https://circse.github.io/LT4HALA/2026/EvaHan>, EvaCun<https://circse.github.io/LT4HALA/2026/EvaCun>.
Identify, Describe and Share your LRs!
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones).
Dear colleagues,
We are writing to invite your collaboration in a community-driven initiative
to develop annotation schemas for scientific process descriptions in
research articles. The effort is inspired by the spirit of schema.org
(https://schema.org/), but focuses specifically on capturing experimental
and simulation workflows across scientific domains. The resulting schemas
will be openly published as templates in the Open Research Knowledge Graph
(ORKG, https://orkg.org/) and will form the basis of a paper planned for
Nature Scientific Data (https://www.nature.com/sdata/).
Motivation
Scientific papers describe complex processes (e.g., ALD and CVD in materials
science, PCR and CRISPR in molecular biology, tensile and fatigue testing in
engineering, leaching experiments in environmental science, RCTs and
cognitive tasks in psychology) using highly variable narrative text. This
variability makes it difficult to:
* design consistent, interoperable annotation guidelines,
* build cross-domain corpora of scientific methods,
* compare and align experimental setups across papers, and
* create FAIR, reusable metadata about how studies are actually
carried out.
Our goal is to define annotation schemas for these processes (inputs,
conditions, outputs, roles, and relations) and to populate them from
full-text articles. These schemas and resulting corpora are intended as
shared resources for corpus linguistics, NLP, scientific text mining, and
downstream applications.
Why Collaborate
We are seeking contributors who can:
* provide collections of full-text articles (~50+) describing a
specific experimental or simulation process in their field,
* offer expert feedback on automatically mined process schemas, or
* run the schema-miner workflow themselves (with our support) and help
refine the resulting schema.
Individual or small-team participation is welcome, and co-authorship
opportunities are available depending on involvement.
A wide variety of processes can be included: thin-film deposition, synthetic
chemistry reactions, gene editing workflows, fatigue testing, soil leaching
experiments, drug dissolution assays, fMRI tasks, cognitive experiments, and
many more.
A broader (non-exhaustive) list is available here:
https://docs.google.com/document/d/1iyL1l9vCXhnQ0To7j79vlr-pW4JvPlQC95svygqRDfg/edit
How to Participate
Please register your interest using this short form:
https://forms.gle/9WEdouw4yMyNHcn19
We will notify selected contributors by January 31, 2026. Data collection
and schema mining will conclude by April 30, 2026, followed by manuscript
preparation.
We hope members of this community will consider contributing to this effort
to develop shared annotation schemas and corpora of scientific process
descriptions: a step toward more comparable, analyzable, and reusable
scientific text resources. Please also help us spread the word!
Best regards,
Jennifer D'Souza
TIB - Leibniz Information Centre for Science and Technology
(on behalf of the schema-miner coordination team)
[apologies for cross posting]
DeTermIt! Workshop @ LREC 2026
Second Workshop on Evaluating Text Difficulty in a Multilingual Context
Location: Palau de Congressos de Palma, Palma de Mallorca (Spain)
#####################
First Call for Papers
Schedule
- Paper submissions: 23 February 2026
- Notification of acceptance: 13 March 2026
- Camera-ready due: 30 March 2026
- Workshop: one of 11, 12, or 16 May 2026 (half-day)
All deadlines are 11:59PM UTC-12:00 AoE (“Anywhere on Earth”)
For more information, please visit:
Website: https://determit2026.dei.unipd.it/
#####################
In today’s interconnected world, where information dissemination knows no linguistic bounds, it is crucial to ensure that knowledge is accessible to diverse audiences, regardless of language proficiency and domain expertise. Automatic Text Simplification (ATS) and text difficulty assessment are central to this goal, especially in the age of Large Language Models (LLMs) and Generative AI (GenAI), which increasingly mediate access to information.
The second edition of the DeTermIt! workshop focuses on the evaluation and modeling of text difficulty in multilingual, terminology-rich contexts, with a particular emphasis on the interaction between:
- text simplification,
- terminology and conceptual complexity, and
- LLM/GenAI-based generation and rewriting.
The 2026 edition builds on the first DeTermIt! workshop held at LREC-COLING 2024 (https://determit2024.dei.unipd.it/), as well as related initiatives such as the CLEF SimpleText track (https://simpletext-project.com/), which provides reusable data and benchmarks for scientific text summarization and simplification. DeTermIt! 2026 aims to bring together researchers and practitioners interested in terminology-aware simplification, lexical and conceptual difficulty, and evaluation protocols for GenAI systems.
We welcome contributions that address theoretical, methodological, and applied aspects of text difficulty, including resource creation and evaluation (e.g., corpora, datasets, and benchmarks), with a focus on how linguistic complexity, specialized terminology, and domain knowledge interact with human understanding. In particular, we encourage work that explores how LLMs and GenAI can be evaluated, constrained, or guided to produce readable, faithful, and accessible texts.
#####################
Topics of Interest
#####################
We invite submissions on (but not limited to) the following themes:
1. Theoretical and Modeling Perspectives
- Cognitive and linguistic models of text and lexical complexity.
- Multilingual readability and text difficulty prediction.
- Modeling conceptual difficulty and domain-specific terminology.
- Theoretical connections between lexicography, terminology, and text simplification.
2. Terminology and Conceptual Complexity
- Identification and classification of specialized terms and concepts.
- Estimation of term difficulty for lay readers and second language learners.
- Use of terminological databases, ontologies, and knowledge graphs in simplification pipelines.
- Methods for adapting domain-specific terminology for accessible communication (e.g., in medicine, law, technology).
3. Generative and Explainable AI for Text Simplification
- LLM- and GenAI-based approaches to text simplification and paraphrasing.
- Terminology-Augmented Generation (TAG) and term-preserving simplification.
- Evaluation of GenAI outputs: readability, factuality, terminology fidelity, and hallucination analysis.
- Readability-controlled or difficulty-controlled generation; controllable simplification.
- Human-centered and explainable approaches to text accessibility in GenAI systems.
4. Resources, Benchmarks, and Evaluation Frameworks
- Corpora, annotation schemes, and benchmarks for text difficulty and simplification.
- Datasets and methods for evaluating terminology-aware simplification and explanation.
- FAIR and reusable resources for multilingual text accessibility.
- Evaluation protocols and metrics for cross-lingual and cross-domain simplification and GenAI-based rewriting.
5. Applications and Case Studies
- Domain-specific simplification (e.g., healthcare, legal, scientific communication).
- Tools and systems for educational settings, language learning, or accessible communication.
- User studies, human evaluation setups, and mixed-method approaches to assessing text difficulty and GenAI-assisted simplification.
- Industrial and real-world experiences with integrating ATS and terminology into LLM-driven workflows.
#####################
Submission Guidelines
#####################
We invite original contributions, including research papers, case studies, negative results, and system demonstrations.
When submitting a paper through the START system of LREC 2026, authors will be asked to provide essential information about language resources (in a broad sense: data, tools, services, standards, evaluation packages, etc.) that have been used for the work described in the paper or are a new result of the research. ELRA strongly encourages all authors to share the resources described in their papers to support reproducibility and reusability.
Papers must be compliant with the stylesheet adopted for the LREC 2026 Proceedings (see https://lrec2026.info/authors-kit/).
Accepted papers will be published in the LREC 2026 workshop proceedings.
PAPER TYPES
We accept three types of submissions:
- Regular long papers – up to eight (8) pages of content, presenting substantial, original, completed, and unpublished work.
- Short papers – up to four (4) pages of content, describing smaller focused contributions, work in progress, negative results, or system demonstrations.
- Position papers – up to eight (8) pages of content, discussing key open challenges, methodological issues, and cross-disciplinary perspectives on text difficulty, terminology, and GenAI.
References do not count toward the page limits.
#####################
Organizers
#####################
Chairs
Giorgio Maria Di Nunzio, University of Padua, Italy
Federica Vezzani, University of Padua, Italy
Liana Ermakova, Université de Bretagne Occidentale, France
Hosein Azarbonyad, Elsevier, The Netherlands
Jaap Kamps, University of Amsterdam, The Netherlands
Scientific Committee
Florian Boudin - Nantes University, France
Lynne Bowker - University of Ottawa, Canada
Sara Carvalho - Universidade NOVA de Lisboa / Universidade de Aveiro, Portugal
Rute Costa - Universidade NOVA de Lisboa, Portugal
Eric Gaussier - University Grenoble Alpes, France
Natalia Grabar - CNRS, France
Ana Ostroški Anić - Institute of Croatian Language and Linguistics, Croatia
Tatiana Passali - Aristotle University of Thessaloniki, Greece
Grigorios Tsoumakas - Aristotle University of Thessaloniki, Greece
Sara Vecchiato - University of Udine, Italy
Cornelia Wermuth - KU Leuven, Belgium
#####################
Contact
#####################
For inquiries, please contact:
giorgiomaria.dinunzio@unipd.it