---------------------------------------------------------------------------------------------------------------------------
--- Doctoral and post-doctoral positions - AI4DH - University of
Ljubljana, Slovenia
---------------------------------------------------------------------------------------------------------------------------
The University of Ljubljana has *3 open positions (2 PhD and 1 Postdoc)*
in artificial intelligence for digital humanities (AI4DH) in the context
of the *European centre of excellence in AI4DH*, recently founded and
supported by Horizon Europe.
At the heart of the vibrant European capital city of Ljubljana, close to
both the Alps and the Mediterranean, you will be part of an AI research
group dedicated to AI research that can be applied to DH.
You will benefit from *competitive salaries* and *top-notch
infrastructure*, based in the faculty of computer and information
science, yet in an *interdisciplinary context*.
Full details are available here:
- *Postdoctoral position, for 2 years* with possible extension:
https://euraxess.ec.europa.eu/jobs/356445
- *Doctoral positions, for up to 4 years*:
https://euraxess.ec.europa.eu/jobs/356450
*Important dates:*
- NEW *information session* on Thursday 10 July, 10am CEST (see
registration details on Euraxess)
- Application *deadline: 24 July,* CEST (Ljubljana time)
🚀 Call for Participation: DISRPT 2025 Shared Task on Discourse Relation Parsing and Treebanking.
🛎️ Training data has been released and submission is now open!
https://softconf.com/emnlp2025/disrpt2025/
In conjunction with CODI-CRAC & EMNLP 2025 - Suzhou, China, Nov. 5-9.
This year, we are organizing the fourth edition of the DISRPT shared task on discourse processing across formalisms, for a variety of languages and genres, with three subtasks:
* Task 1: Discourse segmentation
* Task 2: Connective identification
* Task 3: Relation classification
We will provide training, development and test datasets from (almost) all available languages in RST / eRST, SDRT, PDTB, ISO 24617, and discourse dependencies, using a uniform format. Because different corpora, languages, and frameworks use different guidelines, the shared task will promote the design of flexible methods for dealing with various guidelines, and will help to push forward the discussion of converging standards for discourse units. We will evaluate segmentation and connective detection in two different scenarios: with and without gold syntax. An automatically parsed version is provided for all corpora without a gold parse.
This year, the shared task will feature:
* The inclusion of more frameworks, with datasets from: RST / eRST, SDRT, PDTB, ISO 24617, and discourse dependencies
* The inclusion of new corpora and new languages, some of them kept a surprise!
* A unified set of labels for the discourse relations, to make evaluation across datasets easier
* A new constraint: only one multilingual model should be submitted per task, and it should be small (4B parameters max)! This will make our replication work easier, but more importantly, it will simplify using such a model and test the robustness of your solution.
We’re excited to announce the release of the training data for the DISRPT 2025 Shared Task! You can now access the data, format documentation, and tools on our GitHub 🔗 https://github.com/disrpt/sharedtask2025
The data covers five discourse frameworks (RST / eRST, PDTB, SDRT, ISO 24617, and discourse dependencies) across 14 languages: Basque, Chinese, Czech, Dutch, English, Farsi, French, German, Italian, Portuguese, Russian, Spanish, Thai and Turkish.
We invite researchers and teams interested in participating to register now. Registered participants will be added to our mailing list and receive all future updates.
📅 The full testing data will be released on July 14, 2025 — stay tuned!
To join the mailing list and stay informed, please email us at:
📧 disrpt_chairs(a)googlegroups.com
Let us know you're interested — we’d love to have you on board!
**Important dates**
* May 16 2025 – Sample data release
* June 17 2025 – Training data release [NOW]
* July 14 2025 – Test data release
* August 1 2025 – System + paper submissions due
* September 12 2025 – Notification of acceptance
* September 19 2025 – Camera ready papers
* November 8-9 2025 – CODI at EMNLP
All deadlines are 11.59 pm UTC -12h (AoE, "Anywhere on Earth").
**Information:**
Contact the organizers: disrpt_chairs(a)googlegroups.com
Official website: https://sites.google.com/view/disrpt2025/
Google group for participants, please join us on: disrpt2025_participants(a)googlegroups.com
**Organization:**
Chloé Braud (CNRS - IRIT, University of Toulouse, France)
Chuyuan Li (University of British Columbia, Canada)
Janet Yang Liu (LMU Munich, Germany)
Philippe Muller (CNRS - University of Toulouse, France)
Amir Zeldes (Georgetown University, Washington DC, USA)
CODI CRAC 2025 Workshop: joint call for papers
November 5-9 2025 - EMNLP 25 - Suzhou, China
We are pleased to announce that we are organizing in 2025 the first joint CODI-CRAC workshop that will be held during EMNLP! More information on: https://sites.google.com/view/codi-crac2025/
Deadline for CODI CRAC papers: July 30 2025
We will host 2 shared tasks, the CRAC and the DISRPT shared tasks. More information on:
- CRAC shared task: https://ufal.mff.cuni.cz/corefud/crac25
- DISRPT shared task: https://sites.google.com/view/disrpt2025/
Aims and scope
The last few years have seen a dramatic improvement in the ability of NLP systems and Large Language Models to understand and produce words, sentences and in some cases longer texts. This development has created a renewed interest in discourse problems as researchers move towards the processing of long-form documents and conversations. There is a surge of activity in discourse pretraining tasks, coherence models, summarization for long texts and conversations, corpora for discourse level reading comprehension and formal parsing, as well as discourse related/aided representation learning, to name a few.
Discourse, roughly the interactions of context, form and meaning above the sentence level, is at the intersection of many areas in Computational Linguistics and NLP, since it is concerned with all levels of linguistic representation, allowing the modeling of textual coherence and inference leveraging long-distance links within documents. It thus brings together researchers working on different areas but facing similar issues with coherence and cohesion, document-level structure, long text and long context.
In 2025, we organize the first joint CODI-CRAC workshop. The CODI workshop has been a forum for a broad range of work at the discourse level. The CRAC workshop has been a primary venue for researchers interested in the computational modeling of reference, anaphora, and coreference. Together, these workshops have catalyzed work to advance research on discourse level problems and have served as a forum for the discussion of suitable datasets and reliable evaluation methods.
This joint edition corresponds to the 6th CODI workshop and the 8th CRAC workshop. It will welcome contributions from all the areas below, including state of the art textual NLU and NLG work using LLMs, as well as classic structured work on automatic discourse analysis -- corresponding to challenging tasks such as coreference resolution or discourse parsing -- to encourage interaction between communities. The workshop is set to host the fourth edition of the DISRPT shared task on Discourse Relation Parsing and Treebanking and the fourth edition of the CRAC shared task on Multilingual Coreference Resolution.
The workshop is planned as a 1 day event which brings together different subcommunities. It will feature invited talks and regular papers. We also accept papers accepted at other major conferences for non-archival presentation, including Findings papers.
Topics of interest
We welcome papers on symbolic and probabilistic approaches, corpus development and analysis, as well as machine and deep learning approaches to discourse. We appreciate theoretical contributions as well as practical applications, including demos of systems and tools. The goal of the workshop is to provide a forum for the community of NLP researchers working on all aspects of discourse.
Topics of interest include, but are not limited to:
- discourse structure
- discourse connectives
- discourse relations
- annotation tools and schemes for discourse phenomena
- corpora annotated with discourse phenomena
- discourse parsing
- cross-lingual discourse processing
- cross-domain discourse processing
- anaphora and coreference resolution
- event coreference
- argument mining
- coherence modeling
- discourse and semantics
- discourse in applications such as machine translation, summarization, etc.
- evaluation methodology for discourse processing
- discourse pretraining tasks
- long-text modeling and generation
Submissions
We solicit three categories of papers: regular (long and short) workshop papers, demos and extended abstracts. Only regular workshop papers and demos will be included in the proceedings as archival publications.
Double submission of papers is allowed, but this information will need to be disclosed at submission time.
Regular papers must describe original unpublished research. Long papers may consist of up to 8 pages of content, plus unlimited pages for references. Short papers can be up to 4 pages, plus unlimited pages for references. Demo submissions may describe systems, tools, visualizations, etc., and may consist of up to 4 pages, plus unlimited pages for references.
Each submission can contain unlimited pages for Appendices but the paper submissions need to remain fully self-contained, as these supplementary materials are completely optional, and reviewers are not even asked to review them.
Extended abstracts can describe work in progress. These may be two pages long (without references). Extended abstracts are non-archival. They will be included in the workshop program and handbook, but will not appear in the workshop proceedings.
Papers accepted or rejected at one of the main conferences
We also invite presentations of papers accepted at another main conference; a specific deadline and submission process will be communicated later. They will be included in the workshop program and handbook, but will not appear in the workshop proceedings.
We also fast-track ARR papers with reviews, with timeline TBA.
Submission website
All submissions must be anonymous and follow the EMNLP 2025 formatting instructions described here: https://aclrollingreview.org/cfp
Submission websites:
* CODI: https://softconf.com/emnlp2025/codi2025/
* DISRPT: https://softconf.com/emnlp2025/disrpt2025/
* CRAC: https://softconf.com/emnlp2025/crac2025/
Schedule
- July 30 2025: CODI CRAC papers due
- September 5 2025: Notification of acceptance
- September 19 2025: Camera ready deadline
- November 8-9 2025: CODI-CRAC workshop
All deadlines are 11.59 pm UTC -12h ("anywhere on Earth").
Invited Speakers
- Tanya Goyal, Cornell University.
- Nancy F. Chen, Institute of Infocomm Research (I2R), A-STAR, Singapore
Organizers
- Chloé Braud, CNRS-IRIT
- Christian Hardmeier, IT University of Copenhagen
- Chuyuan (Lisa) Li, University of British Columbia
- Jessy Li, University of Texas, Austin
- Sharid Loáiciga, University of Gothenburg
- Vincent Ng, University of Texas at Dallas
- Michal Novák, Charles University, Prague
- Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences
- Massimo Poesio, Queen Mary University of London and University of Utrecht
- Sameer Pradhan, University of Pennsylvania and cemantix
- Michael Strube, Heidelberg Institute for Theoretical Studies
- Amir Zeldes, Georgetown University, Washington DC
To contact the organizers, please send an email to: codi-crac-workshop(a)googlegroups.com
We are seeking qualified applicants for a position as Language Data Scientist on the ERC Synergy grant ‘NILOMORPH: The evolution of suprasegmental morphology in West Nilotic’, led by Matthew Baerman. The successful candidate will perform a key role in managing, processing and analyzing language data generated across the multiple teams that make up the project. The position is based at the Surrey Morphology Group at the University of Surrey, in Guildford, UK, and provides the opportunity to work in the vibrant and highly collegial research environment for which the SMG is renowned.
The NILOMORPH project aims to reconstruct the morphological evolution of the West Nilotic languages, spoken primarily in South Sudan and neighboring countries. These languages have developed some of the most remarkable morphological systems on the planet, where simultaneous manipulation of multiple phonological features (vowel length, vowel height, tone, phonation type) results in enormous paradigms marked solely by the modulation of vowel properties. NILOMORPH combines fieldwork, experimental methods, and historical linguistics to account for the phonological, morphological and psycholinguistic pathways that led to this unique outcome. The project is spread across multiple teams, based in the UK, France, Germany and the USA, and will make use of a diverse range of language data taken from multiple sources: ongoing fieldwork, previous studies, text corpora and newly-generated reconstructions, as well as the results of computational simulations and artificial language learning experiments.
The successful candidate will develop and execute a set of data management and analysis tools that will enable the diverse international team of researchers to create, access and manipulate the language data that is central to the NILOMORPH project. As a core member of the Surrey-based team, the successful candidate will provide expert guidance on methods and tools for data analysis, including data design, data management protocols and statistical analysis. They will collaborate closely in the writing and dissemination of papers and presentations, and be expected to take initiative in the formulation of the research agenda. They will also participate in the broader activities both of the NILOMORPH group and of the Surrey Morphology Group (SMG). The candidate will have opportunities at SMG to develop their research leadership profile while interacting with world-class researchers.
Suitable candidates with a background in linguistics (or linguistics-adjacent fields) are strongly encouraged to apply. Please refer to the full job description available at the application website for essential and desirable criteria.
The following documents will be required:
CV
Contact details for 2 academic/industry referees.
Three relevant works of research (DOI/URL links if possible).
Research statement of one page, describing research interests and experience, and how you think this makes you suitable for the advertised post.
A completed application form
The website for applications is https://jobs.surrey.ac.uk/vacancy.aspx?ref=031325. Interviews are planned for 02 September 2025 and will be held online. Please contact Matthew Baerman (m.baerman(a)surrey.ac.uk) with any questions.
--
Dr. Sacha Beniamine (he/him)
It's pronounced [saʃa benjamin] (stress is irrelevant)
Leverhulme Early Career Fellow
Surrey Morphology Group, School of Literature and Languages
s.beniamine(a)surrey.ac.uk | http://smg.surrey.ac.uk/
Senate House, University of Surrey, Guildford, Surrey, GU2 7XH, UK
The First Workshop on Optimal Reliance and Accountability in Interactions
with Generative Language Models (*ORIGen*) will be held in conjunction with
the Second Conference on Language Modeling (COLM) at the Palais des Congrès
in Montreal, Quebec, Canada, on October 10, 2025!
*ORIGen invites submission of Late Breaking papers, with a fast review
cycle. Late Breaking submissions are due July 10, 2025!*
With the rapid integration of generative AI, exemplified by large language
models (LLMs), into personal, educational, business, and even governmental
workflows, such systems are increasingly being treated as “collaborators”
with humans. In such scenarios, underreliance or avoidance of AI assistance
may obviate the potential speed, efficiency, or scalability advantages of a
human-LLM team, but simultaneously, there is a risk that subject matter
non-experts may overrely on LLMs and trust their outputs uncritically, with
consequences ranging from the inconvenient to the catastrophic. Therefore,
establishing optimal levels of reliance within an interactive framework is
a critical open challenge as language models and related AI technology
rapidly advance.
* What factors influence overreliance on LLMs?
* How can the consequences of overreliance be predicted and guarded against?
* What verifiable methods can be used to apportion accountability for the
outcomes of human-LLM interactions?
* What methods can be used to imbue such interactions with appropriate
levels of “friction” to ensure that humans think through the decisions they
make with LLMs in the loop?
The ORIGen workshop provides a new venue to address these questions and
more through a multidisciplinary lens. We seek to bring together broad
perspectives from AI, NLP, HCI, cognitive science, psychology, and
education to highlight the importance of mediating human-LLM interactions
to mitigate overreliance and promote accountability in collaborative
human-AI decision-making.
Please see the posted announcement [1] and our call for papers [2] for more!
[1]
https://origen-workshop.github.io/announcements/late-breaking-submission-tr…
[2] https://origen-workshop.github.io/submissions/
Nikhil Krishnaswamy
Assistant Professor of Computer Science
*Colorado State University*
10th Symposium on Corpus Approaches to Lexicogrammar (LxGr2025)
LxGr2025 will be held online on Friday 11 and Saturday 12 July 2025.
Symposium programme and registration (free): https://ehu.ac.uk/lxgr.
Registration closes on Wednesday 9 July.
If you have any problems registering, or have questions, please contact lxgr(a)edgehill.ac.uk.
Dear colleagues,
We warmly invite you to take part in SyntaxFest 2025, a biennial international event dedicated to empirical approaches to syntax, including statistical language analysis, linguistic annotation, and natural language processing. The event will take place in Ljubljana, Slovenia, from 26 to 29 August 2025, at a central venue in Ljubljana's historic center.
This year, the program will consist of:
- Five co-located workshops: IWPT, TLT, DepLing, UDW, and QUASY
- Two pre-conference workshops by the UniDive COST Action "Universality, Diversity and Idiosyncrasy in Language Technology"
- Six keynote talks by leading researchers in linguistics and NLP:
- Isabel Papadimitriou (Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University)
- Miryam de Lhoneux (KU Leuven)
- Daniel Zeman (Charles University Prague)
- Artur Stepanov (University of Nova Gorica)
- Amir Zeldes (Georgetown University)
- Xiaofei Lu (Pennsylvania State University)
- Over 70 peer-reviewed papers on diverse topics in empirical syntactic analysis
For more details and registration, please visit:
https://syntaxfest.github.io/syntaxfest25/
Early registration deadline (extended): 15 July 2025
We look forward to seeing you this August!
The SyntaxFest 2025 Organizing Committee
[2nd CFP] - (R2LM) From Rules to Language Models: Comparative Performance Evaluation @ RANLP 2025 (Varna, Bulgaria) - 11-13 September 2025
Dear colleagues,
We are pleased to announce the second call for papers for the R2LM Workshop - From Rules to Language Models: Comparative Performance Evaluation at RANLP 2025.
https://r2lm2025.github.io/R2LM/
Workshop Description
Deep learning (DL) and large language models (LLMs) have driven major advances in natural language processing (NLP), enabling impressive performance across many tasks. However, they continue to face key challenges in handling complex linguistic phenomena such as multiword expressions, long-context reasoning, and robustness to adversarial inputs. In parallel, concerns remain about the scalability, interpretability, and domain adaptability of these models, particularly in applications requiring high precision, such as grammar checking, legal analysis, or medical NLP. These limitations have sparked renewed interest in rule-based and knowledge-based approaches, which often offer better explainability and remain competitive, especially in low-resource or high-stakes scenarios.
Our workshop aims to gather contributions that deal with the following topics:
• Role of rule-based and knowledge-based NLP methods in modern applications
• Comparative analysis of rule-based, machine-learning, deep-learning and large language models for different NLP tasks
• Emerging trends in NLP research beyond deep learning and Large Language Models
• Limitations and performance bottlenecks in scalability and accuracy of deep learning models
Submission Details
• Long papers: up to 8 pages (excluding references)
• Short papers: up to 4 pages (excluding references)
• Format: ACL-style (LaTeX or MS Word)
• Submission portal and template info available on the RANLP 2025 website
Important dates
Paper Submission Deadline: 15 July 2025 (NEW!!!!!)
Notification of Acceptance: 10 August 2025
Workshop date: 11, 12 or 13 September 2025
Organising Committee:
Alicia Picazo-Izquierdo, University of Alicante, Spain
Ernesto Luis Estevanell-Valladares, University of Alicante, Spain
Rafael Muñoz Guillena, University of Alicante, Spain
Ruslan Mitkov, Lancaster University, UK
Raúl García Cerdá, University of Alicante, Spain
Bonn Talks on Research Trends in Applied Linguistics - Large- and
fine-grained indices of syntactic complexity and language learner
development (Prof. Scott Crossley, Vanderbilt University, USA)
July 4, 2.15 pm – 5.45 pm CEST
Register here:
https://uni-bonn.zoom-x.de/meeting/register/JhKP5tfdT-eQAP4TxAepWQ
*Abstract:* This talk and its subsequent workshop will introduce
approaches to measuring syntactic properties in the English language,
with a specific focus on large- and fine-grained syntactic measures.
Approaches to measure syntax automatically through part-of-speech (POS)
taggers and dependency parsers will be covered. The follow-up workshop
will focus on how POS taggers and dependency parsers can be used to
assess language learner data at the large- and fine-grained levels in a
large corpus focusing on English as a Foreign Language (EFL) learners.
Data analysis techniques and hands-on data exploration will provide
practical applications using learner corpora.
Prof. Dr. Robert Fuchs | Head of Department and Professor of English
Linguistics | Department of English, American and Celtic Studies |
University of Bonn | Rabinstr. 8 53113 Bonn, Germany |
https://uni-bonn.academia.edu/RFuchs |
https://www.iaak.uni-bonn.de/bael/en/people/chair/prof-dr-robert-fuchs |
https://sites.google.com/view/rflinguistics/
Dear all,
If you weren't able to attend this year's Lancaster Summer Schools in Corpus Linguistics or if you'd like to go back to some of the key content, you can now access the recordings of the lectures online.
Watch the lectures here:
https://cass.lancs.ac.uk/free-lancaster-lectures-on-corpus-linguistics
Best,
Vaclav
Professor Vaclav Brezina
Professor in Corpus Linguistics
Co-Director of ESRC Centre for Corpus Approaches to Social Science
Lancaster University
Lancaster, LA1 4YD
Office: County South, room C05
T: +44 (0)1524 510828
@vaclavbrezina