Phraseology and Multiword Expressions (PMWE) (https://langsci-press.org/catalog/series/pmwe) is a book series at Language Science Press, a born-digital scholar-led open access publisher in linguistics.
The series publishes high-quality books about conventionalized, idiosyncratic combinations of words. Within the field of phraseology such word combinations are sometimes called phrasemes, while the computational linguistics community uses the term multiword expressions for them. Various subtypes of such word combinations are of interest, such as multiword compounds, multiword terms, multiword named entities, light-verb constructions, phrasal verbs, idioms, collocations, formulaic speech, proverbs, etc.
The series is open to different approaches to create a forum for an interdisciplinary and cross-framework exchange of research results, including but not limited to the following subdisciplines:
- Computational linguistics and natural language processing
- Computer science
- Corpus linguistics
- Lexicography
- Psycholinguistics
- Theoretical linguistics
We welcome volume proposals addressing all topics related to theoretical, computational, and empirical approaches to phraseology including:
- Linguistic properties and typologies of multiword expressions, especially in multilingual frameworks
- Digital lexical resources including multiword expressions
- Description and processing of multiword expressions in syntactic and semantic frameworks (e.g., CCG, CxG, HPSG, LFG, TAG, UD)
- Identification and annotation of multiword expressions in corpora and treebanks
- Multiword expressions in machine translation and other end-user applications
- Multiword expressions and lexical innovation
- Diachronic studies and semantic change in multiword expressions
- Representation and evaluation of multiword expressions in language models (e.g., LLMs) and text generation systems
All contributions should be in English.
To submit a volume proposal, please follow the guidelines at the series home page:
https://langsci-press.org/catalog/series/pmwe
Volumes published so far:
Voula Giouli, Verginica Barbu Mititelu (eds.) Multiword expressions in lexical resources: Linguistic, lexicographic, and computational perspectives. 2024.
Victoria Beatrix Fendel (ed.) Support-verb constructions in the corpora of Greek: Between lexicon and grammar?. 2024.
Aleksandar Trklja, Łukasz Grabowski (eds.) Formulaic language: Theories and methods. 2021.
Sabine Schulte im Walde, Eva Smolka (eds.) The role of constituents in multiword expressions: An interdisciplinary, cross-lingual perspective. 2020.
Yannick Parmentier, Jakub Waszczuk (eds.) Representation and parsing of multiword expressions: Current trends. 2019.
Stella Markantonatou, Carlos Ramisch, Agata Savary, Veronika Vincze (eds.) Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop. 2018.
Manfred Sailer, Stella Markantonatou (eds.) Multiword expressions: Insights from a multi-lingual perspective. 2018.
(see https://langsci-press.org/catalog/series/pmwe for links to the volumes)
Dear Colleagues,
The ACL 2025 Conference is pleased to announce that registration is now
officially open. We encourage you to register early to take advantage of
reduced rates.
Please note the following important deadlines for registration:
* Early Registration: Concludes on Wednesday, July 2, 2025, AOE.
* Late Registration: Will close for both In-Person and Virtual
attendees on Friday, July 25, 2025, at 11:59 PM CET.
* Onsite Registration: Will be available for both In-Person and
Virtual attendees from Saturday, July 26, 2025, through August 1, 2025,
at 11:59 PM CET.
Detailed information regarding the registration process can be found on
the official conference website: https://acl.swoogo.com/acl2025
We look forward to welcoming you to ACL 2025 in beautiful Vienna!
Sincerely,
The ACL Organization Team
Dear List,
the 12th International Conference on CMC and Social Media Corpora for the Humanities (CMC-Corpora) will be held at the University of Bayreuth, Germany, on the 4th and 5th of September 2025.
(CfP, extended deadline for paper/abstract submission: 28th of May 2025, 23:59 CEST)
The conference (https://www.cmc2025.uni-bayreuth.de/en/) brings together language-centered research on CMC and social media in linguistics, philologies, communication sciences, media, and social sciences with research questions from the fields of corpus and computational linguistics, language technology, text technology, and machine learning.
We adhere to a wide definition of CMC and social media, covering various media of digital communication, including email, newsgroups, forums, chat and messenger applications (e.g. WhatsApp), social networks (e.g. Facebook, Instagram, X, TikTok), gaming platforms, as well as interactions in the communication areas of video portals (YouTube), learning platforms, gaming apps, online games and virtual worlds.
Our keynotes are Gavin Brookes (Lancaster University) and Stephanie Evert (Friedrich-Alexander University Erlangen-Nuremberg).
We invite submissions to the 12th conference on the following topics (deadline for paper/abstract submission: 28th of May 2025, 23:59 CEST):
* Development of CMC corpora / social media corpora
  * Building CMC corpora: from data collection to publication
  * Open access data for CMC research: ethical and GDPR issues
  * Annotating CMC data: genres, linguistic aspects, metadata
  * Multimodal corpora
  * Big data corpora
* Analysis of CMC corpora / social media corpora
  * Sociolinguistic studies of CMC
  * Discourse analysis of CMC
  * Linguistic characteristics of CMC
  * Multimodal (incl. visual) aspects of CMC
  * Multilingualism and code-switching in CMC
  * CMC in language education
* Natural language processing (NLP) of CMC data / social media data
  * Normalization
  * PoS tagging
  * Lemmatization
  * Syntactic parsing
* CMC for the benefit of digital societies
  * Interdisciplinary research design and research methods in CMC for the benefit of digital societies
  * Exploration of Diversity and Inclusion in CMC
  * Intersection of CMC and Social Sciences
  * Intersection of CMC and Human-Centered Data Science
  * Intersection of CMC and Computational Social Science
  * Contrastive CM studies across different languages
The conference language is English. Submissions will consist of:
* Short papers (2-4 pages, at most 6 pages including the list of references, following the existing template) for oral presentations
* Abstracts (max. 300 words) for poster presentations
Submission and review
Authors of accepted papers are invited to present their work at the conference (30-minute timeslots: 20-minute talks, followed by 10 minutes of discussion). Authors of accepted abstracts can present their work in progress or early-stage research during the poster session. At the start of the conference, all accepted papers will be made available in online proceedings. After the conference, speakers with the best contributions will be invited to submit extended papers for one or more journal special issues or a volume publication.
Instructions for authors
All contributions will be collected through an online platform (ConfTool): https://www.cmc2025.uni-bayreuth.de/en/index.html
Templates for the submission:
Template for MSWord: https://www.cmc2025.uni-bayreuth.de/pool/dokumente/template_word.docx
Template for LaTeX: https://www.cmc2025.uni-bayreuth.de/pool/dokumente/template_latex.zip
Local organizing committee:
* Dr. Annamaria Fabian (University of Bayreuth/Bavarian Research Institute for Digital Transformation at the Bavarian Academy of Science)
* Prof. Dr. Igor Trost (Alpen-Adria University Klagenfurt/University of Passau)
For all enquiries, please contact the organizers at cmc2025(a)uni-bayreuth.de and see https://www.cmc2025.uni-bayreuth.de/en/
More information on the “International Conference Series on CMC and Social Media Corpora (cmc-corpora)”:
https://cmc-corpora.org/series/
All the best,
Annamaria Fabian
This is the second call for participation for the *2nd SIGIR 2025 Workshop on Simulations for Information Access (Sim4IA)*.
The workshop will be held with SIGIR 2025 in Padua, Italy. It will provide a unique platform for researchers and practitioners to explore and discuss advancements in simulations for information access systems.
## tl;dr
----------
- 17 July 2025, co-located with SIGIR 2025 in Padua, Italy
- Micro shared task data and framework available
- Tech and infrastructure talks/presentations welcome
- Keynote by Christine Bauer confirmed
- We are on the ACM Slack: https://acmsigir.slack.com/archives/C08STM45N90
- Website: https://sim4ia.org/sigir2025/
## Micro Shared Task Data and Framework Available
-----------------------------------------------------------------------
To drive a more focused discussion at the workshop, we designed a micro shared task that demonstrates how a shared task in user simulations might look. On 16 May 2025, we released the first training data set as well as a prebundled and dockerized version of SimIIR to give everyone a head start on the shared task.
Our shared task concept is based on the fundamental design principle of validating user simulations instead of measuring system effectiveness. We envision users interacting with a particular IA system, such as a traditional search engine (Task A) or a conversational system (Task B). We challenge participants to design and implement user simulators that can mimic the interactions of real users with these systems with a high degree of fidelity. The workshop features a stripped-down version of this concept, a micro shared task.
We will discuss the submissions and ideas for the next steps or evaluation measures at the workshop. Non-binding expression of interest to take part in the micro shared tasks: https://forms.gle/ftV8cwjywHWsBhCw9
More information on the shared task, data sets, and framework: https://sim4ia.org/sigir2025/#micro-shared-task
## Keynote by Christine Bauer
-----------------------------------------
We are happy to announce that Christine Bauer has agreed to give a keynote on “From toy models to tactics: What user simulation is good for”.
Christine Bauer is a Professor of Interactive Intelligent Systems at the Department of Artificial Intelligence and Human Interfaces (AIHI) at the University of Salzburg. She is involved in the EXDIGIT initiative, emphasizing interdisciplinary technologies in digital sciences. Her research lies at the intersection of human-centered computing, data science, and artificial intelligence, with a focus on context-aware recommender systems, particularly in the music and media domains. Her core interests include fairness and multi-method evaluation. Her multidisciplinary background drives her research activities.
More information on the keynote: https://sim4ia.org/sigir2025/#keynote
## Invitation of Tech/Infrastructure Talks
-----------------------------------------------------
We reserved a special time slot at the workshop for talks on recent technologies and/or infrastructures for (user) simulations, and invite you to submit your ideas for such talks at the workshop.
Send a short email with your idea in the form of a title and roughly half a page of abstract to sigir2025(a)sim4ia.org
Check out the tentative program, shared task data and description, the keynote announcement, and much more at https://sim4ia.org/sigir2025/
See you in Padua!
Sim4IA Organizers
Philipp Schaer, Christin Kreutz, Krisztian Balog, Timo Breuer, and Andreas Kruff
Dear all
You are warmly invited to submit an abstract to the Shifting Power in Language Learning and Applied Linguistics with GenAI conference, which will take place in Milton Keynes, UK and online on November 13-14, 2025.
This conference will explore how power is being shifted towards, away from, and between learners and educators by AI technologies, and the new dynamic and potential changes this is bringing about in applied linguistics, languages and cultures studies. Potential topics for the papers may include, but are not limited to:
* AI and its impact on the training and evolving roles of languages and applied linguistics educators and their relationships with learners;
* AI and its potential to support inclusive and personalised learning in languages and applied linguistics;
* AI integration into learning, teaching and assessment of languages, cultures and applied linguistics with a focus on ethical issues and sustainability challenges;
* Core concepts and theoretical frameworks guiding the integration of AI in applied linguistics;
* Core concepts and theoretical frameworks guiding the integration of AI in the learning and teaching of languages and cultures;
* Questions around the use of AI in carrying out research in languages, cultures and applied linguistics, and its impact on research processes and outputs.
Instructions for submission
We welcome submissions in the following formats:
* 20-minute presentations (online or in person)
* 40-minute facilitated discussions with up to 3 facilitators (online or in person)
Proposals should be submitted via email by May 31st, 2025, to ai-languages-conference(a)open.ac.uk
The following information will be requested during the submission process:
* Names, titles, contact info, institutional or organisational affiliation and short bio (max 100 words) for each presenter and facilitator
* Conference topic (selected from the list above)
* Session format (selected from the list above)
* Title of the abstract
* Abstract (max. 300 words)
Kind regards
Rachele
Dr Rachele De Felice (she/her) | Lecturer in Applied Linguistics
School of Languages and Applied Linguistics
The Faculty of Wellbeing, Education and Language Studies
The Open University
https://profiles.open.ac.uk/rachele-de-felice
Dear Colleagues,
The ACL 2025 Conference is pleased to announce that *registration is now
officially open*. We encourage you to register early to take advantage of
reduced rates.
Please note the following important deadlines for registration:
- *Early Registration:* Concludes on *Wednesday, July 2, 2025, AOE*.
- *Late Registration:* Will close for both In-Person and Virtual
attendees on *Friday, July 25, 2025, at 11:59 PM CET*.
- *Onsite Registration:* Will be available for both In-Person and
Virtual attendees from *Saturday, July 26, 2025, through August 1, 2025,
at 11:59 PM CET*.
Detailed information regarding the registration process can be found on the
official conference website: https://acl.swoogo.com/acl2025
We look forward to welcoming you to ACL 2025 in beautiful Vienna!
Sincerely,
The ACL Organization Team
--
Horacio Saggion
Full Professor / Chair in Computer Science and Artificial Intelligence
Head of the Natural Language Processing Group - TALN
Project Coordinator iDEM Project (HE)
Co-PI of the AI-BOOST project (HE)
Co-PI of the IDEAL project (HE)
Universitat Pompeu Fabra
https://twitter.com/h_saggion
https://www.linkedin.com/in/horacio-saggion-1749b916
We have three postdoc position openings at Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi. The project is based on a collaboration with a leading industry partner on the development of a conversational booking agent.
* Postdoctoral Research Scientist in Conversational AI & NLP
* Postdoctoral Research Scientist in Recommendation & Personalization
* Postdoctoral Research Scientist in Persuasive Language Generation
More information regarding responsibilities and requirements can be found on our webpage:
https://mbzuai-hiring.github.io/
- Start date: To be filled immediately, July 2025
- Duration: 1-year contract with possibility of extension
- Location: MBZUAI (https://mbzuai.ac.ae/), Abu Dhabi, UAE
- Apply via e-mail: NLP.IndustryProject(a)mbzuai.ac.ae
We look forward to receiving your application!
Regards,
Teresa
Teresa Lynn, PhD
Head of NLP Research Engagement
Natural Language Processing
P +971 2 811 3284 W www.mbzuai.ac.ae
This is the last call to participate in ADoBo 2025, the shared task on automatic detection of borrowings in Spanish.
To gain access to the data, make submissions, and check the leaderboard, please join the competition at Codabench. System submissions are due on May 26th.
https://www.codabench.org/competitions/7284/
TIMELINE
April 21: Dev set released.
May 12: Test set released.
May 26: Systems output submissions.
June 9: Working notes paper submission.
June 16: Notification of acceptance (peer-reviews).
June 23: Camera ready paper submission.
September: ADoBo results to be presented at IberLEF 2025.
ORGANIZATION COMMITTEE
Elena Álvarez Mellado, Universidad Nacional de Educación a Distancia (UNED).
Julio Gonzalo, Universidad Nacional de Educación a Distancia (UNED).
Constantine Lignos, Brandeis University.
Jordi Porta Zamorano, Universidad Autónoma de Madrid (UAM).
TokShop: Tokenization Workshop (ICML 2025)
Submission to the Tokenization Workshop begins on April 14, 2025, via OpenReview. The deadline for submissions is May 30, 2025, at 11:59pm (anywhere on earth). Notifications of acceptance will be sent out on June 9, 2025, and camera-ready papers will be due shortly afterward at 11:59pm (anywhere on earth). The workshop will take place on July 18, 2025.
Workshop Description
The Tokenization Workshop (TokShop) at ICML aims to bring together researchers and practitioners from all corners of machine learning to explore tokenization in its broadest sense. We will discuss innovations, challenges, and future directions for tokenization across diverse data types and modalities.
Call for Papers
Topics of interest include:
- Subword Tokenization in NLP: Analysis of techniques such as BPE, WordPiece, and UnigramLM, as well as improvements for efficiency, interpretability, and adaptability.
- Multimodal Tokenization: Tokenization strategies for images, audio, video, and other modalities, including methods to align representations across different types of data.
- Multilingual Tokenization: Development of tokenizers that work robustly across languages and scripts, and investigation into failure modes tied to tokenization.
- Tokenizer Modification Post-Training: Methods for updating tokenizers after model training to boost performance and/or efficiency without retraining from scratch.
- Alternative Input Representations: Exploration of non-traditional tokenization approaches, such as byte-level, pixel-level, or patch-based representations.
- Statistical Perspectives on Tokenization: Empirical analysis of token distributions, compression properties, and correlations with model behavior.
By broadening the scope of tokenization research beyond language, this workshop seeks to foster cross-disciplinary dialogue and inspire new advances at the intersection of representation learning, data efficiency, and model design.
Submission guidelines
Our author guidelines follow the ICML requirements unless otherwise specified.
- Paper submission is hosted on OpenReview.
- Each submission should contain up to 9 pages, not including references or appendix (shorter submissions also welcome).
- Please use the provided LaTeX template (Style Files) for your submission. Please follow the paper formatting guidelines general to ICML as specified in the style files. Authors may not modify the style files or use templates designed for other conferences.
- The paper should be anonymized and uploaded to OpenReview as a single PDF.
- You may use as many pages of references and appendix as you wish, but reviewers are not required to read the appendix.
- Posting papers on preprint servers like ArXiv is permitted.
- We encourage each submission to discuss the limitations as well as ethical and societal implications of their work, wherever applicable (but neither are required). These sections do not count towards the page limit.
- This workshop offers both archival and non-archival options for submissions. Archival papers will be indexed with proceedings, while non-archival submissions will not.
- The review process will be double-blind.
Read more: https://tokenization-workshop.github.io/
(apologies for multiple postings)
Dear colleagues,
We would like to inform you that registration for the eLex 2025 conference is now open (https://elex.link/elex2025/registration/). The deadline for the early-bird fee is 5 September 2025.
A call for Hornby bursary applications is also out (https://elex.link/elex2025/hornby-bursary/). The bursaries cover participants' registration fee, so if you intend to apply, please wait for results before paying the registration fees (you can still complete all the steps of the registration process and pay later).
Finally, the special rates for rooms at the venue and partner hotels are available: https://elex.link/elex2025/venue/. There are a limited number of rooms available so early booking is advisable (there is a very friendly cancellation option).
Please monitor the conference website for further updates on the programme, proceedings and related news.
Looking forward to seeing you at the conference.
Best regards
Iztok Kosem
Head of the eLex 2025 organising committee