[Apologies for cross-postings]
********************************************************************************
First Call for Papers
21st Workshop on Multiword Expressions (MWE 2025)
Organized, sponsored and endorsed by SIGLEX, the Special Interest Group on the Lexicon of the ACL
Full-day workshop collocated with NAACL 2025, Albuquerque, New Mexico, U.S.A., May 3 or 4, 2025
Hybrid (on-site & on-line)
Submission deadline: January 30, 2025
MWE 2025 website: https://multiword.org/mwe2025/
********************************************************************************
Multiword expressions (MWEs), i.e., word combinations that exhibit lexical, syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (Baldwin and Kim, 2010), such as “by and large”, “hot dog”, “make a decision” and “break one's leg”, are still a pain in the neck for Natural Language Processing (NLP). The notion encompasses closely related phenomena: idioms, compounds, light-verb constructions, phrasal verbs, rhetorical figures, collocations, institutionalized phrases, etc. Given their irregular nature, MWEs often pose complex problems in linguistic modeling (e.g. annotation), NLP tasks (e.g. parsing), and end-user applications (e.g. natural language understanding and Machine Translation), and hence remain an open issue for computational linguistics (Constant et al., 2017).
For more than two decades, modelling and processing MWEs for NLP has been the topic of the MWE workshop, organised since 2003 by the MWE section https://multiword.org/ of ACL-SIGLEX http://www.siglex.org/ in conjunction with major NLP conferences. Impressive progress has been made in the field, but our understanding of MWEs still requires much research, given their relevance and usefulness in NLP applications. This is also relevant to domain-specific NLP pipelines that need to tackle terminologies most often realised as MWEs. Following previous years, for this 21st edition of the workshop, we identified the following topics on which contributions are particularly encouraged:
- MWE processing to enhance end-user applications: MWEs have gained particular attention in end-user applications, including Machine Translation (MT) (Zaninello and Birch, 2020), simplification (Kochmar et al., 2020), language learning and assessment (Paquot et al., 2020), social media mining (Pelosi et al., 2017), and abusive language detection (Zampieri et al., 2020). We believe that it is crucial to extend and deepen these first attempts to integrate and evaluate MWE technology in these and further end-user applications.
- MWE processing and identification in the general language, as well as in specialized languages and domains: Multiword terminology extraction from domain-specific corpora (Lossio-Ventura et al., 2014) is of particular importance to various applications, such as MT (Semmar and Laib, 2017), or for the identification and monitoring of neologisms and technical jargon (Chatzitheodorou and Kappatos, 2021).
- MWE processing in low-resource languages: The PARSEME shared tasks (2017 https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_05_MWE_2017___lb__EACL__rb__&subpage=CONF_40_Shared_Task, 2018 https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_04_LAW-MWE-CxG_2018___lb__COLING__rb__&subpage=CONF_40_Shared_Task, 2020 https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_02_MWE-LEX_2020___lb__COLING__rb__&subpage=CONF_40_Shared_Task), among others, have fostered significant progress in MWE identification, providing datasets that include low-resource languages, evaluation measures, and tools that now allow fully integrating MWE identification into end-user applications. There are continuous efforts in this direction (Diaz Hernandez, 2024), and a few of them have also explored methods for the automatic interpretation of MWEs (Bhatia et al., 2018) and their processing in low-resource languages (Eder et al., 2021). Resource creation and sharing should be pursued in parallel with the development of multilingual benchmarks for MWE identification (Savary et al., 2023).
- MWE identification and interpretation in LLMs: Most current MWE processing is limited to their identification and detection using pre-trained language models, but we still lack understanding about how MWEs are represented and dealt with therein (Garcia et al., 2021) and how to better model the compositionality of MWEs from semantics (Phelps et al., 2024). Now that NLP has shifted towards end-to-end neural models like BERT, capable of solving complex tasks with little or no intermediary linguistic symbols, questions arise about the extent to which MWEs should be implicitly or explicitly modelled (Shwartz and Dagan, 2019).
- New and enhanced representation of MWEs in language resources and computational models of compositionality as gold standards for formative intrinsic evaluation.
Through this workshop, we will bring together and encourage researchers in various NLP subfields to submit their MWE-related research. We also intend to consolidate the converging results of previous joint workshops LAW-MWE-CxG 2018 http://multiword.sourceforge.net/lawmwecxg2018/, MWE-WN 2019 http://multiword.sourceforge.net/mwewn2019/ and MWE-LEX 2020 http://multiword.sourceforge.net/mwelex2020/, the joint MWE-WOAH panel in 2021 https://multiword.org/mwe2021/#program, the MWE-SIGUL 2022 joint session https://multiword.org/mwe2022/, and the MWE-UD 2024 workshop https://multiword.org/mweud2024/, extending our scope to MWEs in e-lexicons and WordNets, MWE annotation, as well as grammatical constructions. Correspondingly, we call for papers on research related (but not limited) to MWEs and constructions in:
- Computationally-applicable theoretical work in psycholinguistics and corpus linguistics;
- Annotation (expert, crowdsourcing, automatic) and representation in resources such as corpora, treebanks, e-lexicons, WordNets, constructions (also for low-resource languages);
- Processing in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG, LFG, TAG, UD, etc.);
- Discovery and identification methods, including for specialized languages and domains such as clinical or biomedical NLP;
- Interpretation of MWEs and understanding of text containing them;
- Language acquisition, language learning, and non-standard language (e.g. tweets, speech);
- Evaluation of annotation and processing techniques;
- Retrospective comparative analyses from the PARSEME shared tasks;
- Processing for end-user applications (e.g. MT, NLU, summarisation, language learning, etc.);
- Implicit and explicit representation in pre-trained language models and end-user applications;
- Evaluation and probing of pre-trained language models;
- Resources and tools (e.g. lexicons, identifiers) and their integration into end-user applications;
- Multiword terminology extraction;
- Adaptation and transfer of annotations and related resources to new languages and domains including low-resource ones.
Submission formats:
The workshop invites two types of submissions:
- archival submissions that present substantially original research, in either long paper format (8 pages + references) or short paper format (4 pages + references);
- non-archival submissions of abstracts describing relevant research presented/published elsewhere, which will not be included in the MWE proceedings.
Paper submission and templates
Papers should be submitted via the workshop's submission page https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/MWE. Please choose the appropriate submission format (archival/non-archival). Archival papers with existing reviews will also be accepted through the ACL Rolling Review. Submissions must follow the ACL stylesheet https://github.com/acl-org/acl-style-files.
Important Dates
Paper Submission Deadline: January 30, 2025
Notification of acceptance: March 1, 2025
Camera-ready papers due: March 10, 2025
Workshop: May 3 or 4, 2025
All deadlines are at 23:59 UTC-12 (Anywhere on Earth).
Organizing Committee
Verginica Barbu Mititelu, Voula Giouli, Grazina Korvel, A. Seza Doğruöz, Alexandre Rademaker, Atul Kr. Ojha, Mathieu Constant
Anti-harassment policy
The workshop follows the ACL anti-harassment policy https://www.aclweb.org/adminwiki/index.php?title=Anti-Harassment_Policy.
Contact
For any inquiries regarding the workshop, please send an email to the Organizing Committee at mweworkshop2023@googlegroups.com.