First CfP: Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD 2024)
Co-located with LREC-COLING 2024, Torino, Italy, and online
May 25, 2024
Workshop Webpage: https://multiword.org/mweud2024/
We are pleased to announce that the multiword expressions (MWE) and Universal Dependencies (UD) research communities are joining forces in 2024 to organize a joint workshop. This is a timely collaboration because the two communities have clearly overlapping interests. For instance, while UD has several dependency relations that can be used to annotate MWEs, both the annotation guidelines (i.e. is syntactic irregularity and inflexibility or semantic non-compositionality the leading criterion?) and the annotation practice (both across treebanks for a single language and across languages) for these relations can be improved (Schneider and Zeldes, 2021). The PARSEME MWE-annotated corpora for 26 languages build on UD-annotated corpora (Savary et al., 2023). Both communities share an interest in developing guidelines, datasets, and tools that can be applied to a wide range of typologically diverse languages, raising fundamental questions about tokenization, lemmatization, and morphological decomposition of tokens. Proposals for harmonizing annotation practices between PARSEME and UD, and for expanding PARSEME MWE annotation to non-verbal MWEs, are also central to the recently started UniDive COST action (CA21167) https://unidive.lisn.upsaclay.fr/doku.php?id=start.
The workshop invites submissions of original research on MWE, UD, and the interplay of both. The following topics are especially relevant:
- Sensitivity of LLMs to MWE and syntactic dependencies. Studies along the lines of Manning et al. (2020) (UD), Nedumpozhimana and Kelleher (2021), Garcia et al. (2021), Fakharian and Cook (2021), Moreau et al. (2018) (MWE), and others on the question of to what extent LLMs make use of syntactic dependencies or are capable of detecting MWEs and capturing their semantics.
- Applicability of UD and MWE annotation and discovery for low-resource and typologically diverse languages and language varieties. Both UD and PARSEME aim at universal applicability across a wide range of languages. Much theoretical, computational, and empirical work, however, concentrates on high-resource languages. Applying these frameworks to typologically diverse languages may lead one to reconsider the notion of token, word, and morphological segmentation, and to reassess the notion of MWE for languages that feature compounding or incorporation (Baldwin et al., 2021; Haspelmath, 2023).
- Case studies. Studies on the consistency, coverage or universal applicability of MWE annotation in the UD or PARSEME frameworks, as well as studies on automatic detection and interpretation of MWEs in corpora.
- MWE and UD processing to enhance end-user applications. MWEs have gained particular attention in end-user applications, including MT (Zaninello and Birch, 2020; Han et al., 2021), simplification (Kochmar et al., 2020), language learning and assessment (Paquot et al., 2019; Christiansen and Arnon, 2017), social media mining (Maisto et al., 2017), and abusive language detection (Zampieri et al., 2020; Caselli et al., 2020). We believe that it is crucial to extend and deepen these first attempts to integrate and evaluate MWE technology in these and further end-user applications.
- Testing developed systems on the latest dataset versions. Authors are also encouraged to submit papers that test the developed systems using the recent UD 2.13 and/or PARSEME 1.3 releases.
Organizational Details
- The workshop is sponsored by ACL-SIGLEX https://siglex.org/ and UniDive https://unidive.lisn.upsaclay.fr/doku.php?id=start.
- UniDive members with accepted papers may be eligible for travel reimbursement.
- If you are based in an underrepresented country or work on low-resource languages and have an accepted paper, you may be eligible for an ACL-SIGLEX travel grant of up to 500 USD.
- The workshop follows LREC-COLING’s hybrid online/onsite format.
- Workshop proceedings will be published in the ACL Anthology.
- The workshop follows the ACL anti-harassment policy https://www.aclweb.org/adminwiki/index.php/Anti-Harassment_Policy.
Submission Instructions
The workshop invites two types of submissions:
- archival submissions that present substantially original research, in either long paper format (8 pages + references) or short paper format (4 pages + references);
- non-archival submissions of abstracts describing relevant research presented/published elsewhere, which will not be included in the MWE-UD proceedings.
Papers should be submitted via the workshop’s START submission page (link will be provided once available). Please choose the appropriate submission format (archival/non-archival). Submissions must follow the LREC-COLING 2024 stylesheet https://lrec-coling-2024.org/authors-kit/.
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones).
Archival papers with existing reviews from ACL Rolling Review will also be considered. A paper may not be simultaneously under review through ARR and MWE-UD. A paper that has received or will receive reviews through ARR may not be submitted for review to MWE-UD.
Important Dates (Tentative)
Paper submission: Feb 25, 2024
ARR paper commitment: Mar 25, 2024
Notification of acceptance: Apr 1, 2024
Camera ready papers due: Apr 8, 2024
Workshop: May 25, 2024
All deadlines are at 23:59 UTC-12 (Anywhere on Earth).
Organizing Committee
Archna Bhatia, Gosse Bouma, Kilian Evang, Marcos Garcia, Voula Giouli, Lifeng Han, Joakim Nivre.
For any inquiries, contact the Organizing Committee at mweud2024-organizers@uni-duesseldorf.de.