PhD Studentship at University of Exeter (UK) and the Université Paris Saclay (France): Multimodal and Multilingual models for Multiword Language Processing - Corpora

21 Mar 2025


Dear all,
We have a fully funded PhD studentship to work on the project Multimodal and Multilingual models for Multiword Language Processing as a joint PhD between the University of Exeter (UK) and the Université Paris Saclay (France). Supervisors: Aline Villavicencio, Agata Savary, Michèle Gouiffes and Rodrigo Wilkens.
Deadline for expressions of interest: March 28, 2025
Applications are submitted via this page:
https://adum.fr/as/ed/voirproposition.pl?site=PSaclay&matricule_prop=619... 
Once an expression of interest has been supported, the deadline for applications is March 31, 2025.
If you are interested, please contact us at this email: mmmweproject@gmail.com
All the best,
Aline
----------------------------------------------------
Aline Villavicencio https://sites.google.com/view/alinev (she/her)
Professor in Natural Language Processing
Director of the Institute for Data Science and Artificial Intelligence https://www.exeter.ac.uk/research/institutes/idsai/, University of Exeter (UK) and
University of Sheffield (UK)
----------------
Multimodal and Multilingual models for Multiword Language Processing
Idiomatic and multiword expressions (MWEs), along with domain-specific (multiword) terminology and metaphors, pose concrete challenges for models in various tasks (e.g., generation, reasoning, sentiment analysis, machine translation) because their meanings are often not directly linked to the meanings of the individual words that form them (Sag et al. 2002). These expressions are often conventional ways of expressing and compressing the knowledge of a particular domain or language community. They may be more familiar to native speakers than their paraphrases or synonyms, and they are a mark of fluency for non-native speakers. Comparisons between human preferences and those of language models, including large language models (LLMs), reveal that these models still lag behind human-level understanding of idiomatic expressions (He et al. 2025). Moreover, when different modalities are involved, their complementarity may facilitate disambiguation. However, it is important to ensure accurate understanding across modalities (e.g. treating an expression as idiomatic in both text and images).
This project aims to assess how well models handle idiomatic expressions by integrating multimodal inputs, specifically visual and textual data, and seeks to address model shortcomings by taking into account visual and visual-temporal modalities rather than relying on a single modality. With this project, we aim to address idiom and figurative language understanding by evaluating models on their ability to represent the meanings of such expressions and to employ them precisely as part of their generative abilities. This project is also aligned with the UniDive COST Action by advancing the generation of data in multiple languages and evaluating models on idiom understanding tasks across various languages. It is also linked to the SemEval 2025 AdMIRe Shared Task.
This project involves joint supervision between the University of Exeter and Université Paris Saclay.