[Corpora-List] Job Offer: PhD Causal Machine Learning Applied to NLP and the Study of Large Language Models.

22 May 2023


Job Offer: PhD Causal Machine Learning Applied to NLP and the Study of 
Large Language Models.
Starting date: November 1st, 2023 (flexible)
Application deadline: From now until the position is filled
Interviews (tentative): beginning of June, and later if the position is 
still open
Salary: ~2000€ gross/month (social security included)
Mission: research oriented (teaching possible but not mandatory)
Place of work (no remote): Laboratoire d'Informatique de Grenoble, CNRS, 
Grenoble, France
Keywords: natural language processing, causal machine learning, 
interpretability, analysis, robustness, large language models, 
controllability
Description:
Natural language processing (NLP) has undergone a paradigm shift in 
recent years, owing to the remarkable breakthroughs achieved by large 
language models (LLMs). Despite being purely "correlation machines" 
[CorrelationMachine], these models have completely altered the landscape 
of NLP by demonstrating impressive results in language modeling, 
translation, and summarization. Nonetheless, the use of LLMs has also 
surfaced crucial questions regarding their reliability and transparency. 
As a result, there is now an urgent need to gain a deeper understanding 
of the mechanisms governing the behavior of LLMs, to interpret their 
decisions and outcomes in principled and scientifically grounded ways.
A promising direction to carry out such analysis comes from the fields 
of causal analysis and causal inference [CausalAbstraction]. Examining 
the causal relationships between the inputs, outputs, and hidden states 
of LLMs can help build scientific theories about the behavior of 
these complex systems. Furthermore, causal inference methods can help 
uncover the causal mechanisms underlying the complex computations of 
LLMs, offering hope of better interpreting their decisions and 
understanding their limitations [Rome].
Thus, the use of causal analysis in the study of LLMs is a promising 
research direction to gain deeper insights into the workings of these 
models.
As a Ph.D. student working on this project, you will be expected to 
develop a strong understanding of the principles of causal inference and 
their application to machine learning (see, for example, the invariant 
language model framework [InvariantLM]). You will have the opportunity to 
work on cutting-edge research projects in NLP, contributing to the 
development of more reliable and interpretable LLMs. The Ph.D. research 
project should be aligned with your interests and expertise. Therefore, 
the precise direction of the research can and will be shaped by your 
personal taste and research goals. You are encouraged to bring your own 
perspective and ideas to the table.
SKILLS
Master's degree in Natural Language Processing, computer science or data 
science.
Proficiency in Python programming and deep learning frameworks.
Experience in causal inference or working with LLMs.
Very good communication skills in English (French not needed).
SCIENTIFIC ENVIRONMENT
The thesis will be conducted within the GETALP team of the LIG 
laboratory (https://lig-getalp.imag.fr/). The GETALP team has strong 
expertise and a track record in Natural Language Processing. The recruited 
person will be welcomed into the team, which offers a stimulating, 
multinational and pleasant working environment.
The means to carry out the PhD will be provided, both in terms of 
missions in France and abroad and in terms of equipment. The candidate 
will have access to the GPU cluster of the LIG. Furthermore, 
access to the national supercomputer Jean-Zay will make it possible to 
run large-scale experiments.
The Ph.D. position will be co-supervised by Maxime Peyrard and François 
Portet.
Additionally, the Ph.D. student will work with external 
academic collaborators at EPFL and Idiap (e.g., Robert West and Damien 
Teney).
INSTRUCTIONS FOR APPLYING
Applications must contain a CV, a letter or message of motivation, and 
Master's transcripts (grades). Candidates should also be ready to provide 
letter(s) of recommendation. Applications should be addressed to 
Maxime Peyrard (maxime.peyrard@epfl.ch) and François Portet 
(francois.Portet@imag.fr).
[InvariantLM] Peyrard, Maxime and Ghotra, Sarvjeet and Josifoski, Martin 
and Agarwal, Vidhan and Patra, Barun and Carignan, Dean and Kiciman, 
Emre and Tiwary, Saurabh and West, Robert, "Invariant Language Modeling," 
Conference on Empirical Methods in Natural Language Processing (2022): 
5728–5743.
[CorrelationMachine] Feder, Amir and Keith, Katherine A. and Manzoor, 
Emaad and Pryzant, Reid and Sridhar, Dhanya and Wood-Doughty, Zach and 
Eisenstein, Jacob and Grimmer, Justin and Reichart, Roi and Roberts, 
Margaret E. and Stewart, Brandon M. and Veitch, Victor and Yang, Diyi, 
"Causal Inference in Natural Language Processing: Estimation, 
Prediction, Interpretation and Beyond," Transactions of the Association 
for Computational Linguistics 10 (2022): 1138–1158.
[CausalAbstraction] Geiger, Atticus and Wu, Zhengxuan and Lu, Hanson and 
Rozner, Josh and Kreiss, Elisa and Icard, Thomas and Goodman, Noah and 
Potts, Christopher, "Inducing Causal Structure for Interpretable Neural 
Networks," Proceedings of Machine Learning Research (2022): 7324–7338.
[Rome] Meng, Kevin, et al., "Locating and Editing Factual Associations in 
GPT," Advances in Neural Information Processing Systems 35 (2022): 
17359–17372.
-- 
François PORTET
Professeur - Univ Grenoble Alpes
Laboratoire d'Informatique de Grenoble - Équipe GETALP
Bâtiment IMAG - Office 333
700 avenue Centrale
Domaine Universitaire - 38401 St Martin d'Hères
FRANCE

Phone:  +33 (0)4 57 42 15 44
Email:  francois.portet@imag.fr
www:    http://membres-liglab.imag.fr/portet/
